Publication

Conformal prediction of used car prices

Bibliographic details
Abstract: This thesis addresses a gap in the literature on predictive analytics for used car prices: most research estimates point predictions of prices with machine learning and provides no measure of the uncertainty associated with those predictions. The objective is to calculate prediction intervals using both conformal quantile regression and frequentist quantile regression on a Light Gradient Boosting Machine (LightGBM) model trained on a comprehensive dataset of used car listings collected in May 2021 from the United States marketplace. The thesis empirically compares the two methodologies at various nominal coverage probabilities. The study reveals a trade-off between accuracy and precision that decision-makers must consider: conformal predictions uniquely guarantee the nominal coverage level, at the expense of wider prediction intervals. The choice of method therefore depends on the target nominal coverage probability. As the nominal coverage probability increases, the median interval width of conformal quantile regression increases more than proportionally compared with that of frequentist quantile regression. The coverage guarantee thus becomes more costly in terms of width as the nominal coverage probability rises, making conformal quantile regression more advantageous at lower nominal coverage probabilities.
Main authors: Besi, Edoardo de
Subject: Car Pricing; Machine Learning; Conformal Prediction; Conformal Quantile Regression
Year: 2023
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade de Lisboa
Language: English
Source: Repositório da Universidade de Lisboa
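
The abstract contrasts plain quantile regression intervals with conformal quantile regression, which widens (or narrows) those intervals to meet the nominal coverage level. The following is a minimal sketch of the standard split-conformal calibration step, not the thesis's own implementation. The function names `cqr_correction` and `conformalize` are hypothetical, and the lower and upper quantile predictions are assumed to come from any already-fitted quantile model (for instance a LightGBM model, as in the thesis).

```python
import math


def cqr_correction(y_cal, lo_cal, hi_cal, alpha):
    """Correction q for split conformal quantile regression (hypothetical helper).

    y_cal  : true prices on a held-out calibration set
    lo_cal : predicted lower quantiles for the same rows
    hi_cal : predicted upper quantiles for the same rows
    alpha  : miscoverage level, so nominal coverage is 1 - alpha
    """
    # Conformity score: how far each true value falls outside its interval
    # (negative when the value lies strictly inside).
    scores = sorted(
        max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lo_cal, hi_cal)
    )
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this coverage level:
        # only an unbounded interval carries the guarantee.
        return math.inf
    return scores[k - 1]


def conformalize(lo, hi, q):
    """Shift both interval ends outward by q (inward when q is negative)."""
    return lo - q, hi + q
```

A larger correction `q` means the raw quantile intervals under-covered on the calibration set, which is one way to read the width cost of the coverage guarantee that the abstract reports at higher nominal coverage probabilities.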