Publication

A Machine Learning Approach to Predict Health Insurance Claims

Bibliographic details
Abstract: In the Tailor Made policies branch, renewing health insurance contracts is usually an annual process. At Multicare, this process starts three months before the end of the clients’ annuity, by estimating the costs of the last quarter using information from the first three. This estimate is critical to the renewal of insurance contracts: if it is too high, the client will overpay for their insurance and might seek more competitive alternatives; if it is too low, the company will incur losses. This part of the renewal process is currently performed by a time series algorithm, specifically an ARIMA model. This project aims to build a machine learning-based model that will provide Multicare with more accurate estimates of the cost and frequency of claims in the Inpatient coverage. Several algorithms were tested: Linear and Logistic Regression, Decision Trees, Random Forests, Gradient Boosting and XGBoost; their results were then compared with those of the current ARIMA model. The study showed that a machine learning technique, XGBoost, is more powerful than ARIMA, as it projects 9% above the real costs, against ARIMA’s global error of -25%. These conclusions can lead to changes in Multicare’s approach to predicting claim costs and, consequently, its way of doing business.
Main authors: Cordeiro, Miguel Filipe Martins
Subject: Claim Forecasting; Ensemble; Health Insurance; Machine Learning; Tailor Made Policies; XGBoost
Year: 2023
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL