Publication
A Machine Learning Approach to Predict Health Insurance Claims
| Abstract: | In the Tailor Made policies branch, health insurance contracts are usually renewed annually. At Multicare, the renewal process starts three months before the end of the client's annuity, by estimating the costs of the last quarter from the information available for the first three quarters. This estimate is critical to the renewal: if it is too high, the client will overpay for their insurance and might seek more competitive alternatives; if it is too low, the company will incur losses. This part of the renewal process is currently performed by a time series algorithm, specifically an ARIMA model. This project aims to build a machine learning-based model that gives Multicare more accurate estimates of the cost and frequency of claims in the Inpatient coverage. Several algorithms were tested (Linear and Logistic Regression, Decision Trees, Random Forests, Gradient Boosting and XGBoost), and their results were compared with those of the current ARIMA model. The study showed that a machine learning technique, XGBoost, is more accurate than ARIMA: it projects costs 9% above the real costs, against ARIMA's global error of -25%. These conclusions can lead to changes in Multicare's approach to predicting claim costs and, consequently, in its way of doing business. |
|---|---|
| Main authors: | Cordeiro, Miguel Filipe Martins |
| Subject: | Claim Forecasting; Ensemble; Health Insurance; Machine Learning; Tailor Made Policies; XGBoost |
| Year: | 2023 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |