Publication

Estimation of relapse probability in early stages non-small cell lung cancer patients

Bibliographic details
Abstract: In Europe, lung cancer is the third most prevalent cancer in women and the second most common in men. With an expected 1.8 million deaths in 2020, it remains the leading cause of cancer mortality worldwide. This number is estimated to increase in the coming years, causing alarm among global health organisations attempting to reverse the trend. Although early diagnosis and treatment have improved in the hope of increasing survival, recurrence remains a significant problem: between 30% and 70% of patients with early-stage lung cancer who undergo surgery experience a relapse. A promising strategy is to apply machine learning algorithms to electronic health record data to produce a more reliable risk stratification and better identify a patient’s propensity to relapse, with the aim of improving survival rates and patient quality of life. For this purpose, this research developed three logistic regression models to predict recurrence in early-stage non-small cell lung cancer (NSCLC) patients at time horizons of one, three, and five years following surgery. The dissertation first presents a descriptive analysis of the dataset, describing each attribute used in the models. It then explains logistic regression, the K-fold cross-validation method, and the metrics used to assess the models’ performance. Finally, the implementation and results of the models are presented. The one-year model achieved an accuracy of 91.65%, while the three-year and five-year models achieved 89.71% and 89.94%, respectively. The AUC values were 91.65%, 89.16%, and 90.23% for the one-year, three-year, and five-year models, respectively. This dissertation was conducted in collaboration with the oncology department of the University Hospital Puerta Hierro de Majadahonda, within the European project CLARIFY.
Main authors: Pardal, Mariana Raimundo
Subject: Non-Small Cell Lung Cancers; Machine Learning; Logistic Regression; Probability of relapse
Year: 2022
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL
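
The abstract names three ingredients: logistic regression, K-fold cross-validation, and accuracy/AUC as performance metrics. The sketch below shows how these fit together on synthetic data. It is only an illustration under stated assumptions, not the dissertation's implementation: the two features, the sample size, the learning rate, and every function name (`fit_logistic`, `k_fold_indices`, `cross_validate`, and so on) are hypothetical, and the numbers it prints are unrelated to the reported results.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def accuracy(y_true, probs, threshold=0.5):
    # Fraction of samples whose thresholded prediction matches the label.
    hits = sum((p >= threshold) == bool(t) for t, p in zip(y_true, probs))
    return hits / len(y_true)


def auc(y_true, probs):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [p for t, p in zip(y_true, probs) if t == 1]
    neg = [p for t, p in zip(y_true, probs) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k, seed=0):
    # Shuffle the indices once, then deal them into k disjoint folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, k=5):
    """Train on k-1 folds, evaluate on the held-out fold, repeat for every fold."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        w, b = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        probs = predict_proba([X[i] for i in test_idx], w, b)
        y_test = [y[i] for i in test_idx]
        scores.append((accuracy(y_test, probs), auc(y_test, probs)))
    return scores


# Synthetic stand-in data (assumption): two features, binary "relapse" label.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1 if sigmoid(2.0 * x1 - 1.0 * x2) > rng.random() else 0 for x1, x2 in X]

scores = cross_validate(X, y, k=5)
mean_acc = sum(a for a, _ in scores) / len(scores)
mean_auc = sum(u for _, u in scores) / len(scores)
print(f"mean accuracy = {mean_acc:.3f}, mean AUC = {mean_auc:.3f}")
```

Averaging the per-fold scores gives one accuracy and one AUC per model. Under this scheme, a separate model would be fitted for each time horizon (one, three, and five years), each with its own relapse label.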