Publication

Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia

Bibliographic Details
Summary:	This dissertation delves into predictive modeling for estimating the Length of Stay (LOS) of Intensive Care Unit (ICU) patients diagnosed with pneumonia. It employs a Deep Learning (DL)-based approach and compares it with Random Forest (RF) and Linear Regression (LR) models using the VertiCare database. Models are evaluated with several metrics, including R-squared (R2), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and a novel Tolerance metric (accuracy of predictions within a range or interval of 3 days). The results begin with a baseline approach and proceed with incremental improvements. These enhancements include additional variables, codifications, patient group segmentation, and Box-Cox transformations. Notably, patient segmentation, which keeps exclusively patients diagnosed with pneumonia at hospital admission and excludes patients with database prescription gaps, emerges as a significant catalyst for improved model performance. RF stood out, achieving R2 and RMSE values of around 0.62 and 5.7, respectively, without Box-Cox transformation on LOS. With Box-Cox transformation, MAE and Tolerance improved, reaching approximately 3.0 and 73.5%, respectively, with RF. To optimize the DL model, the focus shifts to weight pruning, aiming to reduce the complexity of the model while maximizing its predictive power. Two distinct pruning methodologies are explored, namely All-at-Once (applying successively higher pruning percentages to the original model) and Iterative (applying successive pruning percentages to the previously pruned model). The Iterative pruning approach emerges as the superior option, achieving R2 values of around 0.65 and MAE values of approximately 3.0 for DL models with pruning percentages of up to 80%. This underscores the potential of Iterative pruning as a valuable optimization tool. Additionally, a novel feature selection technique is introduced. This method combines the insights from Iterative pruning with feature importance based on the magnitude of the weights, and reveals the optimal number of features to be 12 out of the 33 tested, combined with an 80% pruning threshold, aiming for a balance between model complexity and predictive accuracy. This work pioneers the application of these techniques in LOS prediction for ICU pneumonia patients using this database, offering insights for hospital resource management and decision-making.
Main Authors:	Carvalho, João Miguel Santos
Subject:	Machine learning Deep learning Regression Pruning Data science Healthcare Length of stay Pneumonia Intensive care
Year:	2023
Country:	Portugal
Document type:	master thesis
Access type:	open access
Associated institution:	Universidade de Aveiro
Language:	English
Origin:	RIA - Repositório Institucional da Universidade de Aveiro
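
Illustrative note: the summary describes the Tolerance metric only as the accuracy of predictions within a range of 3 days and does not give a formula. The sketch below shows one plausible reading, assuming a prediction counts as correct when its absolute error is at most 3 days. The function name `tolerance`, the `days` parameter, and the sample values are hypothetical and are not taken from the dissertation.

```python
import numpy as np


def tolerance(y_true, y_pred, days=3.0):
    """Fraction of predictions whose absolute error is at most `days`.

    Assumption: "within a range of 3 days" means |predicted - actual| <= 3.
    The dissertation may define the interval differently.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) <= days))


# Made-up LOS values in days, for illustration only.
actual = [2.0, 5.0, 10.0, 21.0]
predicted = [4.0, 5.5, 14.0, 19.0]
score = tolerance(actual, predicted)
```

Under this reading, a Tolerance of 73.5% would mean that roughly three out of four predicted stays fall within 3 days of the observed stay.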
