Publication

Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia

Bibliographic details
Abstract:	This dissertation investigates predictive modeling for estimating the Length of Stay (LOS) of Intensive Care Unit (ICU) patients diagnosed with pneumonia. It employs a Deep Learning (DL)-based approach and compares it with Random Forest (RF) and Linear Regression (LR) models using the VertiCare database. Models are evaluated with several metrics, including R-squared (R2), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and a novel Tolerance metric (accuracy of predictions within a range or interval of 3 days). The results begin with a baseline approach and proceed with incremental improvements. These enhancements include additional variables, encodings, patient group segmentation, and Box-Cox transformations. Notably, patient segmentation, which keeps only patients diagnosed with pneumonia at hospital admission and excludes patients with prescription gaps in the database, emerges as a significant catalyst for improved model performance. RF stood out, achieving R2 and RMSE values of around 0.62 and 5.7, respectively, without Box-Cox transformation of LOS. With Box-Cox transformation, MAE and Tolerance improved, reaching approximately 3.0 and 73.5%, respectively, with RF. To optimize the DL model, the focus shifts to weight pruning, which aims to reduce the complexity of the model while maximizing its predictive power. Two distinct pruning methodologies are explored, namely All-at-Once (successively higher pruning percentages applied to the original model) and Iterative (successive pruning percentages applied to the previously pruned model). The Iterative pruning approach emerges as the superior option, achieving R2 values of around 0.65 and MAE values of approximately 3.0 for DL models with pruning percentages of up to 80%. This underscores the potential of Iterative pruning as a valuable optimization tool. Additionally, a novel feature selection technique is introduced. 
This method combines the insights from Iterative pruning with feature importance based on the magnitude of the weights and identifies the optimal number of features as 12 of the 33 tested, combined with an 80% pruning threshold, aiming for a balance between model complexity and predictive accuracy. This work pioneers the application of these techniques in LOS prediction for ICU pneumonia patients using this database, offering insights for hospital resource management and decision-making.
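The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract describes Tolerance only as the accuracy of predictions within a range of 3 days, so the code below assumes this means the share of predictions whose absolute error is at most 3 days; the dissertation's exact definition may differ. The function names and the sample values are hypothetical and are not taken from the dissertation.

```python
import math


def tolerance(y_true, y_pred, tol_days=3.0):
    """Share of predictions whose absolute error is at most tol_days.

    Assumed reading of the Tolerance metric; not the author's code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol_days)
    return hits / len(y_true)


def mae(y_true, y_pred):
    """Mean Absolute Error, in the same unit as the inputs (days)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error, in the same unit as the inputs (days)."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Made-up LOS values in days, purely illustrative.
actual = [4.0, 10.0, 7.0, 21.0]
predicted = [5.5, 6.0, 7.5, 19.0]

print(f"Tolerance (3 days): {tolerance(actual, predicted):.2%}")
print(f"MAE:  {mae(actual, predicted):.2f} days")
print(f"RMSE: {rmse(actual, predicted):.2f} days")
```

Under this reading, Tolerance rewards any prediction that is close enough to be useful for planning and, unlike RMSE, is not dominated by a few very long stays.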
Main authors:	Carvalho, João Miguel Santos
Subject:	Machine learning; Deep learning; Regression; Pruning; Data science; Healthcare; Length of stay; Pneumonia; Intensive care
Year:	2023
Country:	Portugal
Document type:	master's dissertation
Access type:	open access
Associated institution:	Universidade de Aveiro
Language:	English
Source:	RIA - Repositório Institucional da Universidade de Aveiro

_version_	1866173224617771008
author	Carvalho, João Miguel Santos
author_facet	Carvalho, João Miguel Santos
author_role	author
country_str	PT
creators_json_txt	[{"Person.name":"Carvalho, João Miguel Santos"}]
datacite.creators.creator.creatorName.fl_str_mv	Carvalho, João Miguel Santos
datacite.date.Accepted.fl_str_mv	2023-11-02T00:00:00Z
datacite.date.available.fl_str_mv	2024-05-14T10:30:02Z
datacite.date.embargoed.fl_str_mv	2024-05-14T10:30:02Z
datacite.rights.fl_str_mv	http://purl.org/coar/access_right/c_abf2
datacite.subjects.subject.fl_str_mv	Machine learning Deep learning Regression Pruning Data science Healthcare Length of stay Pneumonia Intensive care
datacite.titles.title.fl_str_mv	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia
dc.creator.none.fl_str_mv	Carvalho, João Miguel Santos
dc.date.Accepted.fl_str_mv	2023-11-02T00:00:00Z
dc.date.available.fl_str_mv	2024-05-14T10:30:02Z
dc.date.embargoed.fl_str_mv	2024-05-14T10:30:02Z
dc.description.none.fl_str_mv	This dissertation addresses predictive modeling to estimate the Length of Stay of patients admitted to Intensive Care Units and diagnosed with pneumonia. It uses a Deep Learning-based approach and compares it with Random Forest and Linear Regression models, using a VertiCare database. Model evaluation uses several metrics, including R-squared, Mean Absolute Error, Root Mean Squared Error, and a new Tolerance metric (accuracy of predictions within an interval of 3 days). The results begin with a baseline approach and proceed with incremental improvements. These improvements include additional variables, encodings, patient group segmentation, and Box-Cox transformations. Notably, patient segmentation, which keeps only patients diagnosed with pneumonia at hospital admission and excludes patients with prescription gaps in the database, emerges as a significant catalyst for improved model performance. Random Forest stood out, reaching R-squared and Root Mean Squared Error values of around 0.62 and 5.7, respectively, without the Box-Cox transformation. With the Box-Cox transformation applied, Mean Absolute Error and Tolerance improved, yielding values of approximately 3.0 and 73.5%, respectively. In seeking to optimize the Deep Learning model, the focus turns to weight pruning, aiming to reduce model complexity and maximize its predictive power. Two distinct pruning methodologies are explored, namely "All-at-Once" (successively higher pruning percentages applied to the original model) and "Iterative" (successive pruning percentages applied to the previously pruned model). 
The iterative pruning approach stands out as the superior option, reaching R-squared values of around 0.65 and Mean Absolute Error values of approximately 3.0 for Deep Learning models with pruning percentages of up to 80%. This highlights the potential of iterative pruning as a valuable optimization tool. Additionally, a new feature selection technique is introduced. This method combines the insights from iterative pruning with feature importance and revealed that a possible ideal number of features is 12 of the 33 tested, combined with a pruning value of 80%, aiming for a balance between model complexity and prediction accuracy. This work pioneers the application of these techniques to Length of Stay prediction for pneumonia patients admitted to Intensive Care Units using this database, offering insights for hospital resource management and decision-making.
dc.format.none.fl_str_mv	application/pdf
dc.identifier.none.fl_str_mv	http://hdl.handle.net/10773/41871
dc.language.none.fl_str_mv	eng
dc.rights.none.fl_str_mv	http://purl.org/coar/access_right/c_abf2
dc.subject.none.fl_str_mv	Machine learning Deep learning Regression Pruning Data science Healthcare Length of stay Pneumonia Intensive care
dc.title.fl_str_mv	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia
dc.type.none.fl_str_mv	http://purl.org/coar/resource_type/c_bdcc
dirty	0
eu_rights_str_mv	openAccess
format	masterThesis
id	ria_e705d71e4c4f83f5a99f86f45ede930a
identifier.url.fl_str_mv	http://hdl.handle.net/10773/41871
instacron_str	ua
institution	Universidade de Aveiro
instname_str	Universidade de Aveiro
language	eng
network_acronym_str	ria
network_name_str	RIA - Repositório Institucional da Universidade de Aveiro
oai_identifier_str	oai:ria.ua.pt:10773/41871
organization_str_mv	urn:organizationAcronym:ua
person_str_mv	Carvalho, João Miguel Santos
publishDate	2023
reponame_str	RIA - Repositório Institucional da Universidade de Aveiro
repository_id_str	urn:repositoryAcronym:ria
service_str_mv	urn:repositoryAcronym:ria
spelling	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia; Carvalho, João Miguel Santos; Handle: http://hdl.handle.net/10773/41871; 2024-05-14T10:30:02Z; 2023-11-02T00:00:00Z; 2023-11-02; open access (http://purl.org/coar/access_right/c_abf2); Machine learning; Deep learning; Regression; Pruning; Data science; Healthcare; Length of stay; Pneumonia; Intensive care; application/pdf, 4892034 bytes; fulltext: https://ria.ua.pt/bitstream/10773/41871/1/Documento_Jo%c3%a3o_Carvalho.pdf; other research product; master thesis (http://purl.org/coar/resource_type/c_bdcc)
spellingShingle	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia Carvalho, João Miguel Santos Machine learning Deep learning Regression Pruning Data science Healthcare Length of stay Pneumonia Intensive care
status	SINGLETON
subject.fl_str_mv	Machine learning Deep learning Regression Pruning Data science Healthcare Length of stay Pneumonia Intensive care
title	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia
title_full	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia
title_fullStr	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia
title_full_unstemmed	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia
title_short	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia
title_sort	Deep learning in the prediction of the length of stay for intensive care unit patients diagnosed with pneumonia
topic	Machine learning Deep learning Regression Pruning Data science Healthcare Length of stay Pneumonia Intensive care
topic_facet	Machine learning Deep learning Regression Pruning Data science Healthcare Length of stay Pneumonia Intensive care
url	http://hdl.handle.net/10773/41871
visible	1
