Publication

Priority-Elastic net for binary disease outcome prediction based on multi-omics data

Bibliographic details
Abstract: Background: Integrating high-dimensional omics data has become a prominent research avenue in healthcare, with substantial potential to improve predictive models. The integration process nevertheless faces several challenges: data heterogeneity, choosing the order in which data blocks are prioritized when predictive information is spread across multiple blocks, assessing the flow of information from one omics level to another, and multicollinearity. Methods: We propose the Priority-Elastic net algorithm, a hierarchical regression method that extends Priority-Lasso for the binary logistic regression model. It imposes a priority order on blocks of variables and fits an Elastic-net model to each block in sequence; the fitted values from each step are used as an offset in the subsequent step. For comparison, we also considered the adaptive Elastic-net penalty within the same priority framework. Results: The Priority-Elastic net and Priority-Adaptive Elastic net algorithms were evaluated on a brain tumor dataset from The Cancer Genome Atlas (TCGA), comprising transcriptomics, proteomics, and clinical information measured over two glioma types: lower-grade glioma (LGG) and glioblastoma (GBM). Conclusion: Our findings suggest that the Priority-Elastic net is a highly advantageous choice for a wide range of applications. It offers moderate computational complexity, flexibility in integrating prior knowledge within a hierarchical modeling perspective, and, importantly, improved stability and accuracy in predictions, making it superior to the other methods discussed. This marks a significant step forward in predictive modeling, offering a sophisticated tool for navigating the complexities of multi-omics datasets in pursuit of precision medicine’s ultimate goal: personalized treatment optimization based on a comprehensive array of patient-specific data.
The framework can be generalized to time-to-event outcomes (Cox proportional hazards regression) and to multicategorical outcomes. An R script implementing the method, with a worked example, is available upon request.
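The sequential fitting described under Methods can be written out as follows. This is only a sketch under stated assumptions, not the authors' exact formulation: the symbols (blocks m = 1, ..., M in priority order, block design matrix rows x_i^{(m)}, offsets o_i^{(m)}, tuning parameters \lambda_m and \alpha_m) are introduced here for illustration, the penalty uses the common glmnet-style parametrization, and the offset is assumed to be carried on the linear-predictor scale.

```latex
% Assumed notation: y_i \in \{0,1\}, n samples, blocks m = 1,\dots,M in priority order,
% offset o_i^{(1)} = 0 for the highest-priority block.
\begin{align}
\eta_i^{(m)}(\beta_0,\beta) &= o_i^{(m)} + \beta_0 + x_i^{(m)\top}\beta, \\
\bigl(\hat\beta_0^{(m)},\hat\beta^{(m)}\bigr) &= \arg\min_{\beta_0,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Bigl[\,y_i\,\eta_i^{(m)}(\beta_0,\beta)
  - \log\bigl(1+e^{\eta_i^{(m)}(\beta_0,\beta)}\bigr)\Bigr]
  + \lambda_m\Bigl[\alpha_m\lVert\beta\rVert_1
  + \tfrac{1-\alpha_m}{2}\lVert\beta\rVert_2^2\Bigr], \\
o_i^{(m+1)} &= \eta_i^{(m)}\bigl(\hat\beta_0^{(m)},\hat\beta^{(m)}\bigr), \\
\hat p_i &= \frac{1}{1+\exp\bigl(-o_i^{(M+1)}\bigr)}.
\end{align}
```

Under the same assumptions, an adaptive variant would replace \lVert\beta\rVert_1 with a weighted sum \sum_j w_j\lvert\beta_j\rvert, where the weights w_j come from an initial estimate. Setting \alpha_m = 1 for every block recovers a Lasso penalty in each step, which is the sense in which the method extends Priority-Lasso.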
Main author: Musib, Laila
Other authors: Coletti, Roberta; Lopes, Marta B.; Mouriño, Helena; Carrasquinha, Eunice
Subject: Adaptive-Elastic net; Elastic-net; High-dimensional data; Logistic regression; Multi-omics data; Priority-Lasso; Biochemistry; Molecular Biology; Genetics; Computer Science Applications; Computational Theory and Mathematics; Computational Mathematics; SDG 3 - Good Health and Well-being
Year: 2024
Country: Portugal
Document type: article
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL
contributor_name_str_mv Faculdade de Ciências e Tecnologia (FCT)
CMA - Centro de Matemática e Aplicações
UNIDEMI - Unidade de Investigação e Desenvolvimento em Engenharia Mecânica e Industrial
BioMed Central (BMC)
RUN
country_str PT
datacite.date.Accepted.fl_str_mv 2024-12-01T00:00:00Z
datacite.date.available.fl_str_mv 2025-02-12T21:19:56Z
datacite.date.embargoed.fl_str_mv 2025-02-12T21:19:56Z
datacite.rights.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv http://hdl.handle.net/10362/178923
dc.language.none.fl_str_mv eng
dc.type.none.fl_str_mv http://purl.org/coar/resource_type/c_6501
eu_rights_str_mv openAccess
format article
fulltext.url.fl_str_mv https://run.unl.pt/bitstreams/b73e65e9-0ae8-414d-b814-3e02795dcded/download
funding.funder.alternateName_str_mv FCT
funding.funder.identifier_str_mv http://doi.org/10.13039/501100001871
funding.funder.name_str_mv Fundação para a Ciência e a Tecnologia
funding.name_str_mv 6817 - DCRRNI ID
3599-PPCDT
id run_6c139bbdd1543554fd435ef6934bccbf
identifier.url.fl_str_mv http://hdl.handle.net/10362/178923
instacron_str unl
institution Universidade Nova de Lisboa
instname_str Universidade Nova de Lisboa
language eng
network_acronym_str run
network_name_str Repositório Institucional da UNL
oai_identifier_str oai:run.unl.pt:10362/178923
organization_str_mv urn:organizationAcronym:unl
publishDate 2024
reponame_str Repositório Institucional da UNL
repository_id_str urn:repositoryAcronym:run
service_str_mv urn:repositoryAcronym:run
spelling Language: eng
Format: application/pdf (3467958 bytes)
Hosting institution: RUN (run@unl.pt)
ISSN (is part of): 1756-0381
PURE: 107717659
PURE UUID: edeb2a33-eaf2-49f2-ab87-63f76eff4397
Scopus: 85208247672
WOS: 001344815100001
ORCID: /0000-0002-4135-1857/work/177968227
DOI (is part of): 10.1186/s13040-024-00401-0
Dates: 2025-02-12T21:19:56Z; 2024-12; 2024-12-01T00:00:00Z
Handle: http://hdl.handle.net/10362/178923
Access right: http://purl.org/coar/access_right/c_abf2 (open access)
Resource type: literature; http://purl.org/coar/resource_type/c_6501 (journal article)
Full text: https://run.unl.pt/bitstreams/b73e65e9-0ae8-414d-b814-3e02795dcded/download
Funding (Fundação para a Ciência e a Tecnologia, Crossref Funder ID http://doi.org/10.13039/501100001871):
Centre of Statistics and its Applications, 6817 - DCRRNI ID
Center for Mathematics and Applications, 6817 - DCRRNI ID
Research and Development Unit for Mechanical and Industrial Engineering, 6817 - DCRRNI ID (listed twice)
Multi-omic networks in gliomas, 3599-PPCDT
status SINGLETON
url http://hdl.handle.net/10362/178923
visible 1