Publication
Priority-Elastic net for binary disease outcome prediction based on multi-omics data
| Abstract: | Background: High-dimensional omics data integration has emerged as a prominent avenue within the healthcare industry, presenting substantial potential to improve predictive models. However, the data integration process faces several challenges, including data heterogeneity, priority sequence in which data blocks are prioritized for rendering predictive information contained in multiple blocks, assessing the flow of information from one omics level to the other, and multicollinearity. Methods: We propose the Priority-Elastic net algorithm, a hierarchical regression method extending Priority-Lasso for the binary logistic regression model by incorporating a priority order for blocks of variables while fitting Elastic-net models sequentially for each block. The fitted values from each step are then used as an offset in the subsequent step. Additionally, we considered the adaptive elastic-net penalty within our priority framework to compare the results. Results: The Priority-Elastic net and Priority-Adaptive Elastic net algorithms were evaluated on a brain tumor dataset available from The Cancer Genome Atlas (TCGA), accounting for transcriptomics, proteomics, and clinical information measured over two glioma types: Lower-grade glioma (LGG) and glioblastoma (GBM). Conclusion: Our findings suggest that the Priority-Elastic net is a highly advantageous choice for a wide range of applications. It offers moderate computational complexity, flexibility in integrating prior knowledge while introducing a hierarchical modeling perspective, and, importantly, improved stability and accuracy in predictions, making it superior to the other methods discussed. This evolution marks a significant step forward in predictive modeling, offering a sophisticated tool for navigating the complexities of multi-omics datasets in pursuit of precision medicine’s ultimate goal: personalized treatment optimization based on a comprehensive array of patient-specific data. This framework can be generalized to time-to-event, Cox proportional hazards regression and multicategorical outcomes. A practical implementation of this method is available upon request in R script, complete with an example to facilitate its application. |
|---|---|
| Main authors: | Musib, Laila |
| Other authors: | Coletti, Roberta; Lopes, Marta B.; Mouriño, Helena; Carrasquinha, Eunice |
| Subject: | Adaptive-Elastic net; Elastic-net; High-dimensional data; Logistic regression; Multi-omics data; Priority-Lasso; Biochemistry; Molecular Biology; Genetics; Computer Science Applications; Computational Theory and Mathematics; Computational Mathematics; SDG 3 - Good Health and Well-being |
| Year: | 2024 |
| Country: | Portugal |
| Document type: | article |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |
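
The sequential fitting described in the abstract's Methods (one elastic-net logistic model per block, in priority order, with each step's fitted linear predictor passed on as an offset) can be illustrated with a minimal Python sketch. This is not the authors' R implementation, which the record says is available on request. The function names `fit_enet_logistic` and `priority_elastic_net`, the proximal-gradient solver, the fixed `lam` and `alpha` values, the unpenalised intercept refitted at every step, and the synthetic data are all assumptions made for illustration; a real analysis would tune the penalty per block, for example by cross-validation.

```python
import numpy as np


def fit_enet_logistic(X, y, offset, lam=0.05, alpha=0.5, n_iter=500):
    """Elastic-net logistic regression with a fixed per-sample offset.

    Minimises mean log-loss + lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)
    by proximal gradient descent (an assumed solver); the intercept is unpenalised.
    """
    n, p = X.shape
    A = np.hstack([np.ones((n, 1)), X])  # intercept column + block features
    theta = np.zeros(p + 1)              # theta[0] is the intercept
    # Step size from a Lipschitz bound on the gradient of the smooth part.
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 / (4 * n) + lam * (1 - alpha))
    for _ in range(n_iter):
        z = np.clip(offset + A @ theta, -30, 30)
        prob = 1.0 / (1.0 + np.exp(-z))
        grad = A.T @ (prob - y) / n
        grad[1:] += lam * (1 - alpha) * theta[1:]  # ridge part of the penalty
        theta = theta - step * grad
        # Soft-thresholding applies the lasso part of the penalty.
        theta[1:] = np.sign(theta[1:]) * np.maximum(
            np.abs(theta[1:]) - step * lam * alpha, 0.0
        )
    return theta[0], theta[1:]


def priority_elastic_net(blocks, y, lam=0.05, alpha=0.5):
    """Fit blocks in priority order (highest first), chaining offsets."""
    offset = np.zeros(len(y))
    fits = []
    for X in blocks:
        b0, beta = fit_enet_logistic(X, y, offset, lam, alpha)
        # The fitted linear predictor of this step is the next step's offset.
        offset = offset + b0 + X @ beta
        fits.append((b0, beta))
    return fits, offset


# Synthetic stand-ins for a high-priority and a lower-priority block.
rng = np.random.default_rng(0)
n = 200
clinical = rng.normal(size=(n, 5))
omics = rng.normal(size=(n, 20))
y = (clinical[:, 0] + 0.5 * omics[:, 0] > 0).astype(float)

fits, linpred = priority_elastic_net([clinical, omics], y)
accuracy = np.mean((linpred > 0) == (y == 1))
print(f"training accuracy: {accuracy:.2f}")
```

Because each later block is fitted against the offset from the earlier ones, it can only explain what the higher-priority blocks left unexplained, which is how the priority order enters the model in this sketch.
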
| _version_ | 1868415415886020608 |
|---|---|
| author | Musib, Laila |
| author2 | Coletti, Roberta; Lopes, Marta B.; Mouriño, Helena; Carrasquinha, Eunice |
| author2_role | author author author author |
| author_facet | Musib, Laila; Coletti, Roberta; Lopes, Marta B.; Mouriño, Helena; Carrasquinha, Eunice |
| author_role | author |
| contributor_name_str_mv | Faculdade de Ciências e Tecnologia (FCT); CMA - Centro de Matemática e Aplicações; UNIDEMI - Unidade de Investigação e Desenvolvimento em Engenharia Mecânica e Industrial; BioMed Central (BMC); RUN |
| country_str | PT |
| creators_json_txt | [{"Person.name":"Musib, Laila"},{"Person.name":"Coletti, Roberta"},{"Person.name":"Lopes, Marta B."},{"Person.name":"Mouriño, Helena"},{"Person.name":"Carrasquinha, Eunice"}] |
| datacite.contributors.contributor.contributorName.fl_str_mv | Faculdade de Ciências e Tecnologia (FCT); CMA - Centro de Matemática e Aplicações; UNIDEMI - Unidade de Investigação e Desenvolvimento em Engenharia Mecânica e Industrial; BioMed Central (BMC); RUN |
| datacite.creators.creator.creatorName.fl_str_mv | Musib, Laila; Coletti, Roberta; Lopes, Marta B.; Mouriño, Helena; Carrasquinha, Eunice |
| datacite.date.Accepted.fl_str_mv | 2024-12-01T00:00:00Z |
| datacite.date.available.fl_str_mv | 2025-02-12T21:19:56Z |
| datacite.date.embargoed.fl_str_mv | 2025-02-12T21:19:56Z |
| datacite.rights.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| datacite.subjects.subject.fl_str_mv | Adaptive-Elastic net; Elastic-net; High-dimensional data; Logistic regression; Multi-omics data; Priority-Lasso; Biochemistry; Molecular Biology; Genetics; Computer Science Applications; Computational Theory and Mathematics; Computational Mathematics; SDG 3 - Good Health and Well-being |
| datacite.titles.title.fl_str_mv | Priority-Elastic net for binary disease outcome prediction based on multi-omics data |
| dc.contributor.none.fl_str_mv | Faculdade de Ciências e Tecnologia (FCT); CMA - Centro de Matemática e Aplicações; UNIDEMI - Unidade de Investigação e Desenvolvimento em Engenharia Mecânica e Industrial; BioMed Central (BMC); RUN |
| dc.creator.none.fl_str_mv | Musib, Laila; Coletti, Roberta; Lopes, Marta B.; Mouriño, Helena; Carrasquinha, Eunice |
| dc.date.Accepted.fl_str_mv | 2024-12-01T00:00:00Z |
| dc.date.available.fl_str_mv | 2025-02-12T21:19:56Z |
| dc.date.embargoed.fl_str_mv | 2025-02-12T21:19:56Z |
| dc.format.none.fl_str_mv | application/pdf |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/10362/178923 |
| dc.language.none.fl_str_mv | eng |
| dc.rights.none.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| dc.subject.none.fl_str_mv | Adaptive-Elastic net; Elastic-net; High-dimensional data; Logistic regression; Multi-omics data; Priority-Lasso; Biochemistry; Molecular Biology; Genetics; Computer Science Applications; Computational Theory and Mathematics; Computational Mathematics; SDG 3 - Good Health and Well-being |
| dc.title.fl_str_mv | Priority-Elastic net for binary disease outcome prediction based on multi-omics data |
| dc.type.none.fl_str_mv | http://purl.org/coar/resource_type/c_6501 |
| dirty | 0 |
| eu_rights_str_mv | openAccess |
| format | article |
| fulltext.url.fl_str_mv | https://run.unl.pt/bitstreams/b73e65e9-0ae8-414d-b814-3e02795dcded/download |
| funding.funder.alternateName_str_mv | FCT; FCT; FCT; FCT; FCT |
| funding.funder.identifier_str_mv | http://doi.org/10.13039/501100001871; http://doi.org/10.13039/501100001871; http://doi.org/10.13039/501100001871; http://doi.org/10.13039/501100001871; http://doi.org/10.13039/501100001871 |
| funding.funder.name_str_mv | Fundação para a Ciência e a Tecnologia; Fundação para a Ciência e a Tecnologia; Fundação para a Ciência e a Tecnologia; Fundação para a Ciência e a Tecnologia; Fundação para a Ciência e a Tecnologia |
| funding.name_str_mv | 6817 - DCRRNI ID; 6817 - DCRRNI ID; 6817 - DCRRNI ID; 6817 - DCRRNI ID; 3599-PPCDT |
| id | run_6c139bbdd1543554fd435ef6934bccbf |
| identifier.url.fl_str_mv | http://hdl.handle.net/10362/178923 |
| instacron_str | unl |
| institution | Universidade Nova de Lisboa |
| instname_str | Universidade Nova de Lisboa |
| language | eng |
| network_acronym_str | run |
| network_name_str | Repositório Institucional da UNL |
| oai_identifier_str | oai:run.unl.pt:10362/178923 |
| organization_str_mv | urn:organizationAcronym:unl |
| person_str_mv | Musib, Laila; Coletti, Roberta; Lopes, Marta B.; Mouriño, Helena; Carrasquinha, Eunice |
| publishDate | 2024 |
| reponame_str | Repositório Institucional da UNL |
| repository_id_str | urn:repositoryAcronym:run |
| service_str_mv | urn:repositoryAcronym:run |
| spelling | ISSN: 1756-0381; DOI: 10.1186/s13040-024-00401-0; Handle: http://hdl.handle.net/10362/178923; PURE: 107717659; PURE UUID: edeb2a33-eaf2-49f2-ab87-63f76eff4397; Scopus: 85208247672; WOS: 001344815100001; ORCID: /0000-0002-4135-1857/work/177968227; dates: 2025-02-12T21:19:56Z, 2024-12, 2024-12-01T00:00:00Z; format: application/pdf, 3467958 bytes; hosting institution: RUN (run@unl.pt); funding entries (Fundação para a Ciência e a Tecnologia, Crossref Funder ID http://doi.org/10.13039/501100001871): Centre of Statistics and its Applications (6817 - DCRRNI ID); Center for Mathematics and Applications (6817 - DCRRNI ID); Research and Development Unit for Mechanical and Industrial Engineering (6817 - DCRRNI ID, listed twice); Multi-omic networks in gliomas (3599-PPCDT); resource type: journal article (http://purl.org/coar/resource_type/c_6501); access: open access (http://purl.org/coar/access_right/c_abf2); full text: https://run.unl.pt/bitstreams/b73e65e9-0ae8-414d-b814-3e02795dcded/download |
| spellingShingle | Priority-Elastic net for binary disease outcome prediction based on multi-omics data; Musib, Laila; Adaptive-Elastic net; Elastic-net; High-dimensional data; Logistic regression; Multi-omics data; Priority-Lasso; Biochemistry; Molecular Biology; Genetics; Computer Science Applications; Computational Theory and Mathematics; Computational Mathematics; SDG 3 - Good Health and Well-being |
| status | SINGLETON |
| subject.fl_str_mv | Adaptive-Elastic net; Elastic-net; High-dimensional data; Logistic regression; Multi-omics data; Priority-Lasso; Biochemistry; Molecular Biology; Genetics; Computer Science Applications; Computational Theory and Mathematics; Computational Mathematics; SDG 3 - Good Health and Well-being |
| title | Priority-Elastic net for binary disease outcome prediction based on multi-omics data |
| topic | Adaptive-Elastic net; Elastic-net; High-dimensional data; Logistic regression; Multi-omics data; Priority-Lasso; Biochemistry; Molecular Biology; Genetics; Computer Science Applications; Computational Theory and Mathematics; Computational Mathematics; SDG 3 - Good Health and Well-being |
| topic_facet | Adaptive-Elastic net; Elastic-net; High-dimensional data; Logistic regression; Multi-omics data; Priority-Lasso; Biochemistry; Molecular Biology; Genetics; Computer Science Applications; Computational Theory and Mathematics; Computational Mathematics; SDG 3 - Good Health and Well-being |
| url | http://hdl.handle.net/10362/178923 |
| visible | 1 |