Publication
Variable selection methods in high-dimensional regression: a simulation study
| Abstract: | A challenging problem in the analysis of high-dimensional data is variable selection. In this study, we describe a bootstrap-based technique for selecting predictors in partial least-squares regression (PLSR) and principal component regression (PCR) in high-dimensional data. Using a bootstrap-based technique for significance tests of the regression coefficients, a subset of the original variables can be selected to be included in the regression, thus obtaining a more parsimonious model with smaller prediction errors. We compare the bootstrap approach with several variable selection approaches (jack-knife and sparse formulation-based methods) on PCR and PLSR in simulation and real data. |
|---|---|
| Main authors: | Shahriari, Shirin |
| Other authors: | Faria, Susana; Gonçalves, A. Manuela |
| Subject: | High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap |
| Year: | 2015 |
| Country: | Portugal |
| Document type: | article |
| Access type: | restricted access |
| Associated institution: | Universidade do Minho |
| Language: | English |
| Source: | RepositóriUM - Universidade do Minho |
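
The abstract describes selecting predictors by bootstrapping the regression coefficients and keeping those that test as significant. The following is a minimal sketch of that general idea for PCR only, not the authors' procedure: it assumes centred (unscaled) predictors, a percentile interval as the significance test, and NumPy as the only dependency. The function names `pcr_coefficients` and `bootstrap_select`, and the defaults `n_boot`, `alpha`, and `seed`, are hypothetical choices made for illustration.

```python
import numpy as np


def pcr_coefficients(X, y, n_components):
    """Principal component regression coefficients on centred data."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Thin SVD: Xc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = n_components
    # Regress y on the first k score vectors, then map the
    # component coefficients back to the original variables.
    return Vt[:k].T @ ((U[:, :k].T @ yc) / s[:k])


def bootstrap_select(X, y, n_components, n_boot=500, alpha=0.05, seed=0):
    """Flag variables whose bootstrap percentile interval excludes zero.

    Illustrative assumption: a two-sided percentile interval stands in
    for the significance test mentioned in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        boot[b] = pcr_coefficients(X[idx], y[idx], n_components)
    lower = np.percentile(boot, 100 * alpha / 2, axis=0)
    upper = np.percentile(boot, 100 * (1 - alpha / 2), axis=0)
    selected = (lower > 0) | (upper < 0)
    return selected, lower, upper
```

A call such as `selected, lower, upper = bootstrap_select(X, y, n_components=5)` returns a boolean mask over the columns of `X`; refitting the regression on `X[:, selected]` then gives the reduced model. The same resampling loop could wrap a PLSR fit instead of `pcr_coefficients`.
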
| _version_ | 1866876420481875968 |
|---|---|
| author | Shahriari, Shirin |
| author2 | Faria, Susana; Gonçalves, A. Manuela |
| author2_role | author author |
| author_facet | Shahriari, Shirin; Faria, Susana; Gonçalves, A. Manuela |
| author_role | author |
| contributor_name_str_mv | Universidade do Minho |
| country_str | PT |
| creators_json_txt | [{"Person.name":"Shahriari, Shirin"},{"Person.name":"Faria, Susana"},{"Person.name":"Gonçalves, A. Manuela"}] |
| datacite.contributors.contributor.contributorName.fl_str_mv | Universidade do Minho |
| datacite.creators.creator.creatorName.fl_str_mv | Shahriari, Shirin; Faria, Susana; Gonçalves, A. Manuela |
| datacite.date.Accepted.fl_str_mv | 2015-01-01T00:00:00Z |
| datacite.rights.fl_str_mv | http://purl.org/coar/access_right/c_16ec |
| datacite.subjects.subject.fl_str_mv | High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap |
| datacite.titles.title.fl_str_mv | Variable selection methods in high-dimensional regression: a simulation study |
| dc.contributor.none.fl_str_mv | Universidade do Minho |
| dc.creator.none.fl_str_mv | Shahriari, Shirin; Faria, Susana; Gonçalves, A. Manuela |
| dc.date.Accepted.fl_str_mv | 2015-01-01T00:00:00Z |
| dc.format.none.fl_str_mv | application/pdf application/pdf |
| dc.identifier.none.fl_str_mv | https://hdl.handle.net/1822/43688 |
| dc.language.none.fl_str_mv | eng |
| dc.publisher.none.fl_str_mv | Taylor and Francis |
| dc.rights.none.fl_str_mv | http://purl.org/coar/access_right/c_16ec |
| dc.subject.none.fl_str_mv | High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap |
| dc.title.fl_str_mv | Variable selection methods in high-dimensional regression: a simulation study |
| dc.type.none.fl_str_mv | http://purl.org/coar/resource_type/c_6501 |
| dirty | 0 |
| eu_rights_str_mv | restrictedAccess |
| format | article |
| fulltext.url.fl_str_mv | https://prod-dspace.uminho.pt/bitstreams/ec749577-4a5f-4675-ad4c-9c18462abbab/download |
| id | rum_bc2f224f85a2d9364db5bdc4e68c8b88 |
| identifier.url.fl_str_mv | https://hdl.handle.net/1822/43688 |
| instacron_str | repositorium |
| institution | Universidade do Minho |
| instname_str | Universidade do Minho |
| language | eng |
| network_acronym_str | rum |
| network_name_str | RepositóriUM - Universidade do Minho |
| oai_identifier_str | oai:repositorium.uminho.pt:1822/43688 |
| organization_str_mv | urn:organizationAcronym:repositorium |
| person_str_mv | Shahriari, Shirin; Faria, Susana; Gonçalves, A. Manuela |
| publishDate | 2015 |
| publisher.none.fl_str_mv | Taylor and Francis |
| reponame_str | RepositóriUM - Universidade do Minho |
| repository_id_str | urn:repositoryAcronym:rum |
| service_str_mv | urn:repositoryAcronym:rum |
| spelling | Language: eng; Publisher: Taylor and Francis; Title: Variable selection methods in high-dimensional regression: a simulation study; Authors: Shahriari, Shirin / Faria, Susana / Gonçalves, A. Manuela; Hosting institution: Universidade do Minho (e-mail: repositorium@usdb.uminho.pt); ISSN (IsPartOf): 0361-0918, 1532-4141; DOI (IsPartOf): 10.1080/03610918.2013.833231; Dates: 2015, 2013, 2015-01-01T00:00:00Z; Handle: https://hdl.handle.net/1822/43688; Access right: http://purl.org/coar/access_right/c_16ec (restricted access); Subjects: High-dimensional data / Partial least-squares regression / Principal component regression / Variable selection / Bootstrap; File sizes: 696731 bytes, 100748 bytes; Resource type: literature, http://purl.org/coar/resource_type/c_6501 (journal article); Full text (application/pdf): https://prod-dspace.uminho.pt/bitstreams/ec749577-4a5f-4675-ad4c-9c18462abbab/download, https://prod-dspace.uminho.pt/bitstreams/daf468da-87d2-4108-b816-a3d82b437dac/download |
| spellingShingle | Variable selection methods in high-dimensional regression: a simulation study; Shahriari, Shirin; High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap |
| status | SINGLETON |
| subject.fl_str_mv | High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap |
| title | Variable selection methods in high-dimensional regression: a simulation study |
| title_full | Variable selection methods in high-dimensional regression: a simulation study |
| title_fullStr | Variable selection methods in high-dimensional regression: a simulation study |
| title_full_unstemmed | Variable selection methods in high-dimensional regression: a simulation study |
| title_short | Variable selection methods in high-dimensional regression: a simulation study |
| title_sort | Variable selection methods in high-dimensional regression: a simulation study |
| topic | High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap |
| topic_facet | High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap |
| url | https://hdl.handle.net/1822/43688 |
| visible | 1 |