Publication

Variable selection methods in high-dimensional regression: a simulation study

Bibliographic details
Abstract: A challenging problem in the analysis of high-dimensional data is variable selection. In this study, we describe a bootstrap-based technique for selecting predictors in partial least-squares regression (PLSR) and principal component regression (PCR) in high-dimensional data. Using a bootstrap-based technique for significance tests of the regression coefficients, a subset of the original variables can be selected to be included in the regression, thus obtaining a more parsimonious model with smaller prediction errors. We compare the bootstrap approach with several variable selection approaches (jack-knife and sparse formulation-based methods) on PCR and PLSR in simulation and real data.
Main authors: Shahriari, Shirin
Other authors: Faria, Susana; Gonçalves, A. Manuela
Subject: High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap
Year: 2015
Country: Portugal
Document type: article
Access type: restricted access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho
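
The abstract describes selecting predictors by bootstrap significance tests on regression coefficients. The following is a minimal sketch of that general idea for PCR only, not the authors' procedure: it assumes case resampling, a percentile interval, and a fixed number of components, none of which the record specifies. The function names `pcr_coefficients` and `bootstrap_select` and the synthetic data are hypothetical.

```python
import numpy as np


def pcr_coefficients(X, y, n_components):
    """PCR coefficients expressed on the original (centred) variables."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Rows of Vt are the principal directions of the centred predictors.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                 # (p, k) loadings
    scores = Xc @ V                         # (n, k) component scores
    # Scores are orthogonal with squared norms s**2, so the regression
    # of y on the scores reduces to one division per component.
    gamma = (scores.T @ yc) / (s[:n_components] ** 2)
    return V @ gamma                        # back to the p original variables


def bootstrap_select(X, y, n_components=2, n_boot=500, alpha=0.05, seed=0):
    """Keep variables whose bootstrap percentile interval excludes zero.

    Assumption: resampling cases and using a percentile interval is one
    common choice; the paper may use a different test.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)    # sample rows with replacement
        boot[b] = pcr_coefficients(X[idx], y[idx], n_components)
    lower = np.quantile(boot, alpha / 2, axis=0)
    upper = np.quantile(boot, 1 - alpha / 2, axis=0)
    selected = np.flatnonzero((lower > 0) | (upper < 0))
    return selected, lower, upper


# Hypothetical high-dimensional example: more variables than observations.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(40, 100))
y_demo = 2.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(size=40)
selected_demo, _, _ = bootstrap_select(X_demo, y_demo, n_components=3, n_boot=200)
print("selected variable indices:", selected_demo)
```

The model would then be refitted on the selected columns only. Because the coefficients are mapped back to the original variables before the intervals are computed, sign flips of individual components between bootstrap samples do not affect the result.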
datacite.date.Accepted.fl_str_mv 2015-01-01T00:00:00Z
datacite.rights.fl_str_mv http://purl.org/coar/access_right/c_16ec
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.identifier.none.fl_str_mv https://hdl.handle.net/1822/43688
dc.language.none.fl_str_mv eng
dc.publisher.none.fl_str_mv Taylor and Francis
dc.type.none.fl_str_mv http://purl.org/coar/resource_type/c_6501
fulltext.url.fl_str_mv https://prod-dspace.uminho.pt/bitstreams/ec749577-4a5f-4675-ad4c-9c18462abbab/download
id rum_bc2f224f85a2d9364db5bdc4e68c8b88
oai_identifier_str oai:repositorium.uminho.pt:1822/43688
ISSN (is part of): 0361-0918; 1532-4141
DOI (is part of): 10.1080/03610918.2013.833231
Hosting institution: Universidade do Minho (repositorium@usdb.uminho.pt)
File sizes: 696731 bytes; 100748 bytes
Full text (application/pdf): https://prod-dspace.uminho.pt/bitstreams/ec749577-4a5f-4675-ad4c-9c18462abbab/download
Full text (application/pdf): https://prod-dspace.uminho.pt/bitstreams/daf468da-87d2-4108-b816-a3d82b437dac/download