Publication
Variable selection methods in high-dimensional regression: a simulation study
| Abstract: | A challenging problem in the analysis of high-dimensional data is variable selection. In this study, we describe a bootstrap-based technique for selecting predictors in partial least-squares regression (PLSR) and principal component regression (PCR) in high-dimensional data. Using a bootstrap-based technique for significance tests of the regression coefficients, a subset of the original variables can be selected for inclusion in the regression, thus obtaining a more parsimonious model with smaller prediction errors. We compare the bootstrap approach with several other variable selection approaches (jack-knife and sparse formulation-based methods) for PCR and PLSR on simulated and real data. |
|---|---|
| Main author: | Shahriari, Shirin |
| Other authors: | Faria, Susana; Gonçalves, A. Manuela |
| Subject: | High-dimensional data; Partial least-squares regression; Principal component regression; Variable selection; Bootstrap |
| Year: | 2015 |
| Country: | Portugal |
| Document type: | article |
| Access type: | restricted access |
| Associated institution: | Universidade do Minho |
| Language: | English |
| Source: | RepositóriUM - Universidade do Minho |