Publication

The reasons why the Regression Tree Method is more suitable than General Linear Model to analyze complex educational datasets

Bibliographic details
Abstract: Any quantitative method is shaped by certain rules or assumptions which constitute its own rationale. It is not by chance that these assumptions determine the conditions and constraints which permit the evidence to be constructed. In this article, we argue why the Regression Tree Method’s rationale is more suitable than that of the General Linear Model for analyzing complex educational datasets. Furthermore, we apply the CART algorithm of the Regression Tree Method and Multiple Linear Regression in a model with 53 predictors, taking as outcome the students’ reading scores in the 2011 edition of the National Exam of Upper Secondary Education (ENEM; N = 3,670,089), which is a complex educational dataset. This empirical comparison illustrates how the Regression Tree Method is better suited than the General Linear Model for furnishing evidence about non-linear relationships, as well as for dealing with nominal variables with many categories and with ordinal variables. We conclude that the Regression Tree Method constructs better evidence about the relationships between the predictors and the outcome in complex datasets.
Main authors: Gomes, Cristiano Mauro Assis
Other authors: Lemos, Gina C.; Jelihovschi, Enio G.
Subject: Regression tree model; general linear model; National Exam of Upper Secondary Education (ENEM); complex datasets
Year: 2021
Country: Portugal
Document type: article
Access type: open access
Associated institution: Fundação para a Ciência e Tecnologia
Language: English
Source: SciELO Portugal