Publication

Predicting Self-Regulated Learning Skills using Learning Analytics in Moodle: Towards Precision Education

Bibliographic details
Abstract: The traditional “one-size-fits-all” education paradigm often produces generalized predictive models that overlook differences between learning environments, which weakens academic success prediction. Precision education has emerged as a data-driven approach to personalized learning. Learning Management Systems (LMSs) offer rich behavioral data, but these data often lack a theoretical basis. Self-regulated learning (SRL) theory provides a framework to address this, but SRL is usually measured through self-reports, which are biased and difficult to scale. Building on prior research that used clickstream data as an objective proxy for SRL's time management subscale, this study, conducted at NOVA Information Management School, broadens the scope. It investigates whether Moodle LMS clickstream data can predict students’ SRL skills in three subscales (time management, effort regulation, and peer learning) with course-specific models. Data were collected from two graduate courses (Course A and Course B) in the first semester of 2024/2025. SRL targets were extracted from the Motivated Strategies for Learning Questionnaire. Each dataset–target pair was processed through a pipeline with ten configurations and seven algorithms, combining feature selection, dimensionality reduction, and data augmentation techniques. The top three models were shortlisted and tuned, and the best model was selected. For each SRL subscale, the most effective dataset was identified by comparing final models. Parametric models generally outperformed non-parametric ones. The best-performing models often showed moderate predictive performance, with models from Course B outperforming those from Course A in two of three subscales. Among the subscales, effort regulation achieved the lowest mean absolute error on the test set (MAE = 0.73), followed by time management (MAE = 0.83). In contrast, peer learning was the most challenging subscale to predict (MAE = 1.23), likely due to its offline and social characteristics.
Overall, feature–target relationships were weak, and most final models used only the minimum number of features, indicating limited signal. Additional limitations included sample imbalance, static feature design, and reliance on subjective self-report measures. Still, this study offers an initial step toward SRL prediction through behavioral learning analytics. Future research should expand the dataset, adopt time-dependent features, and improve theoretical alignment to support more robust and interpretable SRL prediction in precision education.
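As an illustration of the evaluation metric and the shortlisting step named in the abstract, the following is a minimal Python sketch, not the dissertation's implementation. The function names, the model names, and every number in it are hypothetical placeholders and are not results from the study; the abstract also does not state which score was used to rank candidates, so ranking by validation MAE is an assumption.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between questionnaire scores and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def shortlist(validation_mae, k=3):
    """Return the k candidate names with the lowest MAE (assumed ranking rule)."""
    return sorted(validation_mae, key=validation_mae.get)[:k]


# Hypothetical subscale scores and predictions, for illustration only.
y_true = [4.0, 5.5, 3.0, 6.0]
y_pred = [4.5, 5.0, 4.0, 6.0]
mae = mean_absolute_error(y_true, y_pred)

# Hypothetical validation errors for four candidate models.
candidates = {"ridge": 0.81, "lasso": 0.79, "knn": 0.95, "random_forest": 0.88}
top_three = shortlist(candidates)

print(mae)        # 0.5
print(top_three)  # ['lasso', 'ridge', 'random_forest']
```

An MAE is expressed in the same units as the target, so a value such as 0.73 reads directly as an average error of 0.73 points on the questionnaire subscale.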
Main authors: Cabral, Mariana Rodrigues
Subject: Clickstream data; Learning analytics; Moodle; Precision education; Self-regulated learning; SDG 4 - Quality education; SDG 9 - Industry, innovation and infrastructure
Year: 2025
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL