Publication

Transfer Learning for Automatic Essay Scoring

Bibliographic details
Abstract: Automatic Essay Scoring for Portuguese has been receiving considerable attention. Among the available datasets, one stands out: a corpus of narrative essays written by students from 5th to 9th grade in Brazil. These essays were evaluated according to four traits: formal register, thematic coherence, narrative rhetorical structure, and textual cohesion. This work explores the development of a system that draws on knowledge from another dataset (built from texts produced for the Brazilian national entrance exam, ENEM) and from other tasks (textual complexity and legibility analysis). The resulting system combines neural models, handcrafted features calculated by textual analysis software, and feature selection through a Two Stage Learning algorithm. With this system, state-of-the-art performance improved by 9% for the first trait, 5.5% for the third, and 8.9% for the fourth.
Main authors: Silveira, Igor Cataneo
Other authors: Ribeiro, Eugénio; Mamede, Nuno; Baptista, Jorge
Subject: automatic essay scoring; narrative; Portuguese (Portuguese keywords: correção automática de redação; narrativa; português)
Year: 2025
Country: Portugal
Document type: article
Access type: unknown
Associated institution: Universidade do Minho & Universidade de Vigo
Language: Portuguese
Source: Linguamática