Automatic Essay Scoring has been attracting growing interest in Portuguese text processing. Among the available datasets, one stands out: a corpus of narrative essays written by students from 5th to 9th grade in Brazil. These essays were evaluated according to four traits: formal register, thematic coherence, narrative rhetorical structure, and textual cohesion. This work explores the development of a sy...
The task of Automatic Essay Scoring has attracted growing interest in the area of Portuguese text processing. Among the available datasets, one stands out: a corpus of narrative essays written by students from 5th to 9th grade of elementary school (ensino fundamental) in Brazil. These essays are evaluated according to four traits: formal register, thematic coherence, narrative rhetorical structure, and cohesion...
The assessment of text readability and the classification of texts by complexity level are essential for language education and for language-related industries that rely on effective communication. The Common European Framework of Reference for Languages (CEFR) provides a widely recognized framework for classifying language proficiency levels. This framework can be used not only to assess the proficiency of learner...
Dialog acts reveal the intention behind the uttered words. Their automatic recognition is therefore important for a dialog system trying to understand its conversational partner. The study presented in this article approaches that task on the DIHANA corpus, whose three-level dialog act annotation scheme poses problems that have not been explored in recent studies. In addition to the hierarchical problem, the two l...
Automatic speaker nativeness assessment has multiple applications, such as second language learning and interactive voice response (IVR) systems. In this paper, we treat it as a regression problem, since the available labels are on a continuous scale. Multiple approaches were applied, such as phonotactic models, i-vectors, and goodness of pronunciation, covering both segmental and suprasegmental features. Different phonotactic models were ...