Author(s):
Marcondes, Francisco Supino ; Gala, Adelino de C.O.S. ; Rodrigues, Manuel ; Almeida, J. J. ; Novais, Paulo
Date: 2025
Persistent Identifier: https://hdl.handle.net/1822/95171
Source: RepositóriUM - Universidade do Minho
Project/grant:
info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F00319%2F2020/PT;
Subject(s): ChatGPT; Lexicon annotation; LLMs; NLP; Natural Sciences::Computer and Information Sciences
Description
Lexicon annotation is a critical yet time-consuming task that can hold back language-intensive projects. This paper explores the potential of Large Language Models (LLMs) to automate lexicon annotation, a task traditionally performed by humans. As a proof of concept, we evaluate ChatGPT's performance in annotating VADER's sentiment lexicon. Our findings show that ChatGPT achieves fair performance on this task, suggesting that LLMs can serve as a valuable tool for initial annotation, with subsequent refinement by domain specialists. This approach could significantly accelerate lexicon development and maintenance while balancing efficiency and accuracy. Our study provides insights into the capabilities and limitations of LLMs in lexicon annotation, paving the way for further research on automating the development of linguistic resources.
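The kind of comparison described above, model-produced annotations checked against an existing human-annotated lexicon, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the description does not say which agreement metric was used, so Cohen's kappa is an assumption here, as are the helper names `polarity` and `cohen_kappa`, the binning threshold, and every score in the toy data (the values are invented, not taken from VADER or from ChatGPT output).

```python
from collections import Counter


def polarity(score, threshold=0.05):
    """Map a numeric valence score to a coarse label (threshold is an assumption)."""
    if score > threshold:
        return "pos"
    if score < -threshold:
        return "neg"
    return "neu"


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences of equal length."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented scores for illustration only: a "reference" lexicon and a "model" annotation.
reference_scores = {"good": 1.9, "bad": -2.5, "okay": 0.9, "horrible": -2.5, "table": 0.0}
model_scores = {"good": 2.0, "bad": -2.0, "okay": 0.0, "horrible": -3.0, "table": 0.0}

words = sorted(reference_scores)
reference_labels = [polarity(reference_scores[w]) for w in words]
model_labels = [polarity(model_scores[w]) for w in words]

kappa = cohen_kappa(reference_labels, model_labels)
print(f"Cohen's kappa on {len(words)} words: {kappa:.3f}")
```

Binning scores into three polarity classes discards intensity information. A correlation or mean absolute error over the raw scores would be a reasonable alternative if graded valence matters.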
This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020.