Publication

Automating systematic review data extraction and visualization with large language models and hierarchical trees

Bibliographic details
Abstract: This study explores the integration of retrieval-augmented generation (RAG) and one-shot prompting for structured data extraction from scientific papers, particularly within the context of systematic reviews. It also evaluates the effectiveness of an interactive tree visualization for enabling exploratory data analysis of structured data extracted from these papers. The results demonstrate that the ensemble of RAG and one-shot prompting extracts structured data, such as dates, names, and numerical values, with high accuracy. However, challenges were observed in handling long-text data, highlighting areas for improvement in contextual understanding. The interactive tree visualization proved to be an effective tool for data exploration, allowing users to navigate and analyze structured data with ease. Feedback indicated high satisfaction with the visualization’s usability and interactivity. This study suggests that combining advanced data extraction techniques with interactive visualizations can significantly enhance the efficiency and effectiveness of systematic reviews and other data-driven research tasks.
Main authors: Cunha, Francisco Lopes da
Subject: Retrieval-augmented generation (RAG); One-shot prompting; Structured data extraction; Systematic reviews automation; Interactive tree visualization
Year: 2025
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho