Publicação

Applying LLM-based entity matching for hierarchical product categorization in e-commerce

Ver documento

Detalhes bibliográficos
Resumo:This research explored techniques to improve Large Language Models performance for Hi erarchical Product Classification (HPC), including optimized fine-tuning, optimal prompting techniques, taxonomy-specific Knowledge Graphs, leveraging Retrieval-Augmented Genera tion, and implementing LLM-based Entity Matching. Tested on benchmark datasets Icecat and WDC-222, these methods significantly enhanced LLMs’ ability to solve HPC tasks across var ious scenarios. Results achieved a hierarchical F1-score (hF) of 0.921, surpassing traditional DL benchmarks (0.85 hF). While not outperforming proprietary models like GPT, the proposed approaches offer a cost-efficient and effective alternative for businesses, demonstrating strong performance without reliance on expensive LLM solutions.
Autores principais:Markwardt, Elias
Assunto:Large Language Models Hierarchical classification E-Commerce In-Context Learning Fine tuning Prompt Engineering Knowledge graphs Retrieval Augmented Generation Entity matching
Ano:2025
País:Portugal
Tipo de documento:dissertação de mestrado
Tipo de acesso:acesso embargado
Instituição associada:Universidade Nova de Lisboa
Idioma:inglês
Origem:Repositório Institucional da UNL
Descrição
Resumo:This research explored techniques to improve Large Language Models performance for Hi erarchical Product Classification (HPC), including optimized fine-tuning, optimal prompting techniques, taxonomy-specific Knowledge Graphs, leveraging Retrieval-Augmented Genera tion, and implementing LLM-based Entity Matching. Tested on benchmark datasets Icecat and WDC-222, these methods significantly enhanced LLMs’ ability to solve HPC tasks across var ious scenarios. Results achieved a hierarchical F1-score (hF) of 0.921, surpassing traditional DL benchmarks (0.85 hF). While not outperforming proprietary models like GPT, the proposed approaches offer a cost-efficient and effective alternative for businesses, demonstrating strong performance without reliance on expensive LLM solutions.