Publication
Topic Modeling of Multilingual Customer Reviews: A Study in the Running Footwear Domain
| Abstract: | This thesis investigates the use of multilingual topic modeling for analyzing short online reviews about running shoes. A custom dataset of approximately 30,000 reviews written in Italian, English, and French was collected through web scraping. The aim was to extract recurring themes from these reviews using BERTopic, a neural topic modeling technique based on sentence embeddings, UMAP, HDBSCAN, and class-based TF-IDF. The analysis followed the CRISP-DM framework and involved multiple iterations of preprocessing, modeling, and evaluation. Given the absence of labeled data, sentiment was approximated using the star rating provided by users. An initial attempt to apply Aspect-Based Sentiment Analysis (ABSA) was discarded due to the lack of annotated data and unsatisfactory early results. The multilingual version of BERTopic successfully revealed interpretable themes such as fit, cushioning, durability, and performance. Nevertheless, several limitations emerged. Many reviews were extremely short or generic, reducing topic coherence. Language imbalance introduced biases in topic frequency, and limited computing power constrained the scope of experimentation. Despite these constraints, the results demonstrate the potential of multilingual topic modeling as a scalable and language-flexible approach to extracting actionable insights from unstructured customer feedback. Future research may improve these outcomes by employing larger transformer-based models, developing better preprocessing for short texts, and incorporating human validation to enhance topic interpretability. |
|---|---|
| Main authors: | Bovenga, Giulia |
| Subject: | Topic Modeling; BERTopic; Multilingual Texts; Running Shoes; Natural Language Processing |
| Year: | 2025 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | embargoed access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |
Related records
Customer Review Analysis
by: Tueschen, Philipp
Published: 2022
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
by: Jardim, João Bruno Morais de Sousa
Published: 2021
Making open-ended questions attractive: leveraging topic modelling to evaluate survey responses
by: Schühle, Annika
Published: 2025
Transforming texts to maps: geovisualizing topics in texts
by: Thapa, Mahesh
Published: 2018
MLT-prealigner: a tool for multilingual text alignment
by: Carvalho, Pedro
Published: 2014
The application of Anna Karenina Principle on Portuguese Restaurants Online Reviews: A BERTopic Approach
by: Melo, Dinis Cortilho
Published: 2025
Idea Engineering: Design and Implementation of a Decision Support System for Generating Research Topics
by: Rodrigues, Carolina Ochoa Gomes
Published: 2026
Topic Modeling
by: Amaro, Ana
Published: 2024
Text Mining Research Project: Internship at Ageas Portugal
by: Teixeira, Daniel Rocha
Published: 2021
Sentiment analysis in online reviews classification using text mining technique
by: Moreno, A.
Published: 2019
Accommodating Multilingualism in Macedonia
by: Treneska-Deskoska, Renata
Published: 2017
Multilingualism within scholarly communication in SSH: a literature review
by: Balula, Ana
Published: 2021
Entity Recognition and Linking for Biomedical Documents: Applying recent Transformer-based Entity Recognition and Linking Algorithms for the Biomedical Domain, to a Multi-Lingual Scenario
by: Gonçalves, Rodrigo Miguel Gameiro Vilhena
Published: 2024
Topic modelling: a consistent framework for comparative studies and its practical application
by: Amaro, Ana Margarida Rocha
Published: 2022
L'intercompréhension: d'une approche multilingue à une approche plurilingue
by: Capucho, Filomena
Published: 2016
Collaborative mass customization of footwear: conceptualization of a three-stage holistic model
by: Oliveira, Nelson
Published: 2022
Learning the language of schooling in multilingual contexts
by: Gonçalves, Carolina Maria Dias
Published: 2016
Is there a place for heritage languages in the promotion of an intercultural and multilingual education in the Portuguese schools?
by: Faneca, Rosa Maria
Published: 2016
Is multilingualism seen as added-value in bibliodiversity? A literature review focussed on business and research contexts
by: Balula, Ana
Published: 2019
Social Inclusion Through Multilingual Assistants in Additional Language Learning
by: St John, Oliver
Published: 2023
Multilingual ontologies creation
by: Monteiro, Simão Freitas
Published: 2023
Collaborative mass customization in the Portuguese footwear cluster: expectations versus reality
by: Oliveira, Nelson José Novais
Published: 2022
Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank
by: Zeyrek, Deniz
Published: 2018
Social Inclusion and Multilingualism: Linguistic Justice and Language Policy
by: Csata, Zsombor
Published: 2021
A Framework for Multilingual Ontology Mapping
by: Trojahn, Cássia
Published: 2009
Multilingual bi-encoder models for biomedical entity linking
by: Guven, Zekeriya Anil
Published: 2023
Researching in Multilingual Spaces: Addressing Methodological, Ethical, and Epistemological Implications
by: Holzinger, Clara
Published: 2026
Urban Multilingualism and the Civic University: A Dynamic, Non-Linear Model of Participatory Research
by: Matras, Yaron
Published: 2017
Towards customized footwear with improved comfort
by: Teixeira, Rafaela Marisa Fernandes
Published: 2021
Multilingualism and interpreting: training for the EU
by: Antunes Garcia Anacleto Matias, Maria Helena
Published: 2011
Insights on consumer online purchase decisions of women’s footwear
by: Silva, Susana
Published: 2018
AI Conversational Agent to solve multilingual administrative questions
by: Alegria, Rodrigo Daniel Sapateiro
Published: 2024
Multilingual encounters in online video practices: the case of Portuguese university students
by: Shafirova, Liudmila
Published: 2024
Gestão de equipas multilingues – estratégias para promover a compreensão e a colaboração
by: Gomes, Miriam Monteiro
Published: 2025
Translanguaging Towards Equitable Participation: Doing Research Multilingually With People With a Migration Background
by: MacDonald, Erin Gail
Published: 2026
Sentiment classification of consumer generated online reviews using topic modeling
by: Calheiros, A. C.
Published: 2017
The impact of language technologies in the legal domain
by: Trancoso, I.
Published: 2023
Context-driven Semantic Parsing to expand cross-domain Text-to-SQL
by: Nascimento, Inês Daniela Cardoso
Published: 2025
Conducting Research Across Three Languages in a Multilingual Space: Polish Immigrants in Alanya
by: Karaköse, Gizem
Published: 2026
A comunicação multilingue: uma experiência na Delphi (Braga)
by: Castro, Laura Braga de
Published: 2016