Publication
Lisa: A touristic chatbot for Lisbon
| Abstract: | As Lisbon continues to attract a growing number of visitors, the development of a tailored chatbot catering to tourists' unique needs becomes increasingly valuable. In this thesis, we develop an engaging general-purpose chatbot that can fulfill the unique needs of Lisbon's tourists. Utilizing a web-scraped knowledge base with over 2000 web pages, the chatbot offers recommendations for tourist routes, events, and places to visit, and answers queries about Lisbon. Two evaluation datasets, one for question-answering and the other for recommendations, were created based on synthetic data. Various experiments, including data preprocessing, exploration of different ChatGPT models, and improvements to the retrieval-augmented generation pipeline, were conducted to improve the chatbot. This thesis contributes to the literature on chatbot development, emphasizing the benefits of more advanced machine learning models in the tourism industry. It also demonstrates the potential of iterative optimization of large language models and evaluation based on synthetic data for downstream tasks. |
|---|---|
| Main authors: | Cruz, Miguel Almeida Coutinho Teixeira da |
| Subject: | chatbot; transformer; ChatGPT; tourism; natural language processing; SDG 11 - Sustainable cities and communities |
| Year: | 2024 |
| Country: | Portugal |
| Document type: | master's thesis |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |