Publication
Chatbot for future non-EU students at NOVA: Leveraging LLMs to Support International Student Integration
| Abstract: | As the number of international students at NOVA IMS continues to grow, there is an increasing need for accessible and accurate support, particularly for students from outside the European Union who must navigate complex administrative and migration processes. This thesis addresses these challenges by exploring the development and optimization of a chatbot prototype based on small open-source Large Language Models (LLMs) to assist prospective non-EU students at NOVA IMS. The project followed a multi-stage methodology. First, a Retrieval-Augmented Generation (RAG) prototype was built using open-source tools, and the performance of three LLMs (Llama-3.1-8b-instruct, Mistral-7b-instruct-v0.2, and Phi4:14b) was evaluated. Next, different text chunking strategies were compared, including standard recursive character splitting and an AI-driven method that transforms documents into propositions before chunking. The RAG system’s performance was evaluated with the RAGAS framework on four metrics: faithfulness, answer relevancy, context precision, and context recall. Results indicated that the Mistral-7b-instruct-v0.2 model delivered the best performance. The AI-driven chunking method outperformed all other approaches on most evaluation metrics, achieving the highest scores in faithfulness (0.67), context precision (0.60), and context recall (0.56). Notably, its context recall score was 0.17 points (on a 0 to 1 scale) higher than that of the second-best method. The optimized prototype was tested by end users with the Chatbot Usability Scale (BUS). Users gave positive feedback on functionality but also highlighted the need for more concise answers and faster response times. This project demonstrates the viability of developing effective chatbots with small LLMs and concludes that AI-assisted corpus pre-processing is a useful method with a meaningful positive impact on RAG system performance. |
|---|---|
| Main authors: | Neira Rodríguez, Ignacio Javier |
| Subject: | Chatbot; Large Language Models; Chatbot Utility Assessment; Retrieval-Augmented Generation; SDG 4 - Quality education |
| Year: | 2025 |
| Country: | Portugal |
| Document type: | master's thesis |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |