Publication
Human-Robot Communication Interface based on Generative and Natural Language Models
| Abstract: | The interaction between humans and artificial intelligence has been extensively explored since the beginning of its development. Not only have its capabilities to assist in everyday tasks across various fields been highlighted, but its impact on human social life and its potential contributions, such as in therapeutic settings, have also been investigated. Nevertheless, a persistent barrier to the full coexistence of robots and humans is the limited flexibility of communication between the two, which significantly restricts their ability to cooperate. In recent years, AI platforms based on large language models have emerged, with ChatGPT being the best known and most widely used for a variety of purposes, ranging from research to composing diverse documents, and characterized by its "human-like" response style. In this dissertation, we explore the capabilities of this platform in conjunction with the development of voice and image recognition systems, aiming to create an interface that adapts both to the user and to the surrounding environment. The integration of voice and space recognition enables the robotic system to tailor its responses to specific interactions, leveraging ChatGPT's interpretative and text generation capabilities. This enriches the relationship fostered during the interaction, resulting in well-crafted responses. Responses are given entirely by voice, striving to emulate a "human-like" interaction as closely as possible. Additionally, facial expressions shown on the LED panel on the robot's head were incorporated, extending communication to non-verbal forms. Such a system holds considerable potential across various applications, given its ability to respond in a personalized manner to users and their surroundings. ChatGPT facilitates the generation of responses that mirror the communication styles of particular target demographics, thereby helping to foster the desired interactions. Furthermore, the context provided to the platform can easily be adjusted to establish the initial context and scenario. The interface was developed and tested, and its functionality was validated, revealing that communication was engaging and easily established, with polite, informative responses and good response speed. |
|---|---|
| Main authors: | Serra, Inês Margarida Silva |
| Subject: | Social Robots; Human-Robot Interaction; ROS; Mobile Robots; Image Processing |
| Year: | 2024 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade de Coimbra |
| Language: | English |
| Source: | Estudo Geral - Universidade de Coimbra |