Publication

Speaker recognition for door opening systems

Bibliographic details
Abstract:	Besides being an important communication tool, the voice can also serve for identification, since it carries an individual signature for each person. Speaker recognition technologies can use this signature as an authentication method for access to restricted environments. This work explores the development and testing of machine learning and deep learning models, specifically the GMM, VGG-M, and ResNet50 models, for speaker-recognition access control, with the goal of building a system that grants access to CeDRI’s laboratory. The deep learning models were evaluated on their performance in recognizing speakers from audio samples, with the Equal Error Rate (EER) as the main measure of effectiveness. The models were first trained and tested on public datasets with 1251 to 6112 speakers and then fine-tuned on private datasets with 32 speakers from CeDRI’s laboratory. We compared the performance of the ResNet50, VGG-M, and GMM models for speaker verification. In experiments on our private datasets, the ResNet50 model outperformed the other models, achieving the lowest EER, 0.7%, on the Framed Silence Removed dataset. On the same dataset, the VGG-M model achieved an EER of 5% and the GMM model an EER of 2.13%. Our best model did not reach the current state-of-the-art EER of 2.87% on the VoxCeleb 1 verification dataset. However, our best implementation using ResNet50 achieved an EER of 5.96% while being trained on only a small fraction of the data normally used. This result suggests that our model is robust and efficient and leaves significant room for improvement. This thesis provides insights into the capabilities of these models in a real-world application, aiming to deploy the system on a platform for practical use in laboratory access authorization.
The results of this study contribute to the field of biometric security by demonstrating the potential of speaker recognition systems in controlled environments.
Main authors:	Manfron, Enrico
Subject:	Besides Communication tool Deep learning model
Year:	2023
Country:	Portugal
Document type:	master's dissertation
Access type:	open access
Associated institution:	Instituto Politécnico de Bragança
Language:	English
Source:	Biblioteca Digital do IPB
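
The abstract reports results as Equal Error Rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER can be estimated from verification scores; it is not the thesis's implementation. The function name `equal_error_rate` and the example scores are hypothetical, and the sketch assumes that a higher score means the two samples are more likely to come from the same speaker.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from two lists of similarity scores.

    genuine:  scores of same-speaker trials (should be accepted)
    impostor: scores of different-speaker trials (should be rejected)
    A trial is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best_gap = None
    best_eer = None
    best_threshold = None
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # False acceptance rate: impostor trials wrongly accepted.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False rejection rate: genuine trials wrongly rejected.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
            best_threshold = t
    return best_eer, best_threshold


# Hypothetical scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a handful of trials the two error rates rarely cross exactly, so the sketch reports the average of both rates at the closest threshold; evaluations on large trial lists usually interpolate along the error curve instead.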

Related records

Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets
by: Manfron, Enrico
Published: (2024)

Speaker recognition in door access control system
by: Manfron, E.
Published: (2023)

Speaker verification on small datasets with ResNet50
by: Manfron, Enrico
Published: (2024)

Deep learning model for doors detection a contribution for context awareness recognition of patients with Parkinson’s disease
by: Gonçalves, Helena Raquel Gouveia Silva
Published: (2023)

Smart Monitor Health System: Face Expressions Recognition
by: Laranjeira, Ana Filipa
Published: (2016)

Super-resolution face recognition: an approach using generative adversarial networks and joint-learn
by: Oliveira, Rafael Augusto de
Published: (2022)

BioTMPy: a Deep Learning-based tool to classify biomedical literature
by: Alves, Nuno
Published: (2021)

Deep learning models for atypical serotonergic cells recognition
by: Corradetti, Daniele
Published: (2024)

The role of background colour in pollen recognition task using CNN
by: Monteiro, Fernando C.
Published: (2021)

Pollen grain recognition through deep learning convolutional neural networks
by: Monteiro, Fernando C.
Published: (2022)

Improving deep learning face recognition for ID and travel document applications with quality assessment
by: Tremoço, João Francisco Gomes
Published: (2021)

Towards precise recognition of pollen bearing bees by convolutional neural networks
by: Monteiro, Fernando C.
Published: (2021)

A Comparison Study of Deep Learning Methodologies for Music Emotion Recognition
by: Louro, Pedro Lima
Published: (2024)

Modelling a Deep Learning Framework for recognition of human actions on video
by: Santos, Flávio
Published: (2021)

Deep learning and machine learning techniques applied to speaker identification on small datasets
by: Manfron, E.
Published: (2023)

The importance of blog as a communication tool to support the development of project-based learning
by: Vicente, Sérgio
Published: (2014)

Deep learning recognition of a large number of pollen grain types
by: Monteiro, Fernando C.
Published: (2021)

Human activity recognition for indoor localization using smartphone inertial sensors
by: Moreira, Dinis
Published: (2021)

Survey on Deep Fuzzy Systems in Regression Applications: A View on Interpretability
by: Júnior, Jorge S. S.
Published: (2023)

Interpretability of a deep learning model for rodents brain semantic segmentation
by: Matos, Leonardo Nogueira
Published: (2019)

Deep Face Recognition for Online Student Identification
by: Carreira, David Alexandre Mendes
Published: (2023)

Deep Learning for Deep Chemistry: Optimizing the Prediction of Chemical Patterns
by: Cova, Tânia F. G. G.
Published: (2019)

A tutorial on automatic hyperparameter tuning of deep spectral modelling for regression and classification tasks
by: Passos, Dário
Published: (2022)

Comparative assessment of protein large language models for enzyme commission number prediction
by: Capela, João
Published: (2025)

Deep learning techniques for grapevine variety classification using natural images
by: Pereira, Carlos Manuel Silva
Published: (2020)

Emotion recognition in multimedia content
by: Condesso, Sofia Fernandes
Published: (2025)

Techniques to reject atypical patterns
by: Lopes, Júlio Castro
Published: (2022)

Adversarial Attacks to Classification Systems
by: Leal, João Miguel Gouveia
Published: (2022)

A survey on the semi supervised learning paradigm in the context of speech emotion recognition
by: Andrade, Guilherme
Published: (2022)

Integration of convolutional and adversarial networks into building design: A review
by: Parente, Jean
Published: (2023)

Development of license plate detection and recognition system based on deep learning
by: Colaço, Bruno Tiago Campos
Published: (2024)

MERGE Audio: Music Emotion Recognition next Generation – Audio Classification with Deep Learning
by: Sá, Pedro Marques Alegre de
Published: (2021)

Obtaining deep learning models for automatic classification of leukocytes
by: Rodrigues, Pedro João
Published: (2020)

Vacancy state detector oriented to convolutional neural network, background subtraction and embedded systems
by: Corrêa, Isabelle de Moura
Published: (2019)

Robotics and entrepreneurship for a better society: opening doors to mobility
by: Martins, V.
Published: (2016)

Large Language Models na Extração e Interpretação de Informação Clínica em Texto Livre
by: Pais, Adelino Cristóvão
Published: (2025)

Optimizing Olive Disease Classification Through Hybrid Machine Learning and Deep Learning Techniques
by: Mendes, João
Published: (2024)

Smart embedded system for skin cancer classification
by: Durães, Pedro F. F.
Published: (2023)

Digital communication of museums in Porto and Northern Portugal
by: Cascais, Elisabete
Published: (2025)