Document Details

Clinically Relevant Sound-based Features in COVID-19 Identification

Author(s): Matias, Pedro ; Costa, Joao ; Carreiro, Andre V. ; Gamboa, Hugo ; Sousa, Ines ; Gomez, Pedro ; Sousa, Joana ; Neuparth, Nuno ; Carreiro Martins, Pedro ; Soares, Filipe

Date: 2022

Persistent Identifier: http://hdl.handle.net/10362/144963

Source: Repositório Institucional da UNL

Subject(s): COVID-19; data-centric; Databases; Feature extraction; feature extraction; Larynx; Lungs; machine learning; Machine learning; Pandemics; Respiratory system; signal processing; Signal processing; speech; Speech recognition; vocal tract; Computer Science(all); Materials Science(all); Engineering(all); Electrical and Electronic Engineering


Description

While the COVID-19 pandemic remains active in most countries worldwide, rapid diagnosis continues to be crucial for mitigating the impact of seasonal infection waves. Commercial rapid antigen self-tests have proved unable to cope with the most demanding periods, when limited availability led to rising costs. A non-invasive, cost-free, and more decentralized technology capable of giving people feedback on their probability of COVID-19 infection would fill these gaps. This paper explores sound-based analysis of vocal and respiratory audio data to achieve that objective. It presents a modular, data-centric machine learning pipeline for COVID-19 identification from voice and respiratory audio samples. Signals are processed to extract and classify relevant segments that contain informative events, such as coughing or breathing. Temporal, amplitude, spectral, cepstral, and phonetic features are extracted from the audio and combined with available metadata for COVID-19 identification. Audio augmentation and data balancing techniques are used to mitigate class imbalance. The open-access Coswara and COVID-19 Sounds datasets were used to test the performance of the proposed architecture. The sensitivity scores obtained ranged from 60.00% to 80.00% on Coswara and from 51.43% to 77.14% on COVID-19 Sounds. Although previous works report higher accuracy in COVID-19 detection, this research focused on a data-centric approach: validating the quality of the samples, segmenting the speech events, and exploring interpretable features with physiological meaning. As the pandemic evolves, its lessons must endure, and pipelines such as the one proposed will help prepare for new stages in which quick and easy disease identification is essential.
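As a rough illustration of two of the feature families named in the description (amplitude and temporal) and of the reported metric (sensitivity), the sketch below computes a frame's root-mean-square amplitude and zero-crossing rate, and the true-positive rate of a set of predictions. It is a minimal example under stated assumptions and not the authors' pipeline: the function names, the toy sine-wave frame, and the toy labels are all hypothetical, and the record does not say which specific features or balancing methods the paper uses.

```python
import math

def rms(frame):
    """Root-mean-square amplitude of one frame of samples (an amplitude feature)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (a temporal feature)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def sensitivity(y_true, y_pred):
    """True-positive rate TP / (TP + FN), with 1 marking the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

# Hypothetical input: a 100 Hz sine sampled at 8 kHz for 20 ms (160 samples).
frame = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]
print("RMS:", rms(frame))
print("ZCR:", zero_crossing_rate(frame))

# Hypothetical labels: three positive cases, two of which are detected.
print("Sensitivity:", sensitivity([1, 1, 1, 0, 0], [1, 0, 1, 0, 1]))
```

Sensitivity counts only the positive cases, so it reflects how many infected individuals a classifier catches regardless of how it performs on negatives.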

Document Type: Scientific article
Language: English
Contributor(s): LIBPhys-UNL; Faculdade de Ciências e Tecnologia (FCT); NOVA Medical School|Faculdade de Ciências Médicas (NMS|FCM); Comprehensive Health Research Centre (CHRC) - pólo NMS; RUN