Author(s):
Narotamo, Hemaxi; Dias, Mariana; Santos, Ricardo; Carreiro, André V.; Gamboa, Hugo; Silveira, Margarida
Date: 2024
Persistent ID: http://hdl.handle.net/10362/178851
Origin: Repositório Institucional da UNL
Project/scholarship:
info:eu-repo/grantAgreement/FCT/OE/SFRH%2FBD%2F151375%2F2021/PT;
info:eu-repo/grantAgreement/FCT//2020.04511.BD/PT;
Subject(s): Cardiovascular diseases; Convolutional neural networks; Deep learning; Electrocardiogram classification; Multimodal artificial intelligence; Recurrent neural networks; Signal Processing; Biomedical Engineering; Health Informatics; SDG 3 - Good Health and Well-being
Description
Publisher Copyright: © 2024 The Author(s)
The improved diagnosis of cardiovascular diseases (CVD) from electrocardiograms (ECG) may help reduce their severity. Since Deep Learning (DL) became popular, several DL methods have been developed for ECG classification. In this work, we compare how different methods for ECG signal representation perform in the multi-label classification of CVDs, including recent attention-based strategies. Furthermore, multimodal fusion strategies are employed to improve the prediction capacity of the individual representation networks. The publicly available PTB-XL ECG dataset, which contains 21,837 records with labels for the diagnosis of 4 CVDs, was used for this task. Two DL strategies with different processing approaches were compared. The first operates on the raw 1D signal and exploits the temporal dependence between signal values, namely through Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), and 1D-Convolutional Neural Network models. In the second, the raw ECG was converted into image representations, based on recent work, and classification was performed using distinct 2D-Convolutional Neural Networks. The potential of multimodal DL was then studied through early, late, and joint data fusion strategies, to evaluate the benefit of resorting to multiple representations. Results based on the 1D ECG representation outperformed the image-based approaches and the multimodal models. The best model, a GRU, achieved sensitivity and specificity of 79.67% and 81.04%, respectively.
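As a minimal illustration of how the reported macro-averaged sensitivity and specificity could be computed for a multi-label task such as this one, the sketch below evaluates per-class metrics over a 4-class prediction matrix and averages them. The function name and the toy label matrices are hypothetical, not taken from the paper; the paper's exact averaging and thresholding choices may differ.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Macro-averaged sensitivity and specificity for multi-label
    binary predictions (one column per class)."""
    sens, spec = [], []
    for c in range(y_true.shape[1]):
        t, p = y_true[:, c], y_pred[:, c]
        tp = np.sum((t == 1) & (p == 1))  # true positives
        fn = np.sum((t == 1) & (p == 0))  # false negatives
        tn = np.sum((t == 0) & (p == 0))  # true negatives
        fp = np.sum((t == 0) & (p == 1))  # false positives
        sens.append(tp / (tp + fn))  # per-class sensitivity (recall)
        spec.append(tn / (tn + fp))  # per-class specificity
    return float(np.mean(sens)), float(np.mean(spec))

# Toy example with 4 classes (hypothetical labels, one row per record)
y_true = np.array([[1, 0, 0, 1],
                   [0, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 1, 1]])
y_pred = np.array([[1, 0, 0, 1],
                   [0, 1, 1, 0],
                   [1, 0, 0, 0],
                   [0, 0, 1, 1]])
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In a multi-label setting each record may carry several diagnoses at once, so metrics are computed per class and then averaged, rather than over a single exclusive label.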