Publication
Uncertainty-Aware AI for ECG arrhythmia multi-label classification
| Abstract: | Machine Learning (ML) models are able to predict a variety of diseases, with performances that can be superior to those achieved by healthcare professionals. However, when implemented in clinical settings as decision support systems, their generalisation capabilities are often compromised, making healthcare professionals more susceptible to delivering erroneous diagnoses. This research focuses on uncertainty measures as a key method to abstain from classifying samples with high uncertainty, as well as a selection criterion for active learning strategies. For this purpose, four large public multi-label electrocardiogram (ECG) databases were employed for the classification of cardiac arrhythmias. Regarding the uncertainty measures, single distribution uncertainty and classical information-theoretic measures of entropy were tested and compared. Three Deep Learning models were developed: a single convolutional neural network and two multi-model approaches using Monte Carlo Dropout and Deep Ensemble techniques. When tested with samples from the same database used for training, all models achieved F1-scores higher than 95%. However, when tested on an external dataset, their F1-scores dropped to approximately 70%, indicating a probable scenario of dataset shift. The Deep Ensemble model obtained the highest F1-score on both test sets, with a maximum difference of 3% from the others. For classification with a rejection option, the rejection rate increased from 10% to between 30% and 50%, depending on the model or uncertainty measure, with the highest rejection rates obtained on external data. This reveals that classifications on the external dataset have higher uncertainty, which is a further indication of dataset shift. For the active learning approach, the 10% of samples with the highest uncertainty were used to retrain the models. Performance increased by almost 5%, suggesting that uncertainty is a good selection criterion. Although there are still challenges to the implementation of ML models, these preliminary studies show that uncertainty quantification is a valuable method for classification with a rejection option and for active learning approaches under dataset shift conditions. |
|---|---|
| Main authors: | Simão, Raquel Filipa Birra |
| Subject: | Uncertainty Quantification; Monte Carlo Dropout; Deep Ensemble; Dataset shift; Active Learning |
| Year: | 2022 |
| Country: | Portugal |
| Document type: | master's thesis |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |
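
The abstract describes entropy-based uncertainty computed from multi-model predictions (Monte Carlo Dropout passes or Deep Ensemble members), then used both to reject uncertain samples and to pick samples for active learning. The following is a minimal sketch of that idea, not the thesis implementation. It assumes per-label sigmoid outputs, binary entropy averaged over labels, and a quantile-based rejection threshold; all function names and the array layout are hypothetical.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy in bits of independent Bernoulli label probabilities."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def uncertainty_scores(member_probs):
    """Per-sample uncertainty from several stochastic predictions.

    member_probs: array of shape (n_members, n_samples, n_labels), where each
    member is one ensemble model or one dropout forward pass (assumed layout).
    Returns (total, expected, mutual_information), each of shape (n_samples,).
    """
    mean_probs = member_probs.mean(axis=0)
    # Entropy of the averaged prediction, averaged over labels.
    total = binary_entropy(mean_probs).mean(axis=1)
    # Average entropy of the individual members.
    expected = binary_entropy(member_probs).mean(axis=0).mean(axis=1)
    # Disagreement between members (mutual information).
    return total, expected, total - expected


def reject_mask(scores, rejection_rate):
    """True for samples whose uncertainty lies above the (1 - rate) quantile."""
    threshold = np.quantile(scores, 1.0 - rejection_rate)
    return scores > threshold


def select_for_labelling(scores, fraction=0.10):
    """Indices of the most uncertain samples, most uncertain first."""
    n_select = max(1, int(round(fraction * len(scores))))
    return np.argsort(scores)[::-1][:n_select]
```

In this sketch, `reject_mask(total, 0.3)` would flag roughly the 30% most uncertain samples for abstention, and `select_for_labelling(total, 0.10)` would return the 10% most uncertain samples as candidates for retraining. Which score to use (total entropy or mutual information) is a design choice the abstract does not settle.
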
| _version_ | 1865920612049879040 |
|---|---|
| author | Simão, Raquel Filipa Birra |
| author_facet | Simão, Raquel Filipa Birra |
| author_role | author |
| contributor_name_str_mv | Gamboa, Hugo RUN |
| country_str | PT |
| creators_json_str | [{\"Person.name\":\"Simão, Raquel Filipa Birra\"}] |
| datacite.contributors.contributor.contributorName.fl_str_mv | Gamboa, Hugo RUN |
| datacite.creators.creator.creatorName.fl_str_mv | Simão, Raquel Filipa Birra |
| datacite.date.Accepted.fl_str_mv | 2022-11-01T00:00:00Z |
| datacite.date.available.fl_str_mv | 2023-09-01T13:03:03Z |
| datacite.date.embargoed.fl_str_mv | 2023-09-01T13:03:03Z |
| datacite.rights.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| datacite.subjects.subject.fl_str_mv | Uncertainty Quantification Monte Carlo Dropout Deep Ensemble Dataset shift Active Learning |
| datacite.titles.title.fl_str_mv | Uncertainty-Aware AI for ECG arrhythmia multi-label classification |
| dc.contributor.none.fl_str_mv | Gamboa, Hugo RUN |
| dc.creator.none.fl_str_mv | Simão, Raquel Filipa Birra |
| dc.date.Accepted.fl_str_mv | 2022-11-01T00:00:00Z |
| dc.date.available.fl_str_mv | 2023-09-01T13:03:03Z |
| dc.date.embargoed.fl_str_mv | 2023-09-01T13:03:03Z |
| dc.format.none.fl_str_mv | application/pdf |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/10362/157128 |
| dc.language.none.fl_str_mv | eng |
| dc.rights.none.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| dc.subject.none.fl_str_mv | Uncertainty Quantification Monte Carlo Dropout Deep Ensemble Dataset shift Active Learning |
| dc.title.fl_str_mv | Uncertainty-Aware AI for ECG arrhythmia multi-label classification |
| dc.type.none.fl_str_mv | http://purl.org/coar/resource_type/c_bdcc |
| dirty | 0 |
| eu_rights_str_mv | openAccess |
| format | masterThesis |
| fulltext.url.fl_str_mv | https://run.unl.pt/bitstreams/25ee8c28-95e7-4a26-8b7e-aa2521cefeaa/download |
| id | run_4e3bd52bcffecb83eb8a6960698022ba |
| identifier.url.fl_str_mv | http://hdl.handle.net/10362/157128 |
| instacron_str | unl |
| institution | Universidade Nova de Lisboa |
| instname_str | Universidade Nova de Lisboa |
| language | eng |
| network_acronym_str | run |
| network_name_str | Repositório Institucional da UNL |
| oai_identifier_str | oai:run.unl.pt:10362/157128 |
| organization_str_mv | urn:organizationAcronym:unl |
| person_str_mv | Simão, Raquel Filipa Birra |
| publishDate | 2022 |
| reponame_str | Repositório Institucional da UNL |
| repository_id_str | urn:repositoryAcronym:run |
| service_str_mv | urn:repositoryAcronym:run |
| status | NEW |
| subject.fl_str_mv | Uncertainty Quantification Monte Carlo Dropout Deep Ensemble Dataset shift Active Learning |
| title | Uncertainty-Aware AI for ECG arrhythmia multi-label classification |
| title_full | Uncertainty-Aware AI for ECG arrhythmia multi-label classification |
| title_fullStr | Uncertainty-Aware AI for ECG arrhythmia multi-label classification |
| title_full_unstemmed | Uncertainty-Aware AI for ECG arrhythmia multi-label classification |
| title_short | Uncertainty-Aware AI for ECG arrhythmia multi-label classification |
| title_sort | Uncertainty-Aware AI for ECG arrhythmia multi-label classification |
| topic | Uncertainty Quantification Monte Carlo Dropout Deep Ensemble Dataset shift Active Learning |
| topic_facet | Uncertainty Quantification Monte Carlo Dropout Deep Ensemble Dataset shift Active Learning |
| url | http://hdl.handle.net/10362/157128 |
| visible | 1 |