Publication

The use of genetic programming for detecting the incorrect predictions of classification models

Bibliographic details
Abstract: Companies around the world use Advanced Analytics to support their decision-making processes. Traditionally they relied on Statistics and Business Intelligence, but as technology advances, more complex models are gaining popularity. The main reason for the growing interest in Machine Learning and Deep Learning models is that they reach high prediction accuracy. On the other hand, good performance comes with increasing model complexity. Therefore, a new area of predictors, called Explainable AI, was introduced. The idea is to create models that can be understood by business users, or models that explain the predictions of other models. We therefore propose a study in which we create a separate model that serves as a verifier for the predictions of machine learning models. This work falls into the area of post-processing of model outputs. For this purpose we select Genetic Programming (GP), which has proven successful in various applications. Within the scope of this research we investigate whether GP can evaluate the predictions of other models. This area of application has not yet been explored, so the study examines the possibility of evolving an individual that validates another model. We focus on classification problems and select four machine learning models (logistic regression, decision tree, random forest, and perceptron) and three different datasets. This setup is meant to ensure that the conclusions about the presented idea hold across different problems. The performance of the 12 Genetic Programming experiments indicates that in some cases it is possible to create a successful model for error prediction. During the study we found that the performance of GP programs depends mostly on the dataset on which the experiment is conducted. The type of predictive model does not influence the performance of GP.
Although we managed to create good classifiers of errors, we faced the problem of overfitting during the evolution process, which is common in problems with imbalanced datasets. The results of the study confirm that GP can be used for this new type of problem and can successfully predict the errors of Machine Learning models.
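The abstract does not spell out how the error-detection task is set up, so the following is only a minimal sketch of one plausible reading: the verifier's target is 1 wherever a base classifier's prediction disagrees with the true label, and one experiment is run per (model, dataset) pair. The dataset names, the function names, and the toy labels are assumptions made for illustration; they do not come from the thesis, and no Genetic Programming step is shown.

```python
from itertools import product

# Model families named in the abstract.
MODELS = ["logistic_regression", "decision_tree", "random_forest", "perceptron"]
# Hypothetical placeholders: the abstract only says "3 different datasets".
DATASETS = ["dataset_a", "dataset_b", "dataset_c"]


def error_targets(y_true, y_pred):
    """Return 1 where the base model's prediction is wrong, else 0."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [int(t != p) for t, p in zip(y_true, y_pred)]


def error_rate(targets):
    """Share of misclassified samples, i.e. the positive-class frequency."""
    return sum(targets) / len(targets) if targets else 0.0


# One error-detection experiment per (model, dataset) pair: 4 x 3 = 12.
experiments = list(product(MODELS, DATASETS))

if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    y_true = [0, 1, 1, 0, 1, 0, 0, 1]
    y_pred = [0, 1, 0, 0, 1, 0, 1, 1]
    targets = error_targets(y_true, y_pred)
    print(targets)              # binary target a verifier would be trained on
    print(error_rate(targets))  # a low rate means an imbalanced error class
    print(len(experiments))
```

Because a reasonably accurate base model makes few mistakes, the error class built this way is typically the minority class, which is consistent with the imbalance and overfitting issues the abstract reports.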
Main authors: Napiórkowska, Adrianna Maria
Subject: Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction
Year: 2020
Country: Portugal
Document type: master's thesis
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL
_version_ 1868414157116669952
author Napiórkowska, Adrianna Maria
author_role author
contributor_name_str_mv Vanneschi, Leonardo
RUN
country_str PT
datacite.date.Accepted.fl_str_mv 2020-02-21T00:00:00Z
datacite.date.available.fl_str_mv 2020-03-19T09:37:42Z
datacite.date.embargoed.fl_str_mv 2020-03-19T09:37:42Z
datacite.rights.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv http://hdl.handle.net/10362/94537
dc.language.none.fl_str_mv eng
dc.rights.cclincense.fl_str_mv http://creativecommons.org/licenses/by/4.0/
dc.type.none.fl_str_mv http://purl.org/coar/resource_type/c_bdcc
dirty 0
eu_rights_str_mv openAccess
format masterThesis
fulltext.url.fl_str_mv https://run.unl.pt/bitstreams/25188ef6-0f26-4224-b302-aa8d109422f7/download
id run_00a5ccfc5fbc1a608ffdac57b9f6a73a
instacron_str unl
institution Universidade Nova de Lisboa
language eng
network_acronym_str run
network_name_str Repositório Institucional da UNL
oai_identifier_str oai:run.unl.pt:10362/94537
organization_str_mv urn:organizationAcronym:unl
publishDate 2020
repository_id_str urn:repositoryAcronym:run
service_str_mv urn:repositoryAcronym:run
spelling eng; pt_PT; application/pdf; HostingInstitution (Organizational): RUN; e-mail: run@unl.pt; URN: urn:tid:202461807; 3127205 bytes; literature; master thesis
status SINGLETON
visible 1