Publication

The use of genetic programming for detecting the incorrect predictions of classification models

Bibliographic details
Abstract:	Companies around the world use Advanced Analytics to support their decision-making process. Traditionally they used Statistics and Business Intelligence for that, but as technology advances, more complex models are gaining popularity. The main reason for the increasing interest in Machine Learning and Deep Learning models is that they reach high prediction accuracy. On the other hand, good performance comes with increasing complexity of the programs. Therefore a new area of predictors, called Explainable AI, was introduced. The idea is to create models that can be understood by business users, or models that explain other predictions. We therefore propose a study in which we create a separate model that serves as a verifier for the predictions of machine learning models. This work falls into the area of post-processing of model outputs. For this purpose we select Genetic Programming (GP), which has proven successful in various applications. In the scope of this research we investigate whether GP can evaluate the predictions of other models. This area of application has not been explored yet; in the study we therefore explore the possibility of evolving an individual for the validation of another model. We focus on classification problems and select 4 machine learning models (logistic regression, decision tree, random forest, perceptron) and 3 different datasets. This setup is used to check that the presented idea holds across different problems. The performance of the 12 Genetic Programming experiments indicates that in some cases it is possible to create a successful model for error prediction. During the study we discovered that the performance of GP programs depends mostly on the dataset on which the experiment is conducted. The type of predictive model does not influence the performance of GP.
Although we managed to create good classifiers of errors, during the evolution process we faced the problem of overfitting, which is common in problems with imbalanced datasets. The results of the study confirm that GP can be used for this new type of problem and can successfully predict the errors of Machine Learning models.
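The experimental setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis code: the four model types come from the abstract, while the dataset names, the helper functions (`experiment_grid`, `error_labels`, `error_rate`) and the toy predictions are hypothetical placeholders. It shows how 4 models and 3 datasets give 12 experiments, and how the verifier's target can be derived by marking each prediction of the base model as correct or incorrect.

```python
from itertools import product

# The four model types are named in the abstract; the dataset names are
# hypothetical placeholders, since the abstract does not identify them.
MODELS = ["logistic regression", "decision tree", "random forest", "perceptron"]
DATASETS = ["dataset_A", "dataset_B", "dataset_C"]


def experiment_grid(models, datasets):
    """Return one (dataset, model) pair per error-detection experiment."""
    return list(product(datasets, models))


def error_labels(y_true, y_pred):
    """Build the verifier's target: 1 where the base model was wrong, else 0."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [int(t != p) for t, p in zip(y_true, y_pred)]


def error_rate(labels):
    """Share of incorrect predictions; a low value means an imbalanced target."""
    return sum(labels) / len(labels) if labels else 0.0


grid = experiment_grid(MODELS, DATASETS)

# Toy labels and predictions, invented purely for illustration.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
labels = error_labels(y_true, y_pred)

print(len(grid))           # number of experiments
print(labels)              # binary target for the verifier
print(error_rate(labels))  # fraction of base-model errors
```

Because an accurate base model produces few errors, the positive class of this target is usually small, which is consistent with the imbalance and overfitting issues the abstract reports. A GP classifier (or any other binary classifier) would then be trained on the original features with `labels` as its target.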
Main authors:	Napiórkowska, Adrianna Maria
Subject:	Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction
Year:	2020
Country:	Portugal
Document type:	master's dissertation
Access type:	open access
Associated institution:	Universidade Nova de Lisboa
Language:	English
Source:	Repositório Institucional da UNL

_version_	1868414157116669952
author	Napiórkowska, Adrianna Maria
author_facet	Napiórkowska, Adrianna Maria
author_role	author
contributor_name_str_mv	Vanneschi, Leonardo RUN
country_str	PT
creators_json_txt	[{\"Person.name\":\"Napiórkowska, Adrianna Maria\"}]
datacite.contributors.contributor.contributorName.fl_str_mv	Vanneschi, Leonardo RUN
datacite.creators.creator.creatorName.fl_str_mv	Napiórkowska, Adrianna Maria
datacite.date.Accepted.fl_str_mv	2020-02-21T00:00:00Z
datacite.date.available.fl_str_mv	2020-03-19T09:37:42Z
datacite.date.embargoed.fl_str_mv	2020-03-19T09:37:42Z
datacite.rights.fl_str_mv	http://purl.org/coar/access_right/c_abf2
datacite.subjects.subject.fl_str_mv	Machine Learning Explainable AI Post-processing Classification Genetic Programming Errors Prediction
datacite.titles.title.fl_str_mv	The use of genetic programming for detecting the incorrect predictions of classification models
dc.contributor.none.fl_str_mv	Vanneschi, Leonardo RUN
dc.creator.none.fl_str_mv	Napiórkowska, Adrianna Maria
dc.date.Accepted.fl_str_mv	2020-02-21T00:00:00Z
dc.date.available.fl_str_mv	2020-03-19T09:37:42Z
dc.date.embargoed.fl_str_mv	2020-03-19T09:37:42Z
dc.format.none.fl_str_mv	application/pdf
dc.identifier.none.fl_str_mv	http://hdl.handle.net/10362/94537
dc.language.none.fl_str_mv	eng
dc.rights.cclincense.fl_str_mv	http://creativecommons.org/licenses/by/4.0/
dc.rights.none.fl_str_mv	http://purl.org/coar/access_right/c_abf2
dc.subject.none.fl_str_mv	Machine Learning Explainable AI Post-processing Classification Genetic Programming Errors Prediction
dc.title.fl_str_mv	The use of genetic programming for detecting the incorrect predictions of classification models
dc.type.none.fl_str_mv	http://purl.org/coar/resource_type/c_bdcc
dirty	0
eu_rights_str_mv	openAccess
format	masterThesis
fulltext.url.fl_str_mv	https://run.unl.pt/bitstreams/25188ef6-0f26-4224-b302-aa8d109422f7/download
id	run_00a5ccfc5fbc1a608ffdac57b9f6a73a
identifier.url.fl_str_mv	http://hdl.handle.net/10362/94537
instacron_str	unl
institution	Universidade Nova de Lisboa
instname_str	Universidade Nova de Lisboa
language	eng
network_acronym_str	run
network_name_str	Repositório Institucional da UNL
oai_identifier_str	oai:run.unl.pt:10362/94537
organization_str_mv	urn:organizationAcronym:unl
person_str_mv	Napiórkowska, Adrianna Maria
publishDate	2020
reponame_str	Repositório Institucional da UNL
repository_id_str	urn:repositoryAcronym:run
service_str_mv	urn:repositoryAcronym:run
status	SINGLETON
subject.fl_str_mv	Machine Learning Explainable AI Post-processing Classification Genetic Programming Errors Prediction
title	The use of genetic programming for detecting the incorrect predictions of classification models
topic	Machine Learning Explainable AI Post-processing Classification Genetic Programming Errors Prediction
topic_facet	Machine Learning Explainable AI Post-processing Classification Genetic Programming Errors Prediction
url	http://hdl.handle.net/10362/94537
visible	1
