Publication
The use of genetic programming for detecting the incorrect predictions of classification models
| Abstract: | Companies around the world use advanced analytics to support their decision-making processes. Traditionally they relied on statistics and business intelligence, but as technology advances, more complex models are gaining popularity. The main reason for the growing interest in machine learning and deep learning models is their high prediction accuracy. On the other hand, good performance comes with increasing model complexity. A new area of predictors, called Explainable AI, was therefore introduced. The idea is to create models that business users can understand, or models that explain other models' predictions. We therefore propose a study in which we create a separate model that serves as a verifier for the predictions of machine learning models. This work falls into the area of post-processing of model outputs. For this purpose we select Genetic Programming (GP), which has proven successful in various applications. Within the scope of this research we investigate whether GP can evaluate the predictions of other models. This area of application has not yet been explored, so in the study we explore the possibility of evolving an individual to validate another model. We focus on classification problems and select 4 machine learning models (logistic regression, decision tree, random forest, and perceptron) and 3 different datasets. This setup is used to test whether the presented idea is universal across different problems. The performance of the 12 Genetic Programming experiments indicates that in some cases it is possible to create a successful model for error prediction. During the study we found that the performance of GP programs depends mostly on the dataset on which the experiment is conducted. The type of predictive model does not influence the performance of GP. Although we managed to create good classifiers of errors, during the evolution process we faced the problem of overfitting, which is common in problems with imbalanced datasets. The results of the study confirm that GP can be used for this new type of problem and can successfully predict the errors of machine learning models. |
|---|---|
| Main authors: | Napiórkowska, Adrianna Maria |
| Subject: | Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction |
| Year: | 2020 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |
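
The abstract describes training a second model to flag the incorrect predictions of a base classifier, across 4 models and 3 datasets, with class imbalance as a known difficulty. The following is a minimal sketch of that setup under stated assumptions, not the thesis's implementation: the dataset names, the toy data, the `score` feature, the hand-written `candidate_verifier` rule (standing in for an evolved GP individual), and the choice of balanced accuracy as a fitness measure are all hypothetical.

```python
from itertools import product

# Base models named in the abstract; dataset names are placeholders (not given in the record).
MODELS = ["logistic regression", "decision tree", "random forest", "perceptron"]
DATASETS = ["dataset_1", "dataset_2", "dataset_3"]
EXPERIMENTS = list(product(DATASETS, MODELS))  # 3 x 4 = 12 (dataset, model) pairs


def error_labels(y_true, y_pred):
    """Return 1 where the base model's prediction is wrong, else 0."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [int(t != p) for t, p in zip(y_true, y_pred)]


def balanced_accuracy(labels, guesses):
    """Mean of the recall on the correct class (0) and on the error class (1)."""
    recalls = []
    for cls in (0, 1):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        if idx:  # skip a class that does not occur in the labels
            recalls.append(sum(guesses[i] == cls for i in idx) / len(idx))
    if not recalls:
        raise ValueError("labels must not be empty")
    return sum(recalls) / len(recalls)


def candidate_verifier(row):
    """Hypothetical rule standing in for an evolved GP individual."""
    return int(abs(row["score"] - 0.5) < 0.2)  # flag low-confidence predictions


# Toy data, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
rows = [{"score": s} for s in (0.9, 0.1, 0.45, 0.8, 0.2, 0.6, 0.95, 0.05)]

targets = error_labels(y_true, y_pred)          # what the verifier must predict
flags = [candidate_verifier(r) for r in rows]   # what the candidate predicts
print(len(EXPERIMENTS), targets, flags, balanced_accuracy(targets, flags))
```

Because base-model errors are usually the minority class, a verifier that never flags anything can still score high plain accuracy. Averaging per-class recall is one common way to penalize that; the record does not state which fitness function the thesis actually used.
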
| _version_ | 1868414157116669952 |
|---|---|
| author | Napiórkowska, Adrianna Maria |
| author_facet | Napiórkowska, Adrianna Maria |
| author_role | author |
| contributor_name_str_mv | Vanneschi, Leonardo RUN |
| country_str | PT |
| creators_json_txt | [{\"Person.name\":\"Napiórkowska, Adrianna Maria\"}] |
| datacite.contributors.contributor.contributorName.fl_str_mv | Vanneschi, Leonardo RUN |
| datacite.creators.creator.creatorName.fl_str_mv | Napiórkowska, Adrianna Maria |
| datacite.date.Accepted.fl_str_mv | 2020-02-21T00:00:00Z |
| datacite.date.available.fl_str_mv | 2020-03-19T09:37:42Z |
| datacite.date.embargoed.fl_str_mv | 2020-03-19T09:37:42Z |
| datacite.rights.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| datacite.subjects.subject.fl_str_mv | Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction |
| datacite.titles.title.fl_str_mv | The use of genetic programming for detecting the incorrect predictions of classification models |
| dc.contributor.none.fl_str_mv | Vanneschi, Leonardo RUN |
| dc.creator.none.fl_str_mv | Napiórkowska, Adrianna Maria |
| dc.date.Accepted.fl_str_mv | 2020-02-21T00:00:00Z |
| dc.date.available.fl_str_mv | 2020-03-19T09:37:42Z |
| dc.date.embargoed.fl_str_mv | 2020-03-19T09:37:42Z |
| dc.format.none.fl_str_mv | application/pdf |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/10362/94537 |
| dc.language.none.fl_str_mv | eng |
| dc.rights.cclincense.fl_str_mv | http://creativecommons.org/licenses/by/4.0/ |
| dc.rights.none.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| dc.subject.none.fl_str_mv | Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction |
| dc.title.fl_str_mv | The use of genetic programming for detecting the incorrect predictions of classification models |
| dc.type.none.fl_str_mv | http://purl.org/coar/resource_type/c_bdcc |
| dirty | 0 |
| eu_rights_str_mv | openAccess |
| format | masterThesis |
| fulltext.url.fl_str_mv | https://run.unl.pt/bitstreams/25188ef6-0f26-4224-b302-aa8d109422f7/download |
| id | run_00a5ccfc5fbc1a608ffdac57b9f6a73a |
| identifier.url.fl_str_mv | http://hdl.handle.net/10362/94537 |
| instacron_str | unl |
| institution | Universidade Nova de Lisboa |
| instname_str | Universidade Nova de Lisboa |
| language | eng |
| network_acronym_str | run |
| network_name_str | Repositório Institucional da UNL |
| oai_identifier_str | oai:run.unl.pt:10362/94537 |
| organization_str_mv | urn:organizationAcronym:unl |
| person_str_mv | Napiórkowska, Adrianna Maria |
| publishDate | 2020 |
| reponame_str | Repositório Institucional da UNL |
| repository_id_str | urn:repositoryAcronym:run |
| service_str_mv | urn:repositoryAcronym:run |
| spelling | eng; pt_PT; application/pdf; pt_PT; The use of genetic programming for detecting the incorrect predictions of classification models; Napiórkowska, Adrianna Maria; Vanneschi, Leonardo; HostingInstitution; Organizational; RUN; e-mail; mailto:run@unl.pt; run@unl.pt; URN; urn:tid:202461807; 2020-03-19T09:37:42Z; 2020-02-21; 2020-02-21T00:00:00Z; Handle; http://hdl.handle.net/10362/94537; http://purl.org/coar/access_right/c_abf2; open access; Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction; 3127205 bytes; literature; http://purl.org/coar/resource_type/c_bdcc; master thesis; 2020-02-21; http://creativecommons.org/licenses/by/4.0/; http://purl.org/coar/access_right/c_abf2; application/pdf; fulltext; https://run.unl.pt/bitstreams/25188ef6-0f26-4224-b302-aa8d109422f7/download |
| spellingShingle | The use of genetic programming for detecting the incorrect predictions of classification models; Napiórkowska, Adrianna Maria; Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction |
| status | SINGLETON |
| subject.fl_str_mv | Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction |
| title | The use of genetic programming for detecting the incorrect predictions of classification models |
| title_full | The use of genetic programming for detecting the incorrect predictions of classification models |
| title_fullStr | The use of genetic programming for detecting the incorrect predictions of classification models |
| title_full_unstemmed | The use of genetic programming for detecting the incorrect predictions of classification models |
| title_short | The use of genetic programming for detecting the incorrect predictions of classification models |
| title_sort | The use of genetic programming for detecting the incorrect predictions of classification models |
| topic | Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction |
| topic_facet | Machine Learning; Explainable AI; Post-processing; Classification; Genetic Programming; Errors Prediction |
| url | http://hdl.handle.net/10362/94537 |
| visible | 1 |