Document details

Feature Selection on Epistatic Problems Using Genetic Algorithms with Nested Classifiers

Author(s): Carvalho, Pedro ; Ribeiro, Bruno ; Rodrigues, Nuno M. ; Batista, João E. ; Vanneschi, Leonardo ; Silva, Sara

Date: 2023

Persistent ID: http://hdl.handle.net/10362/162072

Origin: Repositório Institucional da UNL

Project/scholarship: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F04152%2F2020/PT;

Subject(s): Feature Selection; Epistasis; Genetic Algorithms; Genetic Programming; Decision Trees; Machine Learning; Genome-Wide Association Studies; Theoretical Computer Science; Computer Science(all)


Description

Carvalho, P., Ribeiro, B., Rodrigues, N. M., Batista, J. E., Vanneschi, L., & Silva, S. (2023). Feature Selection on Epistatic Problems Using Genetic Algorithms with Nested Classifiers. In J. Correia, S. Smith, & R. Qaddoura (Eds.), Applications of Evolutionary Computation: 26th European Conference, EvoApplications 2023, Held as Part of EvoStar 2023, Brno, Czech Republic, April 12–14, 2023, Proceedings (pp. 656-671). (Lecture Notes in Computer Science; Vol. 13989). Springer. https://doi.org/10.1007/978-3-031-30229-9_42

Funding: This work was supported by FCT, Portugal, through funding of LASIGE Research Unit (UIDB/00408/2020, UIDP/00408/2020) and CISUC (UID/CEC/00326/2020); projects AICE (DSAIPA/DS/0113/2019), from FCT, and RETINA (NORTE-01-0145-FEDER-000062), supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). The authors acknowledge the work facilities and equipment provided by GECAD research center (UIDB/00760/2020) to the project team. The authors were also supported by their respective PhD grants: Pedro Carvalho (UI/BD/151053/2021), Nuno Rodrigues (2021/05322/BD), and João Batista (SFRH/BD/143972/2019).

Feature selection is becoming an essential part of machine learning pipelines, including the ones generated by recent AutoML tools. For datasets with epistatic interactions between the features, as in many datasets from the bioinformatics domain, feature selection may even become crucial. A recent method called SLUG has outperformed the state-of-the-art algorithms for feature selection on a large set of epistatic noisy datasets. SLUG uses genetic programming (GP) as a classifier (learner), nested inside a genetic algorithm (GA) that performs feature selection (wrapper). In this work, we pair GA with different learners, in an attempt to match the results of SLUG with less computational effort. We also propose a new feedback mechanism between the learner and the wrapper to improve the convergence towards the key features. Although we do not match the results of SLUG, we demonstrate the positive effect of the feedback mechanism, motivating additional research in this area to further improve SLUG and other existing feature selection methods.
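
The wrapper-plus-nested-learner arrangement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and not SLUG: the learner is a simple majority-vote lookup table standing in for GP or a decision tree, the data is a toy XOR problem (a basic form of epistasis, where neither feature is informative alone), and every name and parameter (`make_xor_data`, `table_learner_accuracy`, `ga_feature_selection`, the 0.01 size penalty, tournament size 3) is an assumption made for illustration. The feedback mechanism is not sketched, since the abstract does not describe how it works.

```python
import random


def make_xor_data(n_rows, n_features, rng):
    """Toy epistatic data: label is the XOR of features 0 and 1; the rest are noise."""
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
    y = [row[0] ^ row[1] for row in X]
    return X, y


def table_learner_accuracy(mask, X_train, y_train, X_test, y_test):
    """Nested learner (stand-in): majority-vote lookup table over the selected features."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot be evaluated
    votes = {}
    for row, label in zip(X_train, y_train):
        key = tuple(row[i] for i in cols)
        votes.setdefault(key, [0, 0])[label] += 1
    # Fallback prediction for feature combinations never seen in training.
    default = int(sum(y_train) * 2 >= len(y_train))
    correct = 0
    for row, label in zip(X_test, y_test):
        counts = votes.get(tuple(row[i] for i in cols))
        pred = default if counts is None else int(counts[1] > counts[0])
        correct += pred == label
    return correct / len(y_test)


def ga_feature_selection(X, y, pop_size=30, generations=25, seed=0):
    """GA wrapper: evolves bit masks; fitness comes from the nested learner."""
    rng = random.Random(seed)
    n_features = len(X[0])
    split = len(X) // 2
    X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]

    def fitness(mask):
        # Held-out accuracy minus a small penalty per selected feature (assumed value).
        acc = table_learner_accuracy(mask, X_tr, y_tr, X_te, y_te)
        return acc - 0.01 * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = [max(pop, key=fitness)[:]]  # elitism: keep the best mask
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < 1.0 / n_features else b for b in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


if __name__ == "__main__":
    X, y = make_xor_data(400, 8, random.Random(1))
    print("selected mask:", ga_feature_selection(X, y))
```

In this sketch the cost of each fitness evaluation is the cost of training and testing the nested learner, which is why swapping a GP learner for a cheaper one, as the abstract proposes, would reduce the overall computational effort.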

Document Type: Conference object

Language: English

Contributor(s): NOVA Information Management School (NOVA IMS); Information Management Research Center (MagIC) - NOVA Information Management School; RUN
