Document Details

Feature Selection on Epistatic Problems Using Genetic Algorithms with Nested Classifiers

Author(s): Carvalho, Pedro ; Ribeiro, Bruno ; Rodrigues, Nuno M. ; Batista, João E. ; Vanneschi, Leonardo ; Silva, Sara

Date: 2023

Persistent Identifier: http://hdl.handle.net/10362/162072

Source: Repositório Institucional da UNL

Project/grant: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F04152%2F2020/PT

Subject(s): Feature Selection; Epistasis; Genetic Algorithms; Genetic Programming; Decision Trees; Machine Learning; Genome-Wide Association Studies; Theoretical Computer Science; Computer Science(all)


Description

Carvalho, P., Ribeiro, B., Rodrigues, N. M., Batista, J. E., Vanneschi, L., & Silva, S. (2023). Feature Selection on Epistatic Problems Using Genetic Algorithms with Nested Classifiers. In J. Correia, S. Smith, & R. Qaddoura (Eds.), Applications of Evolutionary Computation: 26th European Conference, EvoApplications 2023, Held as Part of EvoStar 2023, Brno, Czech Republic, April 12–14, 2023, Proceedings (pp. 656-671). (Lecture Notes in Computer Science; Vol. 13989). Springer. https://doi.org/10.1007/978-3-031-30229-9_42

This work was supported by FCT, Portugal, through funding of LASIGE Research Unit (UIDB/00408/2020, UIDP/00408/2020) and CISUC (UID/CEC/00326/2020); projects AICE (DSAIPA/DS/0113/2019), from FCT, and RETINA (NORTE-01-0145-FEDER-000062), supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). The authors acknowledge the work facilities and equipment provided by GECAD research center (UIDB/00760/2020) to the project team. The authors were also supported by their respective PhD grants: Pedro Carvalho (UI/BD/151053/2021), Nuno Rodrigues (2021/05322/BD), João Batista (SFRH/BD/143972/2019).

Feature selection is becoming an essential part of machine learning pipelines, including those generated by recent AutoML tools. For datasets with epistatic interactions between features, as in many datasets from the bioinformatics domain, feature selection may even become crucial. A recent method called SLUG has outperformed state-of-the-art feature selection algorithms on a large set of noisy epistatic datasets. SLUG uses genetic programming (GP) as a classifier (learner), nested inside a genetic algorithm (GA) that performs feature selection (wrapper). In this work, we pair the GA with different learners in an attempt to match the results of SLUG with less computational effort. We also propose a new feedback mechanism between the learner and the wrapper to improve convergence towards the key features. Although we do not match the results of SLUG, we demonstrate the positive effect of the feedback mechanism, motivating additional research in this area to further improve SLUG and other existing feature selection methods.
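
To make the wrapper/learner nesting concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy XOR dataset as a stand-in for an epistatic problem, a simple lookup-table classifier in place of GP or any learner used in the paper, and a hypothetical fitness of validation accuracy minus a small per-feature penalty. All function names and parameter values are illustrative. The feedback mechanism is not sketched, because its details are not given here.

```python
import random
from collections import Counter, defaultdict

def make_xor_data(n_rows, n_features, rng):
    """Toy epistatic data (assumption): label = x0 XOR x1, other features are noise."""
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
    y = [row[0] ^ row[1] for row in X]
    return X, y

def learner_accuracy(mask, X_train, y_train, X_val, y_val):
    """Nested learner (stand-in): majority-vote lookup table on the selected features."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    votes = defaultdict(Counter)
    for row, label in zip(X_train, y_train):
        votes[tuple(row[i] for i in cols)][label] += 1
    default = Counter(y_train).most_common(1)[0][0]  # fallback for unseen combinations
    correct = 0
    for row, label in zip(X_val, y_val):
        key = tuple(row[i] for i in cols)
        pred = votes[key].most_common(1)[0][0] if key in votes else default
        correct += int(pred == label)
    return correct / len(y_val)

def ga_feature_selection(X_train, y_train, X_val, y_val, rng,
                         pop_size=30, generations=25, mutation_rate=0.1):
    """GA wrapper: each individual is a bit mask; fitness comes from the nested learner."""
    n = len(X_train[0])
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            acc = learner_accuracy(mask, X_train, y_train, X_val, y_val)
            cache[key] = acc - 0.01 * sum(mask)  # hypothetical penalty on subset size
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        next_pop = ranked[:2]  # elitism: carry over the two best masks
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(ranked, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(ranked, 3), key=fitness)
            cut = rng.randrange(1, n)                     # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

rng = random.Random(0)
X, y = make_xor_data(450, 8, rng)
X_train, y_train, X_val, y_val = X[:300], y[:300], X[300:], y[300:]
best = ga_feature_selection(X_train, y_train, X_val, y_val, rng)
best_acc = learner_accuracy(best, X_train, y_train, X_val, y_val)
print("selected features:", [i for i, b in enumerate(best) if b])
print("validation accuracy:", best_acc)
```

In this toy setting neither x0 nor x1 is informative alone, so a mask scores well only when it keeps both. That is the kind of interaction that makes a wrapper with a nested learner attractive for epistatic data.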

Document Type: Conference object
Language: English
Contributor(s): NOVA Information Management School (NOVA IMS); Information Management Research Center (MagIC) - NOVA Information Management School; RUN
