Document details

Is k Nearest Neighbours Regression Better Than GP?

Author(s): Vanneschi, Leonardo; Castelli, Mauro; Manzoni, Luca; Silva, Sara; Trujillo, Leonardo

Date: 2020

Persistent ID: http://hdl.handle.net/10362/146560

Origin: Repositório Institucional da UNL

Project/scholarship: info:eu-repo/grantAgreement/FCT/9471 - RIDTI/PTDC/CCI-CIF/29877/2017/PT; info:eu-repo/grantAgreement/FCT/3599-PPCDT/DSAIPA/DS/0113/2019/PT; info:eu-repo/grantAgreement/FCT/3599-PPCDT/DSAIPA/DS/0022/2018/PT; info:eu-repo/grantAgreement/FCT/3599-PPCDT/PTDC/CCI-INF/29168/2017/PT

Subject(s): Theoretical Computer Science; Computer Science (all)


Description

Vanneschi, L., Castelli, M., Manzoni, L., Silva, S., & Trujillo, L. (2020). Is k Nearest Neighbours Regression Better Than GP? In T. Hu, N. Lourenço, E. Medvet, & F. Divina (Eds.), Genetic Programming - 23rd European Conference, EuroGP 2020, Held as Part of EvoStar 2020, Proceedings (pp. 244-261). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12101 LNCS). Springer. https://doi.org/10.1007/978-3-030-44094-7_16

This work starts from the empirical observation that k nearest neighbours (KNN) consistently outperforms state-of-the-art regression techniques, including geometric semantic genetic programming (GSGP). However, KNN is a memorization method rather than a learning method: it evaluates unseen data on the basis of stored training observations, not by running a learned model. This paper takes a first step towards defining a learning method able to equal KNN, by introducing a new semantic mutation called random vectors-based mutation (RVM). GP using RVM, called RVMGP, obtains results comparable to those of KNN, but it still needs the training data to evaluate unseen instances. A comparative analysis sheds some light on why RVMGP outperforms GSGP, revealing that RVMGP explores the semantic space more uniformly. This finding opens a question for the future: is it possible to define a new genetic operator that explores the semantic space as uniformly as RVM does, but that still allows us to evaluate unseen instances without using the training data?
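To illustrate why the abstract characterizes KNN as memorization rather than learning, the following minimal sketch of KNN regression (plain NumPy on a hypothetical toy dataset, not the experimental setup of the paper) shows that every prediction for an unseen point requires access to the full training set at query time; there is no standalone fitted model:

```python
import numpy as np

def knn_regress(X_train, y_train, x, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)  # distance to every training point
    nearest = np.argsort(dists)[:k]              # indices of the k closest
    return y_train[nearest].mean()               # average their targets

# Toy 1-D data (illustrative only): targets follow y = 2x.
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_train = np.array([0.0, 2.0, 4.0, 6.0, 8.0])

# Predicting for x = 2.5 averages the targets of its two nearest
# neighbours (x = 2 and x = 3), so X_train and y_train must be kept
# around at inference time -- the "memorization" aspect noted above.
print(knn_regress(X_train, y_train, np.array([2.5]), k=2))  # prints 5.0
```

By contrast, a learned model (such as a GP individual) is a self-contained function that can be evaluated on unseen inputs after the training data has been discarded, which is exactly the property the paper's closing question asks for.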

Document Type Conference object
Language English
Contributor(s) NOVA Information Management School (NOVA IMS); Information Management Research Center (MagIC) - NOVA Information Management School; RUN