Document details

Analysis of the use of repulsors to improve generalization ability in genetic programming : an application to symbolic regression problems

Author(s): Canelhas, Jorge Miguel Silvestre

Date: 2018

Persistent ID: http://hdl.handle.net/10362/28931

Origin: Repositório Institucional da UNL

Subject(s): Genetic algorithm; Genetic programming; Machine learning; Repulsor; Overfitting; Symbolic regression


Description

Dissertation presented as a partial requirement for obtaining a Master's degree in Information Management, specialization in Information Systems and Technologies Management

Genetic algorithms (GAs) are bio-inspired metaheuristics for solving optimization problems: evolutionary algorithms that mimic the biological processes of evolution and natural selection to evolve solutions to a given problem. Genetic programming (GP) uses GAs to evolve computer programs. In both GA and GP, the algorithm starts from randomly generated solutions that are improved generation after generation, building on the positive traits of the previous generation and discarding traits that do not improve the solution. The repulsor approach gives the learning algorithm prior knowledge of how solutions from previous generations performed on a test set, so that solutions that performed poorly on that data set can be replaced with better ones. This thesis aims to test and document whether the use of repulsors can change the behavior of GP, improve its learning rate, and reduce overfitting, thereby improving its generalization ability. Overfitting is a problem in many machine learning algorithms, and GP is no exception. One objective of this dissertation is to assess whether overfitting can be reduced by using knowledge of the prior behavior of GP-generated programs on a validation data set and applying that knowledge in the selection phase, penalizing solutions similar to those that previously generalized poorly. These poorly generalizing solutions are called repulsors and are the main topic of this dissertation. We developed a program that implements both standard and repulsor-based genetic programming. The program was executed several times over several datasets, and the results were collected. Finally, the results were compared and conclusions were drawn. The results indicate that the use of repulsors produces better results on both the training set and the test set, which leads us to conclude that repulsors have a positive effect on the performance of GP.
The results indicate that the use of repulsors does indeed produce better results: in the training phase, the algorithm learned better on seven of the nine datasets, and on the test sets it generalized better on five of the nine. Future work could extend the study to multi-objective optimization when selecting individuals, and to sharing the repulsor list across other (independent) runs with the same parameters and dataset.
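
The selection-phase idea described above (penalizing candidates that resemble solutions which previously generalized poorly) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The similarity measure (root-mean-square distance between program outputs), the `min_distance` threshold, the fixed `penalty`, and all function names are assumptions made for illustration only.

```python
import random
from typing import Callable, List, Sequence

Program = Callable[[float], float]


def semantics(program: Program, inputs: Sequence[float]) -> List[float]:
    """Outputs of a program on the training inputs."""
    return [program(x) for x in inputs]


def rmse(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square difference between two equally long output vectors."""
    return (sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)) ** 0.5


def penalized_fitness(program: Program,
                      inputs: Sequence[float],
                      targets: Sequence[float],
                      repulsors: Sequence[Sequence[float]],
                      min_distance: float = 0.5,
                      penalty: float = 1.0) -> float:
    """Training error plus a fixed penalty per nearby repulsor (lower is better).

    Assumption: each repulsor is stored as the output vector of a program
    that was judged to generalize poorly on validation data.
    """
    outputs = semantics(program, inputs)
    error = rmse(outputs, targets)
    near = sum(1 for r in repulsors if rmse(outputs, r) < min_distance)
    return error + penalty * near


def tournament(population: Sequence[Program],
               fitnesses: Sequence[float],
               size: int,
               rng: random.Random) -> Program:
    """Pick `size` distinct individuals at random and return the fittest."""
    contenders = rng.sample(range(len(population)), size)
    best = min(contenders, key=lambda i: fitnesses[i])
    return population[best]
```

With an empty repulsor list, `penalized_fitness` reduces to the plain training error, so the same tournament routine covers the standard GP case. A multi-objective variant, as suggested for future work, would keep the error and the repulsor distance as separate objectives instead of summing them.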

Document Type: Master thesis

Language: English

Advisor(s): Vanneschi, Leonardo

Contributor(s): RUN
