Author(s):
Mestre, Ricardo Jorge Palheira
Date: 2013
Persistent ID: http://hdl.handle.net/10362/10923
Origin: Repositório Institucional da UNL
Subject(s): Artificial intelligence; Classification algorithms; KNN; K-nearest neighbor algorithm; Lazy-learning; Eager-learning
Description
Dissertation submitted for the degree of Master in Computer Engineering (Mestre em Engenharia Informática)
Object classification is an important area of artificial intelligence, with applications in many fields both inside and outside science. Among classifiers, the K-nearest neighbor (KNN) algorithm is one of the simplest and most accurate, especially in environments where the data distribution is unknown or apparently not parameterizable. The algorithm assigns to the instance being classified the majority class among its K nearest neighbors. In its original form, this requires computing the distance between the instance being classified and every object in the training set. On the one hand, an extensive training set is important for obtaining high accuracy; on the other hand, because KNN is a lazy-learning algorithm, a larger training set makes the classification of each object slower. Indeed, the algorithm provides no means of storing information about previously computed classifications, so two identical instances must each be classified from scratch. In a sense, the classifier does not learn. This dissertation focuses on this lazy-learning weakness and proposes a solution that transforms KNN into an eager-learning classifier. In other words, the algorithm is intended to learn effectively from the training set, thus avoiding redundant calculations. In the context of the proposed change to the algorithm, it is important to highlight the attributes that best characterize the objects according to their discriminating power. Within this framework, the dissertation studies how these transformations can be implemented on data of different types: continuous and/or categorical.
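For readers unfamiliar with the original algorithm, the brute-force procedure described in the abstract can be sketched as follows. This is a minimal illustration only, not the dissertation's proposed eager-learning method: the function name `knn_classify`, the Euclidean distance, and the toy `training` data are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(query, training_set, k=3):
    """Brute-force KNN: measure the distance to every training object, then vote.

    training_set is a list of (features, label) pairs, where features is a
    tuple of numbers with the same length as query.
    """
    # One distance computation per training object, repeated on every call.
    distances = [
        (math.dist(query, features), label) for features, label in training_set
    ]
    # Sort by distance only, so labels are never compared with each other.
    distances.sort(key=lambda pair: pair[0])
    # Majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated classes in two dimensions.
training = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"),
    ((4.8, 5.1), "b"),
]

print(knn_classify((1.1, 1.0), training, k=3))
print(knn_classify((5.1, 5.0), training, k=3))
```

Calling `knn_classify` twice with the same query recomputes all `len(training)` distances each time, which is the redundancy the abstract attributes to the lazy-learning nature of KNN.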