Publication

Spatio-temporal SNN: integrating time and space in the clustering process

Bibliographic details
Abstract: Spatio-temporal clustering is a new subfield of data mining that is increasingly gaining scientific attention due to the technical advances of location-based or environmental devices that register position, time and, in some cases, other semantic attributes. This process intends to group objects based on their spatial and temporal similarity, helping to discover interesting patterns and correlations in large datasets. One of the main challenges in this area is that there are different types of spatio-temporal data and no general approach to treat all of them. Another unresolved challenge is integrating several dimensions in the clustering process with a general-purpose approach. Moreover, few works base their implementations on the SNN (Shared Nearest Neighbour) algorithm, which gives the opportunity to propose an innovative extension of this particular algorithm. This work intends to implement in the SNN clustering algorithm the ability to deal with spatio-temporal data, allowing the integration of space, time and one or more semantic attributes in the clustering process. In this document, background knowledge about clustering, spatial clustering and spatio-temporal clustering is presented, along with a summary of the main approaches followed to cluster spatio-temporal data with different clustering algorithms. Based on those approaches, and on the analysis of their advantages and disadvantages, the boundaries of this work are defined in order to incorporate the space, time and semantic attribute dimensions in the SNN algorithm and thus propose the 4D+SNN approach. The results presented in this work are very promising, as the proposed approach is able to identify interesting patterns in spatio-temporal data. Concretely, it can identify clusters taking the spatial and temporal dimensions into account simultaneously, and it also achieves good results when one or more semantic attributes are added.
Main authors: Oliveira, João Ricardo Leite Mota
Subject: Clustering; Density-based clustering; Spatio-temporal data; Distance function; Spatio-temporal clustering
Year: 2013
Country: Portugal
Document type: master's thesis
Access type: open access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho