Publicação

Multiresolution motif discovery in time series

Ver documento

Detalhes bibliográficos
Resumo:Time series motif discovery is an important problem with applications in a variety of areas that range from telecommunications to medicine. Several algorithms have been proposed to solve the problem. However, these algorithms heavily use expensive random disk accesses or assume the data can't into main memory. They only consider motifs at a single resolution and are not suited to interactivity. In this work, we tackle the motif discovery problem as an approximate Top-K frequent subsequence discovery problem. We fully exploit state of the art iSAX representation multiresolution capability to obtain motifs at diferent resolutions. This property yields interactivity, allowing the user to navigate along the Top-K motifs structure. This permits a deeper understanding of the time series database. Further, we apply the Top-K space saving algorithm to our frequent subsequences approach. A scalable algorithm is obtained that is suitable for data stream like applications where small memory devices such as sensors are used. Our approach is scalable and disk-eficient since it only needs one single pass over the time series database. We provide empirical evidence of the validity of the algorithm in datasets from diferent areas that aim to represent practical applications.
Autores principais:Castro, Nuno Constantino
Outros Autores:Azevedo, Paulo J.
Assunto:Time series Motif discovery Frequent patterns Multiresolution
Ano:2010
País:Portugal
Tipo de documento:comunicação em conferência
Tipo de acesso:acesso aberto
Instituição associada:Universidade do Minho
Idioma:inglês
Origem:RepositóriUM - Universidade do Minho
Descrição
Resumo:Time series motif discovery is an important problem with applications in a variety of areas that range from telecommunications to medicine. Several algorithms have been proposed to solve the problem. However, these algorithms heavily use expensive random disk accesses or assume the data can't into main memory. They only consider motifs at a single resolution and are not suited to interactivity. In this work, we tackle the motif discovery problem as an approximate Top-K frequent subsequence discovery problem. We fully exploit state of the art iSAX representation multiresolution capability to obtain motifs at diferent resolutions. This property yields interactivity, allowing the user to navigate along the Top-K motifs structure. This permits a deeper understanding of the time series database. Further, we apply the Top-K space saving algorithm to our frequent subsequences approach. A scalable algorithm is obtained that is suitable for data stream like applications where small memory devices such as sensors are used. Our approach is scalable and disk-eficient since it only needs one single pass over the time series database. We provide empirical evidence of the validity of the algorithm in datasets from diferent areas that aim to represent practical applications.