Publication

Optimal leverage association rules with numerical interval conditions

Bibliographic details
Abstract:	In this paper we propose a framework for defining and discovering optimal association rules involving a numerical attribute A in the consequent. The consequent takes the form of an interval condition A < x, A ≥ x or A ∈ I, where I is an interval or a set of intervals of the form [x_l, x_u). Optimality is with respect to leverage, a well-known association rule interest measure. The generated rules are called Maximal Leverage Rules (MLR) and are generated from Distribution Rules. The principle for finding the MLR is related to the Kolmogorov-Smirnov goodness-of-fit statistical test. We propose different methods for MLR generation, taking into account leverage optimality and readability. We theoretically demonstrate the optimality of the main exact methods and measure the leverage loss of approximate methods. We show empirically that the discovery process is scalable.
Main authors:	Jorge, Alípio M.
Other authors:	Azevedo, Paulo J.
Subject:	Numerical association rules; Leverage; Optimal association rules; Distribution rules
Year:	2012
Country:	Portugal
Document type:	article
Access type:	open access
Associated institution:	Universidade do Minho
Language:	English
Source:	RepositóriUM - Universidade do Minho
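
The abstract's link between leverage and the Kolmogorov-Smirnov test can be illustrated with a small sketch. It relies on the standard definition of leverage, sup(ant ∧ cons) − sup(ant)·sup(cons). For a consequent A < x this equals sup(ant) times the gap between the empirical distribution function of A on the records covered by the antecedent and on all records, so maximising leverage over x resembles searching for the Kolmogorov-Smirnov distance. The code below is a brute-force illustration under that assumption, not the method of the paper; the function name `best_threshold_leverage` and its arguments are hypothetical.

```python
from bisect import bisect_left


def best_threshold_leverage(values, covered):
    """Brute-force search for the threshold x that maximises leverage.

    values  : numeric attribute A, one value per record
    covered : booleans, True where the rule antecedent holds
    Returns (leverage, operator, x) for the best condition "A < x" or "A >= x".
    Hypothetical helper for illustration only.
    """
    n = len(values)
    all_sorted = sorted(values)
    ant_sorted = sorted(v for v, c in zip(values, covered) if c)
    n_ant = len(ant_sorted)

    best = (0.0, None, None)
    for x in sorted(set(values)):
        below_all = bisect_left(all_sorted, x)  # records with A < x
        below_ant = bisect_left(ant_sorted, x)  # covered records with A < x
        # leverage of "A < x": sup(ant and A < x) - sup(ant) * sup(A < x)
        lev_lt = (below_ant * n - n_ant * below_all) / (n * n)
        # "A >= x" is the complement, so its leverage is the negation
        lev_ge = -lev_lt
        if lev_lt > best[0]:
            best = (lev_lt, "<", x)
        if lev_ge > best[0]:
            best = (lev_ge, ">=", x)
    return best


if __name__ == "__main__":
    values = [1, 2, 3, 4, 5, 6, 7, 8]
    covered = [v >= 5 for v in values]  # antecedent holds on the upper half
    print(best_threshold_leverage(values, covered))
```

The search only tries the observed values of A as candidate thresholds, because the empirical distribution functions change nowhere else.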
