Publication

A framework for increasing the value of predictive data-driven models by enriching problem domain characterization with novel features


Bibliographic details
Abstract: The need to leverage knowledge through data mining has driven enterprises to demand more data. However, there is a gap between the availability of data and the application of extracted knowledge to improve decision support. In fact, more data does not necessarily imply better predictive data-driven marketing models, since the problem domain often requires a deeper characterization. Aiming at such a characterization, we propose a framework built on three feature selection strategies, whose goal is to unveil novel features that can effectively increase the value of data by providing a richer characterization of the problem domain. These strategies involve encompassing context (e.g., social and economic variables), evaluating past history, and disaggregating the main problem into smaller but interesting subproblems. The framework is evaluated through an empirical analysis of a real bank telemarketing application. The results demonstrate the benefits of this approach: the area under the receiver operating characteristic curve increased with each stage, improving on the previous model in terms of predictive performance.
Main authors: Moro, Sérgio
Other authors: Cortez, Paulo; Rita, Paulo
Subject: Feature selection; Decision support; Data mining; Telemarketing; Bank marketing
Year: 2017
Country: Portugal
Document type: article
Access type: open access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho
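The abstract reports predictive performance as the area under the receiver operating characteristic curve (AUC), compared across stages of feature enrichment. The sketch below shows one way such a comparison could be computed. It is not the authors' code: the function name `roc_auc`, the labels, and both score lists are hypothetical, invented only to illustrate the metric.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive example receives
    a higher score than a randomly chosen negative one; ties count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration only (not taken from the paper).
labels = [1, 0, 1, 0, 0, 1]
baseline_scores = [0.6, 0.5, 0.4, 0.7, 0.2, 0.8]  # hypothetical baseline model
enriched_scores = [0.7, 0.3, 0.6, 0.5, 0.2, 0.9]  # hypothetical model with added features

baseline_auc = roc_auc(labels, baseline_scores)
enriched_auc = roc_auc(labels, enriched_scores)
```

Comparing `baseline_auc` with `enriched_auc` on the same held-out labels mirrors the stage-by-stage comparison the abstract describes. The pairwise formulation is quadratic in the number of examples, so a rank-based implementation would be preferable for a dataset of realistic size.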