Publication

A comprehensive study on personal and medical information to predict diabetes

Bibliographic details
Abstract: Diabetes mellitus is one of the most well-known and prevalent diseases in people’s daily lives. A tool that can predict the disease would benefit professionals and healthcare systems alike, and in turn families and national economies. Data mining can contribute to the development of such a predictive tool. In this study, data were explored to determine which attributes, techniques, and approaches can effectively improve prediction. Following the CRISP-DM methodology, the main approaches used to investigate the data were classification and association rules, which allow hidden patterns and relations within the data to be found. The results show sensitivity and accuracy values higher than 70% using the J48 and SVM classification algorithms, and indicate that socio-economic attributes alone are not sufficient to predict the illness. The same applies when only the most indicative characteristics are used (physical activity, healthy eating and lifestyle, and regular health exams), which indicates that a larger set of information is needed to design an effective model. The best results were obtained using the J48 and SVM classification techniques.
Main authors: Pimenta, Nuno
Other authors: Sousa, Regina; Peixoto, Hugo; Machado, José Manuel
Subject: Diabetes mellitus; Machine learning; Prediction models; Data mining; Association rules
Year: 2023
Country: Portugal
Document type: conference paper
Access type: restricted access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho