Publication

Vehicle industry Big Data analysis using clustering approaches

Bibliographic details
Abstract:	In a globalized world economy and industry, data analysis and visualization provide valuable information for decision-making and strategic planning. Data science offers diverse statistical and scientific methods to extract as much value as possible from a data set, covering data preparation, cleaning, aggregation, and manipulation. Machine learning (ML) and Artificial Intelligence (AI) complement these methods by learning from and exploring the data, uncovering patterns that cannot be seen through the analyst's experience alone. This work explores clustering methods on a truck data set of logged road inclinations, a Big Data problem. With a good clustering, the data becomes key to improving product development and fuel efficiency, since different truck usage environments can be identified. Knowledge discovery and data mining methods were used, namely the K-means and Fuzzy C-means (FCM) algorithms, which were compared with a rule-based method called GTA. The evaluation metrics are the within-cluster sum of squares, the between-cluster sum of squares, and the silhouette index. The proposed approach showed satisfactory results and demonstrated how applying ML could benefit this real-world problem, especially with FCM.
Main authors:	Seixas, Lenon Diniz
Other authors:	Corrêa, Fernanda Cristina; Siqueira, Hugo Valadares; Trojan, Flavio; Afonso, Paulo
Subject:	Clustering; Fuzzy C-means; K-means; Slope; Trucks
Year:	2024
Country:	Portugal
Document type:	conference paper
Access type:	restricted access
Associated institution:	Universidade do Minho
Language:	English
Source:	RepositóriUM - Universidade do Minho
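
The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the slope values are made-up placeholders rather than data from the paper, the function names (`kmeans_1d`, `sum_of_squares`, `silhouette`) are hypothetical, and only a plain K-means on one-dimensional data is shown (FCM and the rule-based GTA method are not reproduced here).

```python
import random

def kmeans_1d(values, k, iters=100, seed=0):
    """Plain Lloyd's K-means on 1-D data; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # initial centroids are k data points
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared distance.
        new_labels = [min(range(k), key=lambda j: (v - centroids[j]) ** 2)
                      for v in values]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, l in zip(values, new_labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = sum(members) / len(members)
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels

def sum_of_squares(values, centroids, labels):
    """Within-cluster and between-cluster sums of squares."""
    grand_mean = sum(values) / len(values)
    wcss = sum((v - centroids[l]) ** 2 for v, l in zip(values, labels))
    bcss = sum(labels.count(j) * (c - grand_mean) ** 2
               for j, c in enumerate(centroids))
    return wcss, bcss

def silhouette(values, labels):
    """Mean silhouette index; singleton clusters score 0 by convention."""
    clusters = {}
    for v, l in zip(values, labels):
        clusters.setdefault(l, []).append(v)
    scores = []
    for v, l in zip(values, labels):
        own = clusters[l]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(abs(v - u) for u in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical road slopes in percent (illustrative values only).
slopes = [0.1, 0.3, -0.2, 0.0, 3.8, 4.2, 4.0, 3.9, 8.1, 7.9, 8.3]
centroids, labels = kmeans_1d(slopes, k=3)
wcss, bcss = sum_of_squares(slopes, centroids, labels)
print("centroids:", centroids)
print("WCSS:", wcss, "BCSS:", bcss)
print("silhouette:", silhouette(slopes, labels))
```

A lower within-cluster sum of squares, a higher between-cluster sum of squares, and a silhouette index closer to 1 all indicate tighter, better separated clusters. When each centroid is the mean of its members, the two sums of squares add up to the total sum of squares of the data.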
