Publication
Vehicle industry Big Data analysis using clustering approaches
| Abstract: | In a globalized world economy and industry, data analysis and visualization offer valuable information for decision-making and strategic planning. Data science provides diverse statistical and scientific methods to extract the most value possible from a data set, covering data preparation, cleaning, aggregation, and manipulation. Machine learning (ML) and Artificial Intelligence (AI) complement these methods by learning from and exploring the data, uncovering patterns that cannot be seen through the analyst's experience alone. This work explores clustering methods on a truck data set of logged roadway inclinations, a Big Data problem. With good clustering, the data becomes key to improving product development and fuel efficiency, since different truck usage environments can be identified. Knowledge discovery and data mining methods were used, namely the K-means and Fuzzy C-means (FCM) algorithms, and compared with a rule-based method called GTA. The evaluation metrics addressed are the within-cluster sum of squares, the between-cluster sum of squares, and the silhouette index. The proposed approach showed satisfactory results and demonstrated how ML could benefit this real-world problem, especially with FCM. |
|---|---|
| Main authors: | Seixas, Lenon Diniz |
| Other authors: | Corrêa, Fernanda Cristina; Siqueira, Hugo Valadares; Trojan, Flavio; Afonso, Paulo |
| Subject: | Clustering; Fuzzy C-means; K-means; Slope; Trucks |
| Year: | 2024 |
| Country: | Portugal |
| Document type: | conference paper |
| Access type: | restricted access |
| Associated institution: | Universidade do Minho |
| Language: | English |
| Source: | RepositóriUM - Universidade do Minho |