Publication

Impact of different lag-days on predicting one-week and two-weeks cow milk yield using Automated Milking Systems data

Bibliographic details
Abstract: This study applies predictive modeling to dairy farming, addressing gaps identified in the literature by forecasting one-week-ahead and two-weeks-ahead average milk yield from Automated Milking System (AMS) data. Models for milk production were developed using data from multiple farms, examining how the amount of historical data affects the predictions and how primiparous and multiparous cows differ. Three machine learning methods were employed: Extreme Gradient Boosting (XGBoost), Artificial Neural Networks (ANNs), and Genetic Programming (GP). XGBoost and ANNs were chosen for their established effectiveness, while GP is a novel approach in this field. The analysis considered seven lag periods (1 to 7 days) of data to evaluate the importance of historical data for prediction accuracy in both one-week-ahead and two-weeks-ahead predictions. The data were also divided by lactation group (primiparous and multiparous), resulting in 28 datasets (7 lag periods × 2 forecast horizons × 2 lactation groups). The findings indicated strong predictive performance, particularly for one-week-ahead forecasts. As expected, performance decreased for two-weeks-ahead predictions, underscoring the challenges of longer-term forecasting. Overall, XGBoost demonstrated the best predictive performance, achieving the lowest RMSE values on unseen data, while GP showed the weakest performance and ANNs yielded variable results. Adding more lag days significantly improved XGBoost's predictions, whereas the effect of lag period on ANNs and GP was more variable. Despite its superior overall accuracy, XGBoost exhibited substantial overfitting when trained on primiparous data and tested on multiparous cows. In contrast, ANNs and GP generalized better. The GP approach, though less accurate, enabled the identification of the most important features for prediction. 
The milk yield, the number of days in lactation, and the number of milking failures from the previous days were particularly important across all models. The unexpected prominence of the number of milking failures suggests that it plays a critical role in forecasting milk yield, offering new insights for farm management and animal welfare.
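The dataset layout described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's code. The function names, the single-feature input (daily milk yield only), and the exact definition of the target window are assumptions made for illustration. The abstract states only that 1 to 7 lag days were used, that the targets are one-week-ahead and two-weeks-ahead average milk yield, and that models were compared by RMSE.

```python
from itertools import product
from math import sqrt

LAGS = range(1, 8)                      # 1 to 7 lag days, as in the abstract
HORIZONS = (1, 2)                       # weeks ahead
GROUPS = ("primiparous", "multiparous")

# One dataset configuration per (lag, horizon, group) combination.
CONFIGS = list(product(LAGS, HORIZONS, GROUPS))


def make_lagged_rows(daily_yield, n_lags, weeks_ahead):
    """Build (features, target) pairs from one cow's daily milk yield.

    Assumption: the target is the mean yield over the 7-day window that
    ends `weeks_ahead` weeks after the last observed day.
    """
    rows = []
    start = 7 * (weeks_ahead - 1) + 1   # offset of the first target day
    for t in range(n_lags - 1, len(daily_yield) - 7 * weeks_ahead):
        features = daily_yield[t - n_lags + 1 : t + 1]   # last n_lags days
        window = daily_yield[t + start : t + start + 7]  # target week
        rows.append((features, sum(window) / len(window)))
    return rows


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))
```

Under these assumptions, `len(CONFIGS)` is 28, matching the number of datasets reported. In practice each row would also carry the other lagged AMS variables the abstract names, such as days in lactation and the number of milking failures.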
Main authors: Morais, Patrícia Alexandra dos Santos
Subject: AMS; Milk Yield; Machine Learning; Genetic Programming; SDG 9 - Industry, innovation and infrastructure; SDG 15 - Life on land
Year: 2024
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL