Publication

Sensor-Based Yield Prediction in Durum Wheat Under Semi-Arid Conditions Using Machine Learning Across Zadoks Growth Stages

Bibliographic details
Abstract: Yield prediction for wheat grown under semi-arid conditions is increasingly important for sustainable production strategies and decision support systems. In this study, a time-series-based modeling approach was implemented using sensor-based data (SPAD, NSPAD, NDVI, INSEY, and plant height) collected at four Zadoks growth stages (ZD24, ZD30, ZD31, and ZD32). Five machine learning algorithms (Random Forest, Gradient Boosting, AdaBoost, LightGBM, and XGBoost) were tested separately for each stage, and model performance was evaluated using R² (%), RMSE (t/ha), and MAE (t/ha). The ZD31 stage (first node detectable) gave the most accurate predictions, with the XGBoost model achieving the highest R² (81.0%) together with an RMSE of 0.49 t/ha and an MAE of 0.37 t/ha. The LightGBM model also performed well at the ZD30 stage, with an R² of 78.0%, an RMSE of 0.52 t/ha, and an MAE of 0.40 t/ha. SHAP (SHapley Additive exPlanations) analysis of feature importance showed that the NDVI and INSEY indices contributed the most to the yield predictions. The study demonstrates that phenology-sensitive yield prediction has high potential for sensor-based digital applications, and that considering measurement timing, model selection, and explainability together provides useful insight for developing advanced decision support systems.
Main author: Rufaioğlu, Süreyya Betül
Other authors: Bilgili, Ali Volkan; Savaşlı, Erdinç; Özberk, İrfan; Aydemir, Salih; Ismael, Amjad Mohamed; Kaya, Yunus; Matos-Carvalho, João P.
Subject: Machine learning; Sensor-based data; SHAP analysis; Wheat; Yield prediction; Zadoks stages; General Earth and Planetary Sciences
Year: 2025
Country: Portugal
Document type: article
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL
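The abstract reports model performance as R² (%), RMSE (t/ha), and MAE (t/ha). As a minimal sketch of how these three metrics are conventionally computed from paired observed and predicted yields, the snippet below uses only the Python standard library. The function name `regression_metrics` and the sample values are hypothetical illustrations, not the authors' code or data, and the study may have computed its metrics differently (for example on cross-validated folds).

```python
import math


def regression_metrics(observed, predicted):
    """Return R2 (%), RMSE and MAE for paired observed/predicted yields in t/ha."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares for the coefficient of determination.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    return {
        "r2_pct": 100.0 * (1.0 - ss_res / ss_tot),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
    }


# Hypothetical yields (t/ha) for illustration only; not data from the study.
observed = [3.0, 4.0, 5.0, 6.0]
predicted = [3.5, 3.5, 5.5, 5.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Running the same function on the held-out predictions of each stage-specific model would give one row of metrics per stage and algorithm, which is the kind of comparison the abstract summarizes.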