Publication
Sensor-Based Yield Prediction in Durum Wheat Under Semi-Arid Conditions Using Machine Learning Across Zadoks Growth Stages
| Abstract: | Yield prediction for wheat grown under semi-arid conditions is increasingly important for sustainable production strategies and decision support systems. In this study, a time-series-based modeling approach was implemented using sensor-based data (SPAD, NSPAD, NDVI, INSEY, and plant height measurements) collected at four Zadoks growth stages (ZD24, ZD30, ZD31, and ZD32). Five machine learning algorithms (Random Forest, Gradient Boosting, AdaBoost, LightGBM, and XGBoost) were tested separately for each stage, and model performance was evaluated using R² (%), RMSE (t/ha), and MAE (t/ha). The ZD31 stage (first node detectable) gave the highest prediction accuracy, with the XGBoost model achieving the highest R² (81.0%), an RMSE of 0.49 t/ha, and an MAE of 0.37 t/ha. The LightGBM model also performed well at the ZD30 stage, with an R² of 78.0%, an RMSE of 0.52 t/ha, and an MAE of 0.40 t/ha. SHAP (SHapley Additive exPlanations) analysis, used to interpret feature importance, showed that the NDVI and INSEY indices contributed most to the yield predictions. This study demonstrates that phenology-sensitive yield prediction approaches offer high potential for sensor-based digital applications. Furthermore, integrating timing, model selection, and explainability provided valuable insights for the development of advanced decision support systems. |
|---|---|
| Main authors: | Rufaioğlu, Süreyya Betül |
| Other authors: | Bilgili, Ali Volkan; Savaşlı, Erdinç; Özberk, İrfan; Aydemir, Salih; Ismael, Amjad Mohamed; Kaya, Yunus; Matos-Carvalho, João P. |
| Subject: | Machine learning; Sensor-based data; SHAP analysis; Wheat; Yield prediction; Zadoks stages; General Earth and Planetary Sciences |
| Year: | 2025 |
| Country: | Portugal |
| Document type: | article |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |
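
The abstract reports model performance as R² (%), RMSE (t/ha), and MAE (t/ha). As a minimal sketch of how these three metrics are conventionally defined, the function below computes them from paired observed and predicted yields. The record does not say how the authors computed them, so the function name `regression_metrics` and the sample yields are hypothetical and only illustrate the standard formulas.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2 (as a percentage), RMSE, and MAE for paired yield values.

    Hypothetical helper: uses the textbook definitions, not necessarily
    the exact procedure followed in the study.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    mean_obs = sum(observed) / n

    # Residual and total sums of squares
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")

    r2_pct = 100.0 * (1.0 - ss_res / ss_tot)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return {"r2_pct": r2_pct, "rmse": rmse, "mae": mae}


# Illustrative yields in t/ha (made-up numbers, not data from the study)
observed_yield = [3.0, 4.0, 5.0]
predicted_yield = [2.5, 4.0, 5.5]
metrics = regression_metrics(observed_yield, predicted_yield)
```

RMSE and MAE carry the units of the target (here t/ha), while R² is unitless and is shown as a percentage to match the abstract's notation.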