Abstract
Context
Data-driven models (DDMs) are increasingly used for crop yield prediction because they can capture complex patterns and relationships. Their predictions depend heavily on the data supplied as inputs, and those inputs can be complemented by outputs derived from mechanistic models (MMs).
Methods
This study investigated whether the predictive quality of DDMs can be improved by using MM outputs, specifically biomass and soil moisture, as features alongside conventional data sources such as satellite imagery, weather, and soil information. Four experiments were performed, each using a different set of features for prediction: Experiment 1 combined MM outputs with conventional data; Experiment 2 excluded MM outputs; Experiment 3 was the same as Experiment 1 but omitted all conventional temporal data; Experiment 4 used MM outputs only. The research encompassed ten field-years of wheat and chickpea yield data, and models were fitted with the eXtreme Gradient Boosting (XGBOOST) algorithm. Performance was evaluated using the root mean square error (RMSE) and the concordance correlation coefficient (CCC).
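The four feature sets and the two evaluation metrics described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the feature names, the grouping of satellite imagery and weather as "temporal" and soil as "static", and the helper `features_for` are all assumptions made for the example. The CCC is computed here as Lin's concordance correlation coefficient using population (divide-by-n) moments, which is one common convention; the abstract does not state which variant was used.

```python
import math

# Hypothetical feature names; the abstract only names the broad data sources.
FEATURE_GROUPS = {
    "mm": ["mm_biomass", "mm_soil_moisture"],       # mechanistic-model outputs
    "temporal": ["satellite_index", "rainfall"],    # assumed temporal conventional data
    "static": ["soil_clay", "elevation"],           # assumed non-temporal conventional data
}

# Feature groups used in each experiment, following the Methods description.
EXPERIMENTS = {
    1: ["mm", "temporal", "static"],  # MM outputs + all conventional data
    2: ["temporal", "static"],        # conventional data only
    3: ["mm", "static"],              # Experiment 1 without temporal conventional data
    4: ["mm"],                        # MM outputs only
}


def features_for(experiment):
    """Return the flat list of feature names for one experiment."""
    return [name for group in EXPERIMENTS[experiment] for name in FEATURE_GROUPS[group]]


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (e.g. t/ha)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def ccc(observed, predicted):
    """Lin's concordance correlation coefficient using population moments."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    # Penalises both scatter (via cov) and systematic bias (via the mean offset).
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)
```

Unlike the Pearson correlation, the CCC drops below 1 when predictions are perfectly correlated with observations but offset from the 1:1 line, which is why it is often reported together with the RMSE. In a full workflow, the list returned by `features_for` would select the columns passed to the gradient-boosting model for each experiment.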
Results and conclusions
The validation results showed that the XGBOOST model had similar predictive power for both crops in Experiments 1, 2, and 3. For chickpeas, the CCC ranged from 0.89 to 0.91 and the RMSE from 0.23 to 0.25 t ha⁻¹. For wheat, the CCC ranged from 0.87 to 0.92 and the RMSE from 0.29 to 0.35 t ha⁻¹. However, Experiment 4 significantly reduced the model's accuracy, with CCCs dropping to 0.47 for chickpeas and 0.36 for wheat, and RMSEs increasing to 0.46 and 0.65 t ha⁻¹, respectively. Ultimately, Experiments 1, 2, and 3 demonstrated comparable effectiveness, but Experiment 3 is recommended for achieving similar predictive quality with a simpler, more interpretable model using biomass and soil moisture alongside non-temporal conventional features.
Publisher
Springer Science and Business Media LLC