Research Article

County-level corn yield prediction using supervised machine learning

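Shahid Nawaz Khan, Abid Nawaz Khan, Aqil Tariq, Linlin Lu, Naeem Abbas Malik, Muhammad Umair, Wesam Atef Hatamleh & Farah Hanna Zawaideh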
Article: 2253985 | Received 08 Jun 2023, Accepted 28 Aug 2023, Published online: 05 Sep 2023
 

ABSTRACT

The main objectives of this study are (1) to compare several machine learning models for predicting county-level corn yield in the study area and (2) to assess the feasibility of machine learning models for in-season yield prediction. We acquired remotely sensed vegetation index data from the Moderate Resolution Imaging Spectroradiometer (MODIS) using Google Earth Engine (GEE). Vegetation indices spanning 15 years (2006–2020) were processed and downloaded using GEE for the months corresponding to crop growth (April–October). We compared nine machine learning models for predicting county-level corn yield and then analyzed in-season yield prediction performance using the top three models. The results show that partial least squares regression (PLSR) outperformed the other models for corn yield prediction, achieving the highest training and testing performance. The top three models for county-level corn yield prediction in the study area were PLSR, support vector regression (SVR) and ridge regression. For in-season yield prediction, SVR outperformed the other models, achieving a testing R² = 0.875. Overall, machine learning models predicted both in-season yield (best model R² = 0.875) and end-of-season yield (best model R² = 0.861) with satisfactory performance, indicating that remote sensing data and machine learning models can be used to predict crop yield before harvest. This can provide useful insights for food security and for early decision making related to climate change impacts on food security.
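The reported scores are R² values, and the in-season setting implies that a model sees only part of the April–October vegetation index series. The following is a minimal sketch of both ideas using only the Python standard library. It is not the authors' code: the helper names `r2_score` and `in_season_features`, the July cutoff, and all numeric values are hypothetical and chosen purely for illustration.

```python
from statistics import fmean

# Growing-season months named in the abstract (April-October).
SEASON_MONTHS = ["Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct"]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def in_season_features(monthly_vi, cutoff_month):
    """Keep only the vegetation index values observed up to cutoff_month."""
    last = SEASON_MONTHS.index(cutoff_month)
    return [monthly_vi[m] for m in SEASON_MONTHS[: last + 1]]


# Hypothetical yields for illustration only; not data from the study.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
score = r2_score(observed, predicted)

# Hypothetical monthly vegetation index values for one county-year.
vi = {"Apr": 0.21, "May": 0.35, "Jun": 0.58, "Jul": 0.79,
      "Aug": 0.74, "Sep": 0.52, "Oct": 0.30}
# Assumed cutoff: the abstract does not state which month was used.
july_features = in_season_features(vi, "Jul")
```

An in-season model would be trained on feature vectors such as `july_features`, whereas an end-of-season model would use all seven monthly values. Both would be scored with the same R² definition on held-out data.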

Author contributions

Shahid Nawaz Khan: methodology, software, formal analysis, visualization, data curation, writing – original draft, investigation, validation, writing – review and editing. Abid Nawaz Khan: writing – review and editing. Aqil Tariq: supervision, investigation, validation, funding, writing – review and editing. Linlin Lu: investigation, validation, funding, writing – review and editing. Naeem Abbas Malik: writing – original draft, investigation, validation, writing – review and editing. Muhammad Umair: validation, writing – review and editing. Wesam Atef Hatamleh: writing – review and editing. Farah Hanna Zawaideh: writing – review and editing. All authors have read and agreed to the published version of the manuscript.

Acknowledgments

The authors extend their appreciation to the Researchers Supporting Project number (RSP2023R384), King Saud University, Riyadh, Saudi Arabia.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The datasets used in this study are freely available. The data sources are described in Sections 2.1.2–2.1.4.

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Additional information

Funding

This work was supported by the National Key R&D Program of China (Project No. 2022YFC3800700) and by the Researchers Supporting Project number (RSP2023R384), King Saud University, Riyadh, Saudi Arabia.