Comparative Analysis of Machine Learning Models for Crop Yield Prediction Using Categorical and Numerical Agro-Meteorological Data


Article Information

Title: Comparative Analysis of Machine Learning Models for Crop Yield Prediction Using Categorical and Numerical Agro-Meteorological Data

Authors: Shweta Jha, P.R. Patil, H.K. Nemade

Journal: Journal of Neonatal Surgery

HEC Recognition History:
Category  From        To
Y         2023-07-01  2024-09-30
Y         2022-07-01  2023-06-30

Publisher: EL-MED-Pub Publishers

Country: Pakistan

Year: 2025

Volume: 14

Issue: 20S

Language: en

Keywords: MAE

Abstract

Accurate crop yield prediction plays a vital role in ensuring food security, optimizing agricultural planning, and enabling efficient resource allocation. With the increasing availability of agricultural datasets, machine learning and deep learning techniques have emerged as powerful tools for forecasting crop yields based on historical and agro-climatic data. This study presents a comprehensive comparative analysis of five prominent regression models—Deep Learning (Artificial Neural Networks), Linear Regression, Random Forest Regressor, Gradient Boosting Regressor, and Support Vector Regressor—for crop yield prediction. The dataset used in this study comprises a combination of categorical features (crop type, state, season, year) and numerical attributes (area and production), which were appropriately encoded and scaled for model training.
Model performance was evaluated using standard regression metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R²). The results show that the deep learning model outperformed all traditional regression approaches, achieving an R² score of 0.94 and an RMSE of 227.99, which indicates a superior capability to capture complex, non-linear relationships in agricultural data. Random Forest and Gradient Boosting regressors also performed well, with R² values of 0.88 and 0.84, respectively. In contrast, Linear Regression and Support Vector Regressor showed poor predictive accuracy; the SVR in particular failed to fit the data (R² = -0.00, that is, roughly equivalent to always predicting the mean yield).
This research highlights the efficacy of deep learning in enhancing crop yield prediction accuracy and underscores the limitations of simpler linear models in handling heterogeneous, high-dimensional agricultural data. The findings have practical implications for precision agriculture, enabling data-driven decision-making for farmers, agronomists, and policymakers. Future directions include incorporating meteorological and soil data, exploring temporal deep learning models such as LSTMs, and integrating explainable AI methods to interpret model predictions.
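The four metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `regression_metrics` and the toy yield values are hypothetical and chosen only for illustration. It uses the standard definitions, including R² = 1 - SS_res / SS_tot, which is why a model that always predicts the mean of the observed values scores an R² of 0.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; uses the standard textbook
    definitions rather than any code from the article.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


if __name__ == "__main__":
    # Toy values, not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    mean_baseline = [2.5] * len(observed)  # always predict the mean
    print(regression_metrics(observed, mean_baseline))
```

On this toy input the mean baseline yields an R² of exactly 0, and any model that fits worse than the mean yields a negative value, which is how a score such as the SVR's -0.00 can arise.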
