Modeling Greenhouse Cucumber Evapotranspiration Using Machine Learning: A Random Forest Approach Versus Traditional and Non-linear Crop Coefficients

Document Type: Research/Original/Regular Article

Authors

1 Department of Irrigation and Reclamation Engineering, Faculty of Agricultural, College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran.

2 Associate Professor, Soil and Water Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran.

Abstract

Extended Abstract

Introduction

Accurate estimation of crop evapotranspiration (ETc) is fundamental for the development of efficient irrigation strategies in greenhouse systems, where environmental conditions differ significantly from open-field farming. In Iran, greenhouse agriculture, particularly for cucumbers (Cucumis sativus L.), has expanded considerably, making irrigation optimization a critical priority. The specific microclimate within greenhouses, including controlled humidity, temperature, and solar radiation levels, affects plant water needs, requiring tailored approaches to predict ETc. Traditional models, like the FAO 56 crop coefficient (Kc) method, provide a standardized way to estimate ETc but are generally suited to field crops under variable outdoor conditions. The limitations of fixed Kc values in capturing the complexity of greenhouse environments have prompted the exploration of alternative models. In recent years, machine learning (ML) techniques, especially ensemble methods like the Random Forest (RF) algorithm, have emerged as promising tools for ETc modeling due to their capacity to manage non-linear interactions among meteorological variables and enhance model flexibility. This study evaluates the performance of three ETc estimation approaches for greenhouse-grown cucumber: the conventional FAO 56 Kc method, a non-linear Kc model using a third-degree polynomial, and direct ETc prediction through the RF algorithm. These methods are assessed across two growth cycles, autumn-winter (A-W) and spring-summer (S-S), to capture seasonal differences in crop water requirements.

Materials and Methods

The study was conducted in a research greenhouse at the College of Agriculture and Natural Resources, University of Tehran, and focused on the daily ETc of cucumber over two distinct growth periods. Environmental parameters were measured both inside and outside the greenhouse, including maximum, minimum, and average temperatures, relative humidity, and solar radiation. Reference evapotranspiration inside the greenhouse (EToG) was derived using a micro-lysimeter with a turfgrass surface, while daily ETc was measured using a soil water balance method, in which soil moisture content was monitored daily across three experimental plots to ensure precision. ETc was then estimated through three modeling approaches. In the first approach, the FAO 56 Kc model estimated ETc by multiplying fixed crop coefficients by EToG. Although this method has been widely applied in field conditions, its applicability to greenhouses is limited by its fixed Kc assumptions. In the second approach, a non-linear Kc model was developed by fitting a third-degree polynomial regression to Kc values calculated as the ratio of ETc to EToG, capturing growth-stage-specific variations. In the final approach, the RF model directly predicted ETc from a broad range of meteorological inputs. To optimize the RF model, hyperparameters were tuned using GridSearchCV from Python’s scikit-learn library, and data were split into training (70%) and testing (30%) sets to validate model performance. After initial RF modeling, a feature selection process using Permutation Feature Importance (PFI) was applied to identify the most influential variables, refining the RF model to the top four parameters.
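The RF workflow described above (70/30 split, grid search over hyperparameters, permutation feature importance, refit on the top four inputs) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic stand-ins rather than the study's measurements, the column names (beyond maxTG, meanRHG, RadiationG, and EToG) are hypothetical, and the hyperparameter grid, cross-validation settings, and the choice to compute PFI on the held-out set are assumptions, since the abstract does not report them.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (NOT the study's measurements).
# "minTG" and "meanTG" are hypothetical column names.
rng = np.random.default_rng(0)
feature_names = ["maxTG", "minTG", "meanTG", "meanRHG", "RadiationG", "EToG"]
n_days = 120
X = rng.uniform(0.0, 1.0, size=(n_days, len(feature_names)))
# Toy target with a non-linear interaction term plus noise.
y = (3.0 * X[:, 4] + 2.0 * X[:, 0] * X[:, 5] - 1.0 * X[:, 3]
     + rng.normal(0.0, 0.1, n_days))

# 70% training / 30% testing, as stated in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

# Small illustrative grid; the study's actual grid is not given.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 3],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)
best_rf = search.best_estimator_

# Permutation feature importance, computed here on the held-out set (an assumption).
pfi = permutation_importance(best_rf, X_test, y_test, n_repeats=10, random_state=0)
ranked = np.argsort(pfi.importances_mean)[::-1]  # most important first
top4_idx = ranked[:4]
top4_names = [feature_names[i] for i in top4_idx]

# Refit a reduced model on the four most influential inputs only.
reduced_rf = RandomForestRegressor(random_state=0, **search.best_params_)
reduced_rf.fit(X_train[:, top4_idx], y_train)
y_pred = reduced_rf.predict(X_test[:, top4_idx])
rmse = float(np.sqrt(np.mean((y_test - y_pred) ** 2)))

print("Selected inputs:", top4_names)
print(f"Test RMSE of reduced model: {rmse:.3f} (toy units)")
```

With real greenhouse records, `X` and `y` would be replaced by the measured meteorological inputs and the lysimeter or water-balance ETc series. The numbers printed here say nothing about the study's results.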

Results and Discussion

The results highlighted seasonal variability in cumulative ETc, with the S-S period exhibiting nearly double the ETc of the A-W period due to higher ambient temperature and increased solar radiation. These findings underscore the necessity of dynamic ETc models that can accommodate seasonal and environmental variations. The FAO 56 Kc method produced a mean RMSE of 0.915 mm/day across both growth cycles, demonstrating limitations in fixed Kc approaches under greenhouse conditions. The non-linear Kc model, with an average RMSE of 0.64 mm/day, provided improved accuracy by adjusting Kc values across different growth stages, especially during mid-growth when water demand peaks. This improvement aligns with the premise that non-linear models can better capture the ETc variability within controlled environments. The RF algorithm demonstrated superior accuracy and flexibility, outperforming both Kc-based models with R² values of 0.96 for the A-W period and 0.94 for the S-S period in training datasets, and with respective RMSE values of 0.365 mm/day and 0.57 mm/day in testing datasets. These results illustrate the RF model’s capacity to accurately model ETc by capturing complex, non-linear interactions among variables such as maxTG (maximum temperature inside the greenhouse), meanRHG (average relative humidity inside the greenhouse), and RadiationG (solar radiation inside the greenhouse) during the A-W period, with RadiationG and EToG emerging as key variables during the S-S period. By emphasizing critical seasonal drivers of ETc, the RF model offers a robust alternative that adjusts to environmental changes without relying on static Kc values. This adaptability supports RF’s potential as a powerful tool for ETc estimation, accurately reflecting seasonal influences on greenhouse crop water needs.
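For readers who want to reproduce this kind of comparison, the sketch below shows how RMSE and R² can be computed for a fixed-Kc estimate and a third-degree polynomial Kc estimate, both expressed as ETc = Kc × EToG. It is a minimal illustration on a synthetic season, not the study's data. The stage boundaries and step-wise Kc values are illustrative placeholders, and the step function simplifies the FAO 56 procedure, which interpolates between growth stages. Fitting and scoring on the same series is also a simplification.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (mm/day here)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = float(np.sum((observed - predicted) ** 2))
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

# Synthetic 90-day season (NOT the study's measurements).
rng = np.random.default_rng(1)
days = np.arange(1, 91)
eto_g = 2.0 + np.sin(np.pi * days / 90.0)                # reference ET, mm/day
kc_shape = 0.4 + 0.8 * np.sin(np.pi * days / 90.0) ** 2  # toy seasonal Kc curve
etc_obs = kc_shape * eto_g + rng.normal(0.0, 0.05, days.size)

# Approach 1: fixed, stage-wise Kc (placeholder values and stage limits).
kc_fixed = np.where(days <= 25, 0.60, np.where(days <= 70, 1.00, 0.75))
etc_fixed = kc_fixed * eto_g

# Approach 2: Kc = ETc / EToG, smoothed with a third-degree polynomial in time.
kc_obs = etc_obs / eto_g
coeffs = np.polyfit(days, kc_obs, deg=3)
etc_poly = np.polyval(coeffs, days) * eto_g

scores = {
    "fixed Kc": (rmse(etc_obs, etc_fixed), r_squared(etc_obs, etc_fixed)),
    "cubic Kc": (rmse(etc_obs, etc_poly), r_squared(etc_obs, etc_poly)),
}
for name, (err, r2) in scores.items():
    print(f"{name}: RMSE = {err:.3f} mm/day, R^2 = {r2:.3f}")
```

The same two helper functions can be applied to RF predictions on a held-out test set, which puts all three approaches on a common footing.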

Conclusion

The findings from this study demonstrate that the RF algorithm, when applied to ETc modeling in greenhouse conditions, provides a flexible, high-accuracy alternative to traditional Kc methods. Unlike the FAO 56 Kc and non-linear Kc models, which rely on predefined or growth-stage specific coefficients, the RF approach enables direct ETc prediction using real-time meteorological data. By optimizing input variables through feature selection, RF efficiently reduced the model complexity, focusing on the top four influential parameters while retaining high predictive accuracy. This reduction not only streamlines data collection requirements but also enhances the model's applicability in practical greenhouse operations. The results indicate that RF's capacity to model complex relationships among variables makes it especially suited for greenhouse environments, where precision irrigation is crucial for sustainable water management. Ultimately, this research underscores the importance of integrating machine learning techniques in ETc estimation, providing greenhouse operators with adaptive, resource-efficient tools for managing water use in controlled agricultural settings.

Articles in Press, Accepted Manuscript
Available Online from 11 December 2024
  • Receive Date: 10 November 2024
  • Revise Date: 08 December 2024
  • Accept Date: 11 December 2024