Monthly prediction of pan evaporation using individual and combined approaches of data mining models in arid regions

Document Type : Research/Original/Regular Article

Authors

1 Assistant Professor, Department of Desert Management & Control, Faculty of Natural Resource, Higher Educational Complex of Saravan

2 Assistant Professor (Corresponding Author), Department of Desert Management & Control, Faculty of Natural Resource, Higher Educational Complex of Saravan, Saravan, Iran

3 Higher Educational Complex of Saravan, Pasdaran Street, Saravan city, Sistan va Baluchestan Province, Iran

Abstract

Introduction
Evaporation, the process by which water molecules escape a surface after absorbing sufficient energy to overcome vapor pressure, is a major contributor to water scarcity, especially in arid and semi-arid regions where heat readily facilitates this escape. Accurately estimating evaporation losses is crucial for effective water resource management, crop water demand prediction, and irrigation scheduling. Machine learning (ML) has emerged as a powerful tool for tackling the complex and stochastic nature of environmental problems. ML models excel at identifying relationships between predictor variables and outcomes (predictands), often surpassing traditional methods. However, their performance can vary depending on input factors and climatic conditions. Recently, hybrid techniques that combine multiple models have gained traction in climate and hydrology studies. These techniques leverage the strengths of different approaches within a single algorithm, potentially capturing more complex patterns in data series. This research explores the potential of several individual ML models and proposes a novel hybrid approach for estimating pan evaporation in Sistan and Baluchistan Province.
 
Materials and Methods
This study investigates pan evaporation simulation and prediction in Sistan and Baluchistan Province, Iran. Synoptic station data (1980-2019) served as model inputs, while pan evaporation measurements from these stations provided the observed values. In the individual approach, eight data mining models were used to simulate and predict pan evaporation. In addition, the Voting Ensemble for Deep Learning (VEDL) approach was used to build a hybrid model that combines these eight individual models. For regression problems, this hybrid approach combines the estimates of all member models into a single estimate; such ensembles are called voting regressors (VRs). There are two ways to assign votes: average voting (AV) and weighted voting (WV). In AV, all weights are equal. A disadvantage of AV is that every model in the ensemble is treated as equally effective, which is unlikely, especially when different machine learning algorithms are used. WV specifies a weight coefficient for each ensemble member. The weight can be a floating-point number between zero and one, in which case the weights sum to one, or an integer starting at one that denotes the number of votes given to the corresponding ensemble member. Here, the weight of each model was selected according to its accuracy, using the evaluation criteria obtained when the individual models were trained. Model performance was assessed using statistical measures, including R², RMSE, and MAE, together with the Taylor diagram.
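The difference between average voting and weighted voting, and the error measures named above, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy numbers, and the choice of training scores as weights are all assumptions made for illustration, and R² is computed here as the coefficient of determination, which may differ from the definition used in the study.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def normalise(scores):
    """Scale non-negative scores so that they sum to one."""
    total = sum(scores)
    return [s / total for s in scores]


def vote(predictions, weights=None):
    """Combine per-model predictions (one list per model).

    With weights=None every model gets the same weight (average voting);
    otherwise the weights are normalised and applied (weighted voting).
    Integer vote counts and fractional weights are both accepted.
    """
    n_models = len(predictions)
    if weights is None:
        weights = [1.0] * n_models
    weights = normalise(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions))
        for i in range(n_samples)
    ]


# Toy values for illustration only; they are not data from the study.
observed = [100.0, 200.0, 300.0, 400.0]
model_a = [110.0, 190.0, 310.0, 390.0]
model_b = [80.0, 220.0, 280.0, 420.0]

# Hypothetical training scores used as weights (assumption: higher is better).
train_scores = [0.9, 0.6]

average_vote = vote([model_a, model_b])
weighted_vote = vote([model_a, model_b], weights=train_scores)

for name, pred in [("AV", average_vote), ("WV", weighted_vote)]:
    print(name, round(rmse(observed, pred), 3), round(mae(observed, pred), 3),
          round(r2(observed, pred), 3))
```

In this toy case the more accurate member receives the larger weight, so the weighted vote ends up closer to the observations than the plain average. With real data the gain depends on how well the training scores reflect each model's skill on unseen months.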
 
Results and Discussion
All models exhibited excellent performance during training and testing. The Artificial Neural Network (ANN) achieved the highest accuracy in both phases at the Zahedan station (R² = 0.89, RMSE = 45.95 in training; R² = 0.96, RMSE = 44.18 in validation) and emerged as the best model for monthly pan evaporation prediction at this station. Other models also performed well, with the Support Vector Machine (SVM) and Random Forest (RF) models achieving R² values of 0.89 and 0.88 in training, respectively. Notably, the BART model ranked second in validation (R² = 0.96). The Tree Model (TM) had the lowest accuracy (R² = 0.84 and 0.93 in training and validation, respectively). Across all stations, ANN, SVM, and RF consistently delivered the best results in both training and testing. In the test phase, the SVM model outperformed the others at the Khash, Iranshahr, and Chabahar stations (R² = 0.94, 0.96, and 0.94, respectively). At the Saravan station, the RF model achieved the highest R² (0.94) during testing. To develop a hybrid data mining model, the Voting Ensemble for Deep Learning (VEDL) technique was employed with weighted voting in the training stage. The combined model significantly improved upon the best individual model: RMSE decreased from 45.95 to 33.1, R² increased from 0.89 to 0.94, and MAE decreased from 32.92 to 23.9. Evaluation using the Taylor diagram further confirmed the superior performance of the VEDL model compared to the individual ANN model.
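To put the reported gain of the combined model in relative terms, the short Python sketch below computes the signed percentage change for each measure. The input values are the ones quoted in the text; the helper name `relative_change` is hypothetical and only serves this illustration.

```python
def relative_change(before, after):
    """Signed relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (best individual model, combined model) pairs quoted in the text.
reported = {
    "RMSE": (45.95, 33.1),
    "R2": (0.89, 0.94),
    "MAE": (32.92, 23.9),
}

for metric, (individual, combined) in reported.items():
    print(f"{metric}: {relative_change(individual, combined):+.1f}%")
```

A negative change is an improvement for the error measures (RMSE, MAE), while a positive change is an improvement for R².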
 
Conclusion
Among all models, ANN, SVM, and RF performed best in both the training and validation stages. In the validation stage, the SVM model performed best at the Khash, Iranshahr, and Chabahar stations, with R² values of 0.94, 0.96, and 0.94, respectively. At the Saravan station, the RF model performed best in the validation stage, with an R² of 0.94. The strong performance of the models in both training and validation is another finding of this research. These results are consistent with those of researchers who have reported that machine learning models estimate evaporation and evapotranspiration efficiently in different climatic regions of Iran. The combined model improved on the best individual model: RMSE decreased from 45.95 to 33.1, R² increased from 0.89 to 0.94, and MAE decreased from 32.92 to 23.9. The use of the VEDL approach to estimate pan evaporation is new and has not been reported in previous studies. Therefore, based on the results of this research, the proposed VEDL model is recommended for estimating evaporation in arid and semi-arid areas in support of water resources management and agricultural planning.

Keywords

Main Subjects

