Application and Comparison of Missing Groundwater Level Data Interpolation Methods with an Emphasis on DeepMVI Performance (Case Study: Ajabshir Plain)

Document Type : Research/Original/Regular Article

Authors

1 MSc of Water Resources Engineering, Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran

2 Assistant Professor, Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran

3 Associate Professor, Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran

10.22098/mmws.2025.17457.1601

Abstract

Groundwater is a vital water resource, especially in arid and semi-arid regions such as northwestern Iran. It plays a crucial role in agriculture, drinking water supply, and industrial activities. Therefore, reliable monitoring and management of groundwater levels are essential for sustainable development. However, missing data in groundwater level time series, caused by factors such as equipment failure, inaccessible terrain, or extreme weather, can hinder accurate analysis and prediction. To address this, interpolation techniques are used to estimate missing values from observed data. The reliability of these techniques depends on the quantity, spatial distribution, and temporal resolution of the available data. In recent years, machine learning and deep learning methods have shown promise in handling complex, nonlinear, and high-dimensional datasets. This study evaluates the effectiveness of five interpolation methods in reconstructing missing groundwater level data: Kriging, Inverse Distance Weighting (IDW), Piecewise Cubic Hermite Interpolating Polynomial (PCHIP), Random Forest Spatial Interpolation (RFSI), and Deep Missing Value Imputation (DeepMVI). The focus is on improving data completeness and accuracy for subsequent groundwater analyses. The case study is the Ajabshir aquifer, where long-term data from 29 piezometric wells are used. The objective is to compare the performance of traditional and modern interpolation approaches and to determine the most accurate method for handling missing groundwater level data.

Materials and Methods

In this study, monthly groundwater level data from 29 piezometric wells in the Ajabshir aquifer, northwestern Iran, were analyzed over a 17-year period (2006–2022). Various operational and environmental constraints left numerous gaps in the dataset.

To estimate the missing values, five interpolation methods were evaluated: Kriging, Inverse Distance Weighting (IDW), Piecewise Cubic Hermite Interpolating Polynomial (PCHIP), Random Forest Spatial Interpolation (RFSI), and Deep Missing Value Imputation (DeepMVI). Kriging uses semivariograms to model spatial dependence and provides statistically unbiased estimates. IDW is a deterministic technique that weights each known value by the inverse of its distance to the estimation point. PCHIP fits a piecewise cubic curve through the observations of a time series while preserving their monotonicity and continuity. RFSI applies the Random Forest algorithm to capture nonlinear spatial relationships, and DeepMVI uses deep learning to model complex temporal and multivariate dependencies in the data.
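As an illustration of the simplest of these methods, the following is a minimal IDW sketch in plain Python. It is not the implementation used in the study: the function name `idw_estimate`, the power parameter of 2, and the well coordinates and levels are all assumptions chosen for the example.

```python
import math


def idw_estimate(target_xy, wells, power=2.0):
    """Estimate a level at target_xy from (x, y, level) tuples by IDW.

    Each well is weighted by 1 / distance**power; power=2 is a common
    default and an assumption here, not a value taken from the study.
    """
    if not wells:
        raise ValueError("at least one observed well is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, level in wells:
        dist = math.hypot(target_xy[0] - x, target_xy[1] - y)
        if dist == 0.0:
            return level  # target coincides with an observed well
        weight = 1.0 / dist ** power
        weighted_sum += weight * level
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical well coordinates (km) and groundwater levels (m)
wells = [(0.0, 0.0, 1290.0), (4.0, 0.0, 1286.0), (0.0, 3.0, 1294.0)]
estimate = idw_estimate((2.0, 0.0), wells)
print(round(estimate, 2))
```

Because the weights are positive and sum to one after normalization, an IDW estimate always lies between the lowest and highest observed values, which is one reason the method cannot reproduce levels outside the range of neighbouring wells.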

The dataset was randomly divided into training (70%) and testing (30%) subsets. The performance of each method was assessed using the correlation coefficient (R), root mean square error (RMSE), and Nash–Sutcliffe efficiency (NSE).
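The three evaluation metrics can be sketched as follows, assuming their standard textbook definitions (the source does not give the formulas). The function names and the sample values are hypothetical; `obs` holds withheld observations and `sim` the corresponding imputed values.

```python
import math


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and imputed values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def rmse(obs, sim):
    """Root mean square error, in the same units as the data."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_o = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - sse / sst


# Hypothetical withheld observations and imputed values (m)
observed = [1290.2, 1289.8, 1289.1, 1288.7]
imputed = [1290.0, 1289.9, 1289.3, 1288.5]
print(pearson_r(observed, imputed), rmse(observed, imputed), nse(observed, imputed))
```

Under these definitions, NSE drops below zero whenever a method predicts worse than simply using the mean of the observations, so it penalizes bias that the correlation coefficient alone would miss.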

Results and Discussion

The evaluation results show significant variation in model accuracy. The Kriging method, while widely used, showed poor performance in this study due to the sparse and irregular distribution of observation wells. Its results included a low correlation (R = 0.37), high RMSE (417.91), and low NSE (0.11), indicating that this method is not suitable under conditions with extensive missing data and limited spatial continuity. The IDW method improved over Kriging but still yielded moderate accuracy (R = 0.56, RMSE = 365.51, NSE = 0.30).

The PCHIP method performed considerably better, reflecting its ability to handle temporal data smoothly. It achieved R = 0.89, RMSE = 7.52, and NSE = 0.72, making it the second most accurate method. The method preserved the shape of the original groundwater level trends and was effective in reconstructing long sequences of missing data. The RFSI method, which leverages machine learning, showed better accuracy than Kriging and IDW (R = 0.63, RMSE = 11.06, NSE = 0.40), although it was outperformed by PCHIP and DeepMVI. This suggests that while machine learning can improve performance, spatial interpolation with sparse data remains challenging. The DeepMVI method outperformed all other methods, achieving the highest correlation (R = 0.92), lowest RMSE (6.44), and highest NSE (0.80). Its ability to capture both spatial and temporal relationships using a hybrid deep neural architecture makes it highly effective in imputing missing groundwater data, especially when the dataset includes complex time-dependent patterns and multivariate interactions.
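The shape-preserving behaviour attributed to PCHIP above can be illustrated with a small, self-contained sketch. It is a simplified PCHIP-style interpolant rather than the study's code: interior slopes use the weighted harmonic mean of neighbouring secants, while the end slopes are simply set to the end secants (library implementations use a three-point rule instead). All function names and the monthly series are hypothetical.

```python
import bisect


def pchip_slopes(t, y):
    """Slopes at each observation; needs at least two points."""
    n = len(t)
    h = [t[i + 1] - t[i] for i in range(n - 1)]
    d = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]
    m = [0.0] * n
    m[0], m[-1] = d[0], d[-1]  # simplified end slopes (assumption)
    for i in range(1, n - 1):
        if d[i - 1] * d[i] > 0:  # same sign: harmonic mean keeps monotonicity
            w1 = 2 * h[i] + h[i - 1]
            w2 = h[i] + 2 * h[i - 1]
            m[i] = (w1 + w2) / (w1 / d[i - 1] + w2 / d[i])
        # otherwise a local extremum or flat segment: slope stays 0
    return m


def pchip_eval(t, y, m, tq):
    """Evaluate the cubic Hermite piece that contains tq."""
    i = min(max(bisect.bisect_right(t, tq) - 1, 0), len(t) - 2)
    h = t[i + 1] - t[i]
    s = (tq - t[i]) / h
    h00 = (1 + 2 * s) * (1 - s) ** 2
    h10 = s * (1 - s) ** 2
    h01 = s * s * (3 - 2 * s)
    h11 = s * s * (s - 1)
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]


def fill_gaps(months, levels):
    """Replace None entries that lie between observed months."""
    t = [mth for mth, lv in zip(months, levels) if lv is not None]
    y = [lv for lv in levels if lv is not None]
    m = pchip_slopes(t, y)
    return [lv if lv is not None else pchip_eval(t, y, m, mth)
            for mth, lv in zip(months, levels)]


# Hypothetical monthly levels (m) for one well; None marks a missing month
months = [0, 1, 2, 3, 4, 5, 6, 7]
levels = [1290.0, 1289.6, None, None, 1288.1, 1288.0, None, 1288.4]
filled = fill_gaps(months, levels)
print([round(v, 2) for v in filled])
```

In this sketch the values filled in for months 2 and 3 stay between the neighbouring observations and continue the declining trend, with no overshoot of the kind an unconstrained cubic spline can produce. Such an interpolant relies only on a single well's own record, so it cannot recover fluctuations that occurred entirely inside a long gap.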

The final comparison of time series plots across 29 piezometric wells also visually confirmed the accuracy of the DeepMVI model in maintaining original trends and minimizing noise or abrupt changes. These results demonstrate that deep learning models offer a promising approach for improving the quality and reliability of groundwater monitoring datasets.

Conclusion

This research evaluated the performance of five interpolation methods for reconstructing missing groundwater level data from 29 piezometric wells in the Ajabshir aquifer over a 17-year period. Among the methods tested, DeepMVI outperformed all others, providing the most accurate and reliable results. Its ability to model complex temporal and spatial dependencies makes it particularly suitable for environmental datasets with high variability and missing values. PCHIP also performed well and RFSI achieved moderate accuracy; both could serve as alternatives when deep learning infrastructure is not available. Although Kriging and IDW are widely used in hydrogeological studies, their lower performance in this study suggests that their application may be limited under conditions of sparse or irregular data. The study highlights the importance of selecting appropriate interpolation methods based on data characteristics. DeepMVI, with its robust architecture, holds significant promise for future groundwater studies and can enhance the quality of groundwater monitoring systems by providing more complete and accurate datasets. This, in turn, can improve water resource management and planning in regions facing water scarcity and environmental stress.

Articles in Press, Accepted Manuscript
Available Online from 07 June 2025
  • Receive Date: 18 May 2025
  • Revise Date: 07 June 2025
  • Accept Date: 07 June 2025