Machine learning dives deep, but groundwater research still runs dry on collaboration
Machine learning has enormous potential to advance groundwater sustainability. But scientific gains must be matched by global collaboration and policy relevance, the study concluded.

Global research into groundwater is being reshaped by machine learning, but a lack of international cooperation could hamper its effectiveness, a new study warns.
The analysis, titled "Overviewing the Machine Learning Utilization on Groundwater Research Using Bibliometric Analysis" and published in Water, reviewed 1,797 peer-reviewed publications from 2000 to 2023 and found a surge in machine learning (ML) applications aimed at predicting water quality, modeling contamination, and optimizing groundwater resource management. However, only 2.4% of the studies involved international co-authorship, despite the transboundary nature of aquifers and shared water crises.
Researchers from Istanbul Technical University, the University of Silesia, and National Cheng Kung University conducted the bibliometric study to evaluate global trends, institutional output, and thematic gaps in ML-groundwater research. The report cites a 32% annual growth rate in related publications and a significant shift toward deep learning and hybrid modeling techniques.
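For readers curious how headline figures of this kind are typically derived, the short Python sketch below computes an international co-authorship share and an annual growth rate from a list of publication records. It is an illustration only: the records are invented toy data with placeholder country codes, the function names are hypothetical, and the use of a compound annual growth rate is an assumption, since the article does not say which formula the study used.

```python
from collections import Counter

# Invented toy records, NOT data from the study:
# (publication year, set of placeholder author-country codes).
records = [
    (2020, {"A"}),
    (2020, {"B", "C"}),
    (2021, {"D"}),
    (2021, {"A"}),
    (2021, {"E", "F", "G"}),
    (2022, {"C"}),
    (2022, {"A"}),
    (2022, {"D", "A"}),
    (2022, {"B"}),
]


def international_share(records):
    """Percentage of papers whose authors span more than one country."""
    multi = sum(1 for _, countries in records if len(countries) > 1)
    return 100.0 * multi / len(records)


def annual_growth_rate(records):
    """Compound annual growth rate (%) between the first and last year's counts.

    Assumption: growth is measured as (last / first) ** (1 / periods) - 1.
    """
    per_year = Counter(year for year, _ in records)
    first, last = min(per_year), max(per_year)
    periods = last - first
    return 100.0 * ((per_year[last] / per_year[first]) ** (1.0 / periods) - 1.0)


share = international_share(records)
growth = annual_growth_rate(records)
print(f"International co-authorship: {share:.1f}%")
print(f"Annual growth rate: {growth:.1f}%")
```

On the toy data, three of nine papers have authors from more than one country, and yearly output rises from two papers to four over two years. A real bibliometric pipeline would first need to parse author affiliations from database exports, which this sketch leaves out.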
China and the United States led the field in total publications. China produced over 300 papers but ranked low in international collaboration. The United States contributed roughly 250 papers and showed stronger multi-country participation. Iran and India also posted high output, with Iran ranking second in citations.
The authors noted that publication growth has not been matched by cooperative research. “Despite an expanding body of work, collaboration across national borders remains rare,” the report said. The researchers called for expanded partnerships, particularly in regions where groundwater sources are shared across boundaries.
Jilin University in China, the University of Tehran, and the University of California ranked as the most productive institutions. The Journal of Hydrology, Science of the Total Environment, and Water Resources Research were identified as the most influential journals in the field. Hydrology and environmental science dominated the publishing landscape, but a growing presence of interdisciplinary sources like Remote Sensing indicated increased cross-sector integration.
The analysis identified “machine learning” and “groundwater” as the most frequent keywords, followed by “random forest,” “support vector machine,” “artificial neural networks,” and “groundwater potential.” Tools such as GIS, remote sensing, and water quality indices have become standard in ML-groundwater modeling. Deep learning approaches, especially convolutional and recurrent neural networks, are gaining traction due to their ability to process complex spatial and temporal data.
The report also found that machine learning is being widely applied to predict aquifer behavior, monitor contamination, and assess groundwater vulnerability. GRACE (Gravity Recovery and Climate Experiment) and InSAR (Interferometric Synthetic Aperture Radar) datasets are increasingly being used in conjunction with ML tools for large-scale hydrological forecasting.
Despite these advances, the authors warned of a critical weakness: interpretability. Many high-performing models, particularly deep learning systems, function as “black boxes,” offering predictions without insight into how those conclusions were reached. The study noted that while accuracy remains high, explainability remains low, which undermines adoption by policymakers and water management agencies.
"Decision-makers need transparency. Models must provide interpretable outputs to be trusted and applied in real-world governance," the report said.
The researchers highlighted SHAP (Shapley Additive Explanations) and permutation importance methods as promising approaches to increase model explainability. However, they noted limited use of these techniques in current literature.
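The idea behind permutation importance can be shown in a few lines of Python. The sketch below is a generic illustration, not the study's method or any library's API: it shuffles one input column at a time and records how much the model's error grows. The data are synthetic, the three features are hypothetical, and the `predict` function is a stand-in for a trained model.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving the other features untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            score = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data (an assumption for illustration): the target depends strongly
# on feature 0, weakly on feature 1, and not at all on feature 2.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
y = [5.0 * row[0] + 0.5 * row[1] for row in X]


def predict(row):
    """Stand-in for a trained model; here it reproduces the true relationship."""
    return 5.0 * row[0] + 0.5 * row[1]


importances = permutation_importance(predict, X, y)
for name, value in zip(["feature_0", "feature_1", "feature_2"], importances):
    print(f"{name}: {value:.4f}")
```

A feature the model relies on heavily produces a large error increase when shuffled, while one the model ignores produces none. That ranking is the kind of interpretable output the report says decision-makers need.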
Another barrier identified was uneven access to computing infrastructure. While high-resource institutions increasingly adopt deep learning, researchers in the Global South face constraints due to limited hardware, datasets, and technical training. The authors warned that this imbalance could widen the global research gap and limit ML’s real-world application in the most water-stressed regions.
The study also raised concerns about data fragmentation. With few shared datasets or standardized modeling protocols, comparisons across regions and studies remain difficult. The authors called for global repositories of open-access groundwater data to support reproducibility and interoperability.
Recommendations include expanding international co-authorship, improving model transparency, integrating ML with physics-based hydrological models, and increasing funding for capacity building in low-income countries.
The researchers said future efforts should focus on transboundary aquifer systems and encourage closer collaboration between hydrologists, AI developers, and public policy institutions. They also proposed that institutions and journals adopt reporting standards that include model explainability and dataset documentation.
Machine learning has enormous potential to advance groundwater sustainability. But scientific gains must be matched by global collaboration and policy relevance, the study concluded.
First published in: Devdiscourse