Explainable AI reveals hidden thresholds driving sudden urban water disasters


CO-EDP, VisionRI | Updated: 04-12-2025 11:14 IST | Created: 04-12-2025 11:14 IST

Extreme rainfall, rapid urbanization and aging drainage systems are pushing many major cities into higher levels of pluvial flood risk, raising the urgency for data-driven planning tools that can pinpoint the conditions that trigger street-level flooding. 

A recent study titled “Leveraging Explainable Artificial Intelligence for Place-Based and Quantitative Strategies in Urban Pluvial Flooding Management,” published in the ISPRS International Journal of Geo-Information, uses advanced machine learning and explainable AI techniques to quantify how surface patterns, vegetation levels and built-environment characteristics influence urban flooding. The findings are based on high-resolution data from the central urban districts of Guangzhou, China, a region where dense development and seasonal rainstorms routinely converge to produce severe pluvial flooding.

Machine learning reveals nonlinear flood drivers

The study is based on Guangzhou’s six core districts, which represent one of China’s most densely populated and economically active urban regions. These districts share a long history of recurring pluvial flooding driven by intense summer rainfall, uneven urbanization patterns and outdated drainage systems in older neighborhoods. While previous studies have used statistical or machine learning methods to map susceptibility, the authors argue those models often fall short because they treat the city as a uniform unit rather than a patchwork of highly varied micro-environments.

To address this, the authors trained three ensemble learning models, Random Forest, Gradient Boosting Decision Trees and XGBoost, using detailed environmental, hydrological, topographic and land-use datasets. They then applied SHAP, an explainable artificial intelligence method, to interpret how each variable influenced the model’s predicted flooding probability. The interpretability step is central to the study because SHAP calculates the contribution of each variable at the local scale, allowing the team to pinpoint which factors dominate in each grid cell.
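To make the idea of a per-variable contribution concrete, the following is a minimal sketch of exact Shapley values for a single grid cell. It is not the study's pipeline: `toy_model`, its coefficients, the feature names and the single `baseline` reference point are all hypothetical, and SHAP implementations for tree ensembles use faster algorithms and average over a background dataset rather than one baseline.

```python
from itertools import combinations
from math import factorial

FEATURES = ["isd", "kndvi", "elevation"]  # hypothetical feature names


def toy_model(x):
    """Hypothetical susceptibility score; NOT the study's trained model."""
    score = 0.2
    score += 0.5 * x["isd"]
    score -= 0.4 * x["kndvi"]
    score += 0.3 * x["isd"] * (1.0 - x["kndvi"])  # interaction term
    score -= 0.1 * x["elevation"]
    return score


def value(subset, x, baseline):
    """Model output when only the features in `subset` take the cell's values."""
    mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
    return toy_model(mixed)


def shapley_values(x, baseline):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = value(set(subset) | {f}, x, baseline)
                without_f = value(set(subset), x, baseline)
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


# Hypothetical grid cell and reference point.
cell = {"isd": 0.9, "kndvi": 0.1, "elevation": 0.2}
baseline = {"isd": 0.5, "kndvi": 0.3, "elevation": 0.5}
phi = shapley_values(cell, baseline)
```

The contributions in `phi` sum to the difference between the model output for the cell and for the baseline, which is the property that lets each cell's prediction be split among its variables.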

The XGBoost model achieved the strongest performance, with high accuracy and stable predictive behavior across all tests. Using this model, the authors produced a high-resolution flood susceptibility map showing that extremely high-risk areas cluster across much of the old town and southern new town regions, while the mountainous portions of the developing area reflect very low susceptibility.

The SHAP analysis revealed that human-altered land surface variables, especially impervious surface density (ISD) and vegetation (measured through kNDVI), exert far greater influence on flood susceptibility than climate or terrain within the central districts. This distinction arises because rainfall patterns and topography vary little inside the urban core, while land-use patterns differ sharply across neighborhoods shaped by different construction eras and planning philosophies.

The machine learning model also uncovered nonlinear effects that traditional statistical approaches struggle to detect. Impervious surface density, for example, does not increase flood risk at a constant rate. Instead, two threshold points emerged: below 0.30, ISD mitigates flooding; between 0.30 and 0.80, its influence stabilizes; above 0.80, susceptibility rises sharply. Vegetation was also nonlinear but showed a single threshold at a kNDVI of 0.25: values below this level worsen flooding, while values above it provide consistent protection.
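The reported thresholds can be read as a simple lookup. The sketch below encodes the three ISD regimes and the two kNDVI regimes as described above; the function names, the regime labels and the handling of values that fall exactly on a threshold are assumptions made for illustration.

```python
# Threshold values as reported in the article.
ISD_LOW, ISD_HIGH = 0.30, 0.80
KNDVI_THRESHOLD = 0.25


def isd_regime(isd):
    """Classify an impervious surface density value into one of three regimes."""
    if isd < ISD_LOW:
        return "mitigating"
    if isd <= ISD_HIGH:  # boundary handling is an assumption
        return "stable"
    return "sharply increasing"


def kndvi_regime(kndvi):
    """Classify a kNDVI value relative to the single vegetation threshold."""
    return "worsening" if kndvi < KNDVI_THRESHOLD else "protective"


# Hypothetical grid cell: heavily paved, sparse vegetation.
regimes = (isd_regime(0.92), kndvi_regime(0.10))
```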

These thresholds provide a quantitative basis for decision-making that was previously unavailable. They are derived from the model’s learned behavior, as interpreted through SHAP, and were validated with a statistical bootstrap analysis, which showed them to be stable across iterations.
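The article does not describe the exact bootstrap procedure, so the following is only a sketch of how threshold stability might be checked in general. It uses synthetic (kNDVI, contribution) pairs with a step at 0.25, estimates the threshold with a single best split (`best_split`, a hypothetical helper), and repeats the estimate on resampled data. The data, noise level, sample sizes and split method are all assumptions.

```python
import random
import statistics

TRUE_THRESHOLD = 0.25  # kNDVI threshold reported in the article


def best_split(points):
    """Return the x threshold whose two-level step fit has the lowest squared error."""
    pts = sorted(points)
    n = len(pts)
    total_sum = sum(y for _, y in pts)
    total_sq = sum(y * y for _, y in pts)
    best_sse, best_thr = float("inf"), None
    left_sum = left_sq = 0.0
    for i in range(1, n):
        y = pts[i - 1][1]
        left_sum += y
        left_sq += y * y
        if pts[i][0] == pts[i - 1][0]:
            continue  # cannot split between identical x values
        right_sum = total_sum - left_sum
        right_sq = total_sq - left_sq
        sse = (left_sq - left_sum ** 2 / i) + (right_sq - right_sum ** 2 / (n - i))
        if sse < best_sse:
            best_sse = sse
            best_thr = (pts[i - 1][0] + pts[i][0]) / 2
    return best_thr


# Synthetic data: positive contribution below the threshold, negative above it.
rng = random.Random(0)
data = []
for _ in range(300):
    x = rng.uniform(0.0, 0.6)
    level = 0.3 if x < TRUE_THRESHOLD else -0.2
    data.append((x, level + rng.gauss(0.0, 0.05)))

# Re-estimate the threshold on 200 resamples drawn with replacement.
estimates = sorted(best_split(rng.choices(data, k=len(data))) for _ in range(200))
median_estimate = statistics.median(estimates)
interval = (estimates[5], estimates[194])  # approximate 95% percentile interval
```

A narrow `interval` around the median estimate is what "stable across iterations" would look like in a check of this kind.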

Local dominant variables show sharp regional differences

The study identifies dominant flood-driving variables at fine spatial scales. By analyzing the top three positive SHAP values for each grid cell, the authors constructed a map showing which factors push susceptibility higher in each location.
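Selecting the top three positive contributions per cell can be sketched as below. The variable names and SHAP values are hypothetical; only the selection rule (keep positive values, rank them, take up to three) follows the description above.

```python
def dominant_variables(shap_row, top_n=3):
    """Return up to `top_n` variable names with the largest positive SHAP values."""
    positive = [(name, v) for name, v in shap_row.items() if v > 0]
    positive.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in positive[:top_n]]


# Hypothetical SHAP values for two grid cells.
paved_cell = {
    "isd": 0.31,
    "kndvi": 0.12,
    "building_height": 0.05,
    "precipitation": 0.02,
    "elevation": -0.08,
}
hillside_cell = {"isd": -0.20, "kndvi": -0.15, "elevation": -0.30}

paved_top = dominant_variables(paved_cell)
hillside_top = dominant_variables(hillside_cell)
```

Under this rule, a cell whose contributions are all zero or negative returns an empty list, meaning no variable is pushing its susceptibility higher.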

The results highlight stark contrasts across Guangzhou’s districts. In the old town, characterized by aging buildings, higher densities and concentrated commercial areas, impervious surface density is overwhelmingly the primary driver of susceptibility. The lack of vegetative cover and constrained drainage systems amplify this effect. Vegetation (kNDVI) emerges as a frequent secondary or tertiary driver, showing how the scarcity of green spaces compounds underlying vulnerabilities.

In the new town, where high-rise developments and modern commercial hubs dominate, building height becomes a more influential factor. The model identifies building height as a proxy for broader subregional characteristics: low-rise, high-density clusters often indicate insufficient drainage or older infrastructure, while taller planned districts tend to be supported by stronger drainage capacity and more integrated land-use designs.

In the developing area, which includes both expanding built-up zones and extensive mountainous terrain, the dominant drivers shift more toward natural variables such as precipitation levels, distance to water and elevation. Many mountainous zones show no dominant variables at all because their natural characteristics minimize pluvial flooding risks.

These findings demonstrate that no single mitigation strategy can address flood risk across the entire city. Instead, the approach highlights the need for tailored, location-specific interventions based on a neighborhood’s unique environmental and infrastructural profile.

To illustrate how these differences shape outcomes, the authors analyzed three representative areas from the old town, new town and developing area. The old town sample had high SHAP contributions from vegetation, building structure and rainfall, showing that both outdated infrastructure and intensified local rainfall worsen susceptibility. The new town example displayed susceptibility driven by impervious surfaces and modern building patterns, while the developing area sample reflected strong contributions from insufficient vegetation and building height. These patterns reinforce that flooding drivers are not uniform but instead emerge through complex interactions between land use, infrastructure and local climate dynamics.

New thresholds provide roadmap for urban renewal

Building on the model outputs, the authors developed a set of place-based and quantitative recommendations to reduce flooding in Guangzhou’s central districts. These recommendations specify how municipalities should modify land-use variables to bring them within the safer ranges identified by SHAP.

For areas where impervious surfaces dominate flood susceptibility, planners are advised to keep ISD below 0.80, the level above which susceptibility rises sharply. In older neighborhoods where ISD is far higher, interventions such as expanding green corridors, reducing surface paving during redevelopment and redesigning ground-level commercial layouts can help bring the ratio into a safer zone.

Where vegetation is the dominant factor, maintaining kNDVI above 0.25 becomes essential. This threshold reflects the point at which vegetation begins to reliably absorb runoff and improve infiltration. Increasing neighborhood-level greenery, adding micro-parks, and integrating roadside vegetation strips can help achieve this target.

In zones where natural variables rather than human-altered surfaces dominate susceptibility, the findings suggest that drainage infrastructure improvements should be prioritized. Terrain, rainfall and surface hydrology cannot be easily reconfigured, so upgrading drainage capacity becomes the most effective mitigation strategy.

The study also highlights the role of building height (BH). In many low-rise clusters with dense construction, low BH values reflect outdated development patterns and inadequate drainage. In such areas, planners should reduce overcrowding and improve drainage capacity during redevelopment efforts. Conversely, high-rise districts with modern layouts typically show better performance and may require only localized adjustments.
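The variable-specific guidance above can be summarized as a small rule table. The sketch below is an illustrative paraphrase, not the authors' decision procedure: the `recommend` function, the variable names and the wording of each returned string are assumptions, while the 0.80 and 0.25 limits come from the reported thresholds.

```python
ISD_UPPER = 0.80    # reported ISD threshold
KNDVI_LOWER = 0.25  # reported kNDVI threshold
NATURAL_VARIABLES = {"precipitation", "distance_to_water", "elevation"}


def recommend(dominant, cell):
    """Map a cell's dominant variable to a paraphrased intervention."""
    if dominant == "isd":
        if cell["isd"] > ISD_UPPER:
            return "reduce ISD below 0.80 (green corridors, less paving)"
        return "keep ISD below 0.80"
    if dominant == "kndvi":
        if cell["kndvi"] < KNDVI_LOWER:
            return "raise kNDVI above 0.25 (micro-parks, roadside vegetation)"
        return "maintain kNDVI above 0.25"
    if dominant == "building_height":
        # The article ties this to low-rise, densely built clusters.
        return "reduce overcrowding and improve drainage during redevelopment"
    if dominant in NATURAL_VARIABLES:
        return "prioritize drainage infrastructure upgrades"
    return "no rule in this sketch"


# Hypothetical cells.
old_town_advice = recommend("isd", {"isd": 0.92, "kndvi": 0.10})
hillside_advice = recommend("elevation", {"isd": 0.15, "kndvi": 0.55})
```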

To guide citywide planning, the authors provide separate recommendations for the old town, new town and developing areas.

In the old town, where historical patterns have produced concentrated impervious surfaces and limited greenery, increasing pervious surfaces is a top priority. Creating a “high-low-high” pattern of impervious surfaces, where paved areas are broken up by vegetation, can help approach the ideal ISD and kNDVI thresholds. Improving drainage system resilience is also critical due to the influence of the urban heat island and rain island effects, which elevate rainfall intensity.

In the new town, the authors argue that only localized interventions are needed. Modern districts already incorporate more balanced land-use patterns, so strategies such as green roofs and targeted drainage upgrades can help bring ISD and kNDVI values closer to recommended thresholds without major redevelopment.

The developing area requires more comprehensive planning, as rapid construction could lock in future vulnerabilities. The study recommends proactive redesigns of impervious surface distribution, green space planning, building layout and drainage integration to ensure that new districts are resilient from the outset. Coupling SHAP interpretations with hydrological optimization models can support spatially efficient planning during this growth phase.

FIRST PUBLISHED IN: Devdiscourse