Machine Learning Models for Urban Sewers: Balancing Predictive Power and Real-World Resilience

Researchers from Berlin University of Applied Sciences and Technology, University of Duisburg-Essen, and Einstein Center Digital Future evaluated neural network models for forecasting urban sewer overflows, balancing predictive accuracy, computational complexity, and resilience to real-world data disruptions. Their findings suggest combining global models like TFT for precision with local models like N-HiTS for robust, decentralized wastewater management.


CoE-EDP, VisionRI | Updated: 29-04-2025 08:42 IST | Created: 29-04-2025 08:42 IST

Urbanization is reshaping the world at an unprecedented pace, and the aging infrastructures of our cities are struggling to cope with the rising challenges of climate change. Nowhere is this stress more evident than in Combined Sewer Systems (CSS), originally designed decades ago to transport both stormwater and wastewater through a single network of pipes. With extreme rainfall events growing more frequent, CSS networks are increasingly overwhelmed, resulting in overflows that discharge untreated sewage into natural water bodies, endangering both ecosystems and public health. A new study, led by researchers from the Berlin University of Applied Sciences and Technology, the University of Duisburg-Essen, and the Einstein Center Digital Future in Berlin, tackles this escalating threat with the help of machine learning (ML). Moving beyond the limitations of traditional hydraulic modeling, researchers Vipin Singh, Tianheng Ling, Teodor Chiaburu, and Felix Biessmann propose a modern approach by systematically evaluating neural network (NN) architectures for CSS forecasting, seeking models that offer both high predictive performance and real-world resilience.

Global Versus Local: Two Paths Toward Predictive Wastewater Management

Rather than focusing purely on accuracy under ideal conditions, the study ventures boldly into real-world scenarios where sensors fail, communication networks break down, and environmental conditions become chaotic. The research compares two primary strategies: global models, which aggregate system-wide sensor data and incorporate weather forecasts, and local models, which depend only on individual sensors, providing a vital fallback when broader network data is unavailable. To find the best-performing approach, the team evaluated six advanced neural architectures: the Long Short-Term Memory network (LSTM), the probabilistic DeepAR, the efficient feed-forward N-HiTS, the attention-based Transformer, the convolution-driven Temporal Convolutional Network (TCN), and the hybrid Temporal Fusion Transformer (TFT).
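
As an illustration of how such a model comparison can be assembled, the sketch below instantiates the six architectures using the open-source darts forecasting library as a stand-in. The library choice, the lookback window, and the forecast horizon are assumptions for illustration, not details taken from the paper.

```python
# Illustrative sketch: the six architectures compared in the study,
# instantiated via the darts library. Hyperparameters are placeholders.
from darts.models import (
    RNNModel,          # LSTM (recursive, one step at a time)
    NHiTSModel,        # N-HiTS (multi-step feed-forward)
    TCNModel,          # Temporal Convolutional Network
    TransformerModel,  # attention-based Transformer
    TFTModel,          # Temporal Fusion Transformer (hybrid)
)
from darts.utils.likelihood_models import GaussianLikelihood

LOOKBACK = 72  # hypothetical 72-hour input window
HORIZON = 24   # hypothetical 24-hour forecast horizon

models = {
    "LSTM": RNNModel(model="LSTM", input_chunk_length=LOOKBACK,
                     training_length=LOOKBACK + HORIZON),
    # DeepAR-style: an autoregressive RNN with a probabilistic output head
    "DeepAR": RNNModel(model="LSTM", input_chunk_length=LOOKBACK,
                       training_length=LOOKBACK + HORIZON,
                       likelihood=GaussianLikelihood()),
    "N-HiTS": NHiTSModel(input_chunk_length=LOOKBACK, output_chunk_length=HORIZON),
    "TCN": TCNModel(input_chunk_length=LOOKBACK, output_chunk_length=HORIZON),
    "Transformer": TransformerModel(input_chunk_length=LOOKBACK,
                                    output_chunk_length=HORIZON),
    # TFT can run without explicit future covariates via a relative time index
    "TFT": TFTModel(input_chunk_length=LOOKBACK, output_chunk_length=HORIZON,
                    add_relative_index=True),
}
```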

Working with a rich dataset provided by Wirtschaftsbetriebe Duisburg, the team analyzed three years of hourly-resampled time series data from Duisburg’s Vierlinden district, covering 35 diverse sensor types and weather forecast inputs. After carefully preprocessing the data to handle missing and irregular readings, the researchers split it into training, validation, and test sets in chronological order to preserve seasonal patterns. Extensive hyperparameter optimization was conducted using the Optuna framework, ensuring that every model was tested under rigorously fair conditions. To gauge reliability, each model was trained 100 times with different random initializations, generating a comprehensive picture of performance variability.
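
A minimal sketch of this evaluation protocol is shown below, assuming a pandas time series and a stand-in build_and_evaluate training routine. The split ratios, search space, and trial budget are illustrative assumptions, not the paper's exact settings.

```python
# Sketch of the protocol: chronological split + Optuna hyperparameter search.
import optuna
import pandas as pd

def chronological_split(df: pd.DataFrame, train=0.7, val=0.15):
    """Split a time-indexed frame in temporal order to preserve seasonality."""
    n = len(df)
    i, j = int(n * train), int(n * (train + val))
    return df.iloc[:i], df.iloc[i:j], df.iloc[j:]

def build_and_evaluate(params: dict) -> float:
    """Stand-in for training one architecture and returning validation error.
    The real pipeline would also repeat training across random seeds."""
    return (params["lr"] - 1e-3) ** 2 + params["dropout"] * 0.01

def objective(trial: optuna.Trial) -> float:
    # Hypothetical search space; the study tuned each architecture separately.
    params = {
        "hidden_size": trial.suggest_int("hidden_size", 16, 256, log=True),
        "lr": trial.suggest_float("lr", 1e-4, 1e-2, log=True),
        "dropout": trial.suggest_float("dropout", 0.0, 0.4),
    }
    return build_and_evaluate(params)  # validation error, lower is better

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)  # hypothetical trial budget
```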

The Race for Accuracy: Who Wins During Critical Overflows?

When evaluating predictive performance across the full test set, global models generally outperformed their local counterparts, particularly during critical peak overflow events when system management is most urgent. Among the global models, the Temporal Fusion Transformer (TFT) proved to be a standout performer, offering consistently superior forecasts, especially during the riskiest periods. However, during calmer periods of sewer system operation, local models such as N-HiTS and TFT demonstrated predictive capability nearly equivalent to that of the global models. This finding is pivotal because it implies that decentralized models, critical for resilience during outages, can still deliver operationally useful forecasts without the need for full network data.

Peak-event analysis further revealed the strengths and weaknesses of each architecture. Recursive models like LSTM and DeepAR, which predict one step at a time, showed vulnerability to error accumulation. In contrast, multi-step forecasting models such as N-HiTS, TCN, and TFT handled these periods more robustly, maintaining accuracy over longer forecast horizons.
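
The difference is easy to see in a toy numpy sketch: a recursive forecaster feeds its own noisy predictions back in as inputs, so each step's error contaminates every later step, while a direct multi-step forecaster emits the whole horizon at once. The persistence-style predictors below are illustrative stand-ins, not the study's models.

```python
# Toy contrast between recursive and multi-step forecasting.
import numpy as np

rng = np.random.default_rng(0)
HORIZON = 24

def one_step(history: np.ndarray) -> float:
    """Toy one-step predictor: persistence plus model error."""
    return history[-1] + rng.normal(scale=0.1)

def recursive_forecast(history: np.ndarray) -> np.ndarray:
    """Predict one step at a time, feeding predictions back as inputs."""
    h = list(history)
    out = []
    for _ in range(HORIZON):
        y = one_step(np.asarray(h))
        out.append(y)
        h.append(y)  # the error added here propagates to all later steps
    return np.asarray(out)

def multi_step_forecast(history: np.ndarray) -> np.ndarray:
    """Predict the full horizon in one shot; per-step errors stay independent."""
    return history[-1] + rng.normal(scale=0.1, size=HORIZON)
```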

Bracing for Chaos: Testing Models Under Data Corruption

Acknowledging that real-world data is rarely perfect, the researchers stressed the models further by introducing realistic data perturbations: injecting outliers, inserting missing values, and clipping sensor readings. This robustness testing revealed significant differences among the models. Multi-step models such as TCN, N-HiTS, and TFT proved far more resilient to corruption, while LSTM and DeepAR struggled, seeing substantial accuracy declines under perturbations. TCN emerged as the most robust model overall, closely followed by N-HiTS and TFT, making them ideal candidates for deployment in unpredictable environments.
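
The three perturbation families can be sketched in a few lines of numpy; the corruption rates and magnitudes below are assumed for illustration and are not the values used in the study.

```python
# Illustrative sensor-data perturbations: outliers, gaps, and clipping.
import numpy as np

rng = np.random.default_rng(42)

def inject_outliers(x: np.ndarray, rate=0.01, scale=5.0) -> np.ndarray:
    """Replace a random fraction of readings with implausibly large spikes."""
    y = x.astype(float).copy()
    idx = rng.random(len(y)) < rate
    y[idx] += scale * y.std() * rng.choice([-1, 1], size=idx.sum())
    return y

def insert_missing(x: np.ndarray, rate=0.05) -> np.ndarray:
    """Drop a random fraction of readings, as a failing sensor would."""
    y = x.astype(float).copy()
    y[rng.random(len(y)) < rate] = np.nan
    return y

def clip_readings(x: np.ndarray, lo_q=0.05, hi_q=0.95) -> np.ndarray:
    """Saturate readings at quantile bounds, mimicking a sensor's range limit."""
    return np.clip(x, np.quantile(x, lo_q), np.quantile(x, hi_q))
```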

To present a balanced view of model viability, the study introduced two composite metrics: the Computational Complexity Index (CCI) and the Robustness Index (RI). These indices allowed the team to evaluate not just raw prediction accuracy, but the crucial trade-offs among computational burden, predictive stability, and resilience to corrupted data. Local models like N-HiTS stood out for their combination of lightweight computational demands, strong predictive performance, and outstanding resilience. Among global models, TFT again distinguished itself with excellent prediction and robustness but required significant computational resources, making it less practical for lightweight, decentralized deployment.
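
The paper's exact formulas for the CCI and RI are not reproduced here, but the general idea of a composite index, normalizing several raw measurements across models and combining them, can be illustrated as follows. Every function name, weight, and number in this sketch is hypothetical.

```python
# Hypothetical composite indices in the spirit of the study's CCI and RI.
import numpy as np

def _minmax(v: np.ndarray) -> np.ndarray:
    """Min-max normalize a metric across models so values land in [0, 1]."""
    return (v - v.min()) / (v.max() - v.min())

def complexity_index(n_params: np.ndarray, latency_ms: np.ndarray) -> np.ndarray:
    """CCI-style score: average of normalized model size and inference latency."""
    return (_minmax(n_params) + _minmax(latency_ms)) / 2.0

def robustness_index(clean_err: np.ndarray, perturbed_err: np.ndarray) -> np.ndarray:
    """RI-style score: 1 minus the normalized error increase under
    perturbation, so more robust models score closer to 1."""
    return 1.0 - _minmax(perturbed_err - clean_err)

# Made-up measurements for three unnamed models, purely to exercise the code.
cci = complexity_index(np.array([0.2e6, 0.5e6, 2.0e6]), np.array([4.0, 12.0, 40.0]))
ri = robustness_index(np.array([0.25, 0.30, 0.22]), np.array([0.30, 0.55, 0.28]))
```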

Toward Smarter Cities: A Blueprint for the Future

Ultimately, the study presents a clear, pragmatic path for the future of urban wastewater management. During normal operations, cities can rely on high-precision global models like TFT to manage sewer systems proactively. In moments of crisis, when parts of the sensor network fail or communication is disrupted, resilient local models like N-HiTS and TFT can step in to maintain continuity and protect urban environments from harmful overflows. The researchers’ open-source implementations make it easier for other cities and utilities to adapt and build upon their work, providing a flexible foundation for smarter, more resilient infrastructure.
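
In code, that blueprint reduces to a simple routing rule. The function and model interfaces below are hypothetical and merely illustrate the fallback logic, not the authors' implementation.

```python
# Hypothetical deployment logic: global model when the network is healthy,
# local per-site model when system-wide data is unavailable.
from typing import Mapping, Optional, Sequence

def forecast_overflow(
    local_history: Sequence[float],
    network_data: Optional[Mapping[str, Sequence[float]]],
    global_model,
    local_model,
) -> Sequence[float]:
    """Route the forecast to the global or local model by data availability."""
    if network_data is not None and all(len(v) > 0 for v in network_data.values()):
        # Normal operation: system-wide sensors plus weather forecasts available.
        return global_model.predict(local_history, network_data)
    # Degraded operation: sensor or communication failure at the network level;
    # fall back to the lightweight model that needs only this site's data.
    return local_model.predict(local_history)
```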

Looking ahead, the team advocates replicating the study across different cities to validate its generalizability. They suggest further research into model compression and quantization to enable more efficient deployment on IoT hardware. Moreover, they highlight the importance of exploring grouped sensor failures to understand cascading effects within networked systems. Most ambitiously, they call for head-to-head comparisons between machine learning models and physics-based hydraulic models, aiming to bridge the gap between engineering tradition and digital innovation, a critical step toward building sustainable, future-ready urban ecosystems.

  • FIRST PUBLISHED IN: Devdiscourse