IoT and ensemble AI transform congestion prediction in urban bus networks
For millions of commuters worldwide, public buses remain the backbone of daily mobility. However, inconsistent travel times and unpredictable congestion continue to undermine service reliability, particularly in cities where bus rapid transit systems operate alongside mixed traffic.
The research paper "Ensemble Machine Learning Approach for Traffic Congestion and Travel Time Prediction in Urban Bus Rapid Transit Systems: A Case Study of Trans Metro Bandung," published in IoT, demonstrates how ensemble machine learning integrated with onboard GPS tracking can significantly enhance congestion classification and travel time forecasting.
High-accuracy congestion detection using ensemble models
To classify traffic conditions, the researchers evaluated Decision Tree and Random Forest classifiers. Using an 80:20 train-test split and hyperparameter tuning, the models were optimized for performance while maintaining interpretability.
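As a rough illustration of the 80:20 split mentioned above, the sketch below shuffles a small set of records with a fixed seed and holds out the last fifth for testing. The function name, the seed, and the sample records are assumptions made for this example; the article does not describe how the researchers implemented their split or tuning.

```python
import random


def train_test_split_80_20(records, seed=42):
    """Shuffle a copy of `records` and split it 80:20 into train and test lists."""
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Hypothetical GPS-derived samples: (average speed in km/h, congestion label).
samples = [
    (8.0, "congested"), (12.5, "congested"), (9.5, "congested"),
    (15.0, "moderate"), (18.0, "moderate"), (22.0, "moderate"),
    (25.5, "moderate"), (31.0, "smooth"), (35.0, "smooth"), (40.0, "smooth"),
]

train, test = train_test_split_80_20(samples)
print(len(train), len(test))
```

Holding the test portion out of every tuning step is what makes the reported testing accuracy a fair estimate of performance on unseen trips.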
The Decision Tree classifier achieved a testing accuracy of 96.8 percent, with strong precision, recall, and F1 scores. Random Forest performed nearly as well, achieving 96.7 percent testing accuracy. Although both models showed robust results, the Decision Tree was selected as the preferred congestion classifier due to its slightly higher accuracy and ease of interpretation.
The classification task categorized traffic into congested, moderate, and smooth conditions. High true positive rates across categories demonstrated reliable differentiation between varying congestion levels. For transit authorities, such accuracy enables proactive scheduling adjustments and route management decisions.
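The metrics named above can be computed per class once predictions are available. The following is a minimal sketch, assuming the three labels from the article and a handful of made-up predictions; it is not the study's evaluation code, and the numbers it produces have no relation to the reported results. Recall for a class is the same quantity as its true positive rate.

```python
from collections import Counter

LABELS = ("congested", "moderate", "smooth")


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall, f1)} for a multi-class prediction."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    metrics = {}
    for label in labels:
        tp = pairs[(label, label)]
        predicted = sum(n for (_, p), n in pairs.items() if p == label)
        actual = sum(n for (t, _), n in pairs.items() if t == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0  # true positive rate
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = (precision, recall, f1)
    return metrics


# Made-up labels for illustration only; these are not the study's data.
y_true = ["congested", "congested", "moderate", "moderate",
          "smooth", "smooth", "smooth", "moderate"]
y_pred = ["congested", "moderate", "moderate", "moderate",
          "smooth", "smooth", "moderate", "moderate"]

print("accuracy:", accuracy(y_true, y_pred))
for label, (p, r, f1) in per_class_metrics(y_true, y_pred).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```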
Tree-based models, including ensembles such as Random Forest, often outperform more complex deep learning architectures when applied to structured, tabular IoT data. While neural networks excel with images and other unstructured inputs, tree-based models tend to be more efficient and easier to interpret on GPS-based datasets.
Travel time forecasting: Random Forest leads in accuracy
The second predictive task focused on estimating travel time between departure and arrival points, a metric that directly affects rider trust and operational planning. The researchers compared Random Forest Regressor and XGBoost Regressor models under multiple configurations to ensure reproducibility.
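To make the prediction target concrete, here is one way a travel time value could be derived from timestamped GPS fixes between a departure and an arrival point. This is a sketch under stated assumptions: the record layout, the function names, and the coordinates are hypothetical, and the article does not describe how the researchers built their features. Distance uses the standard haversine formula with a mean Earth radius of 6371 km.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_summary(fixes):
    """Summarize a trip from time-ordered (iso_timestamp, lat, lon) fixes."""
    start = datetime.fromisoformat(fixes[0][0])
    end = datetime.fromisoformat(fixes[-1][0])
    minutes = (end - start).total_seconds() / 60
    # Sum the distance of each consecutive pair of fixes along the route.
    distance = sum(haversine_km(a[1], a[2], b[1], b[2])
                   for a, b in zip(fixes, fixes[1:]))
    speed = distance / (minutes / 60) if minutes > 0 else 0.0
    return {"travel_time_min": minutes,
            "distance_km": distance,
            "avg_speed_kmh": speed}


# Hypothetical fixes for one trip; the coordinates are illustrative only.
fixes = [
    ("2024-01-01T07:00:00", -6.9000, 107.6000),
    ("2024-01-01T07:10:00", -6.9100, 107.6200),
    ("2024-01-01T07:20:00", -6.9200, 107.6400),
    ("2024-01-01T07:30:00", -6.9300, 107.6600),
]
print(trip_summary(fixes))
```

The observed travel time would serve as the regression target, while values such as distance and average speed are examples of inputs a model could learn from.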
Random Forest demonstrated stronger generalization performance. Testing accuracy reached over 90 percent under certain configurations, outperforming XGBoost, which ranged between 86 and 87 percent. Error metrics further highlighted Random Forest’s advantage, with a lower root mean squared error of 5.80 minutes compared to XGBoost’s 6.58 minutes.
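Root mean squared error is the square root of the average squared difference between predicted and observed values, so it is expressed in the same unit as the target, here minutes. The sketch below shows the calculation on invented numbers; the two prediction lists are placeholders and do not reproduce either model's output from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative travel times in minutes; not taken from the study.
observed = [30.0, 42.0, 25.0, 55.0]
model_a = [33.0, 38.0, 27.0, 50.0]  # hypothetical predictions
model_b = [36.0, 47.0, 20.0, 49.0]  # hypothetical predictions

print(f"model A RMSE: {rmse(observed, model_a):.2f} min")
print(f"model B RMSE: {rmse(observed, model_b):.2f} min")
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones, which suits a setting where badly wrong arrival estimates are the most damaging.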
When predictions were compared against actual observed travel durations, Random Forest consistently produced closer approximations. Across multiple validation cases, its predictions aligned more reliably with real-world travel times, making it the preferred regression model for deployment.
The study also benchmarked its results against other algorithms reported in transportation research, including LSTM, ARIMA, SVM, and MLP models. In comparison, the Random Forest model trained on structured GPS data achieved lower error rates, suggesting that simpler ensemble approaches can outperform deep learning alternatives in certain operational contexts.
Low-cost deployment for smart transit systems
The trained models were implemented directly on Raspberry Pi 3B single-board computers installed onboard buses. This edge computing approach enables real-time prediction without depending on a constant cloud connection.
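One reason tree-based models suit small onboard devices is that inference needs only a few comparisons per tree and no network access. The sketch below is a hypothetical illustration of that idea: a tiny regression forest stored as JSON and evaluated locally by averaging the outputs of its trees. The model format, feature names, and values are all assumptions; the article does not say how the researchers exported or ran their models on the device.

```python
import json


def predict_tree(node, features):
    """Walk one tree (nested dicts) until a leaf value is reached."""
    while "value" not in node:
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["value"]


def predict_forest(trees, features):
    """Average the per-tree outputs, as a regression forest does."""
    return sum(predict_tree(tree, features) for tree in trees) / len(trees)


# Hypothetical exported model: two tiny trees predicting travel time in minutes.
# On a device this text would be read from a local file rather than fetched.
MODEL_JSON = """
[
  {"feature": "avg_speed_kmh", "threshold": 20.0,
   "left": {"value": 45.0},
   "right": {"value": 30.0}},
  {"feature": "hour", "threshold": 9,
   "left": {"feature": "avg_speed_kmh", "threshold": 15.0,
            "left": {"value": 50.0},
            "right": {"value": 36.0}},
   "right": {"value": 28.0}}
]
"""

trees = json.loads(MODEL_JSON)
reading = {"avg_speed_kmh": 12.0, "hour": 8}  # made-up onboard reading
print(predict_forest(trees, reading))
```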
Such a design is particularly relevant for cities with limited digital infrastructure budgets. By combining inexpensive hardware with robust machine learning models, transit systems can upgrade predictive capabilities without major capital investment.
Although the case study centers on Bandung’s Trans Metro network, the authors emphasize that the framework is transferable. Urban corridors with fluctuating congestion patterns, mixed traffic conditions, and limited dedicated bus lanes can benefit from similar IoT-based predictive systems.
The researchers acknowledge limitations, including the absence of weather data, incident detection feeds, and expanded corridor coverage. Future improvements could integrate rainfall data, accident reports, or hybrid model architectures to further enhance accuracy. However, even in its current form, the system demonstrates that meaningful performance gains are achievable with modest technological inputs.
First published in: Devdiscourse

