Self-driving AI still struggles when traffic, fog and night conditions combine
Autonomous vehicles (AVs) can handle rain, fog and night driving in controlled simulations, but their performance weakens sharply when multiple urban risks converge, according to a new study published in Electronics.
The study, titled "Artificial Intelligence for Autonomous Vehicles: Robustness Analysis in Complex Urban Traffic Scenarios," built and tested a modular autonomous-driving system in the CARLA Town10HD simulator, combining CNN-based perception, A* global path planning, and two control strategies: a classical PID controller and a Deep Q-Network reinforcement learning agent with adaptive steering assistance.
The findings show that AVs can appear highly capable under isolated disturbances but remain vulnerable when visibility, weather, traffic and control demands interact. The study reports strong perception results in simulation, with classification scores close to 0.99 for traffic signs, pedestrians, lanes and scene geometry. However, the authors caution that these results come from synthetic CARLA data and should not be treated as proof of real-world performance.
AI perception performs well in simulation, but real-world transfer remains unproven
The research team developed a modular autonomous-driving pipeline built around perception, planning and control. The system used simulated RGB, semantic segmentation and depth sensors in CARLA, a high-fidelity autonomous-driving simulator widely used for repeatable testing. The vehicle operated in Town10HD, a complex urban environment with intersections, pedestrian zones, multi-lane roads, traffic lights and varied signage.
For perception, the researchers used ResNet-18 convolutional neural networks to classify traffic signs, pedestrians and lanes, and to identify scene geometry such as curves, intersections and straight roads. Classical image-processing methods were also added for lane-boundary detection, using edge and line extraction to support lateral-control calculations.
The perception models performed strongly within the simulation environment. The object classifier reached 99 percent accuracy on a balanced test subset of 1,970 synthetic samples. The scene classifier also reached 99 percent accuracy across curve, intersection and straight-road categories. These numbers indicate that the system learned the simulated visual environment effectively.
The study repeatedly stresses that these results must be interpreted with caution. Both training and testing relied on CARLA-generated synthetic imagery, meaning the system never faced real-world domain shift. Real driving images include sensor noise, lens distortion, motion blur, changing road surfaces, unpredictable lighting, dirt and weather artifacts, and real road users behave differently from simulated ones. The paper therefore does not claim real-world generalization.
Autonomous-driving research often faces a gap between simulation and deployment. A vehicle that performs well in a simulator may fail when exposed to real cameras, irregular road markings, unpredictable pedestrians or sensor calibration problems. The authors identify simulation-to-reality transfer as a major future challenge, not a solved problem.
The study did include a preliminary domain-randomization experiment, varying lighting, weather, camera noise and textures during training. Under combined disturbances, this improved route efficiency and reduced lateral error and collisions compared with a fixed-condition baseline. Even so, the research remains simulation-based and does not validate the system on physical vehicles.
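To make the idea of domain randomization concrete, the sketch below draws a fresh set of lighting, weather, camera-noise and texture settings for each training episode. It is an illustration only: the class name, field names and value ranges are assumptions chosen for the example, not parameters reported by the study.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeConditions:
    """One randomized training condition (all fields are illustrative)."""
    sun_altitude_deg: float   # lighting: below 0 approximates night
    rain_intensity: float     # weather, on an assumed 0-100 scale
    fog_density: float        # weather, on an assumed 0-100 scale
    camera_noise_std: float   # std. dev. of Gaussian pixel noise
    texture_id: int           # index into a hypothetical texture set


def sample_conditions(rng: random.Random) -> EpisodeConditions:
    """Draw independent values for every randomized factor."""
    return EpisodeConditions(
        sun_altitude_deg=rng.uniform(-10.0, 90.0),
        rain_intensity=rng.uniform(0.0, 100.0),
        fog_density=rng.uniform(0.0, 100.0),
        camera_noise_std=rng.uniform(0.0, 0.05),
        texture_id=rng.randrange(8),
    )


# A seeded generator keeps the sampled conditions reproducible.
rng = random.Random(0)
episodes = [sample_conditions(rng) for _ in range(5)]
```

A fixed-condition baseline, by contrast, would reuse a single `EpisodeConditions` value for every episode.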
PID stays smoother while reinforcement learning moves faster but less steadily
The researchers tested a classical PID controller against a DQN reinforcement-learning agent under the same routes, initial conditions and environmental disturbances.
The PID controller represented a traditional control method valued for stability and interpretability. It used proportional, integral and derivative terms to regulate speed and steering. The DQN agent represented a learning-based approach, using reinforcement learning to select driving actions. It was supported by an adaptive geometric steering-assistance module designed to stabilize sharp turns and high-interaction moments.
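For readers unfamiliar with PID control, the following minimal sketch shows how the proportional, integral and derivative terms combine into one command. The gains and the toy vehicle model (acceleration equal to the controller output) are assumptions made for illustration and do not come from the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    kp: float  # proportional gain: reacts to the current error
    ki: float  # integral gain: reacts to accumulated past error
    kd: float  # derivative gain: reacts to the error's rate of change
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float, dt: float) -> float:
        """Return the control output for one time step of length dt."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history on the first step
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_speed(target: float, seconds: float = 40.0, dt: float = 0.05) -> float:
    """Drive a toy vehicle toward a target speed; gains are illustrative."""
    pid = PID(kp=2.0, ki=0.5, kd=0.05)
    speed = 0.0
    for _ in range(int(seconds / dt)):
        accel = pid.step(target - speed, dt)  # toy model: output is acceleration
        speed += accel * dt
    return speed


final_speed = simulate_speed(10.0)
```

The same structure applies to steering, with lateral error in place of speed error. Because every term is an explicit function of the error, the behavior is easy to inspect, which is part of why such controllers are regarded as interpretable.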
The comparison exposed a clear trade-off. The PID controller delivered smoother, more consistent trajectories. It showed lower steering oscillations, steadier acceleration behavior and more reliable lane-keeping. The DQN agent completed routes faster but produced more abrupt control actions, higher directional variability and less consistent driving behavior.
The study's averaged control results underline the gap. The PID controller achieved a higher reward score than the DQN agent, reflecting more stable navigation. The DQN achieved shorter traversal times, but the authors note that this speed came at the cost of smoothness and stability. In the direct comparison, both controllers completed the evaluated episodes without collisions, yet their driving quality differed sharply.
Statistical tests supported those differences. The study reports significant differences in steering smoothness and lateral error between PID and DQN, indicating that the DQN's higher variability was not random run-to-run noise but a structural difference in control behavior.
The results do not dismiss reinforcement learning. Instead, they show that learning-based control remains difficult to stabilize in urban-driving settings when the state representation is simplified and the reward function does not fully capture environmental complexity. The DQN agent in this study used a low-dimensional state vector with position, yaw and speed, along with four discrete actions. This made the comparison with PID more controlled and interpretable, but it also limited the agent's ability to understand richer driving context.
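A rough sketch of what such a setup looks like in code appears below: a four-number state, four discrete actions, epsilon-greedy action selection and the standard Q-learning target. The action labels, the linear stand-in for the Q-network and all numeric values are assumptions for illustration. The article does not specify the study's actions or network.

```python
import random
from typing import List, Sequence

# Hypothetical labels: the study reports four discrete actions but this
# article does not name them.
ACTIONS = ("throttle", "brake", "steer_left", "steer_right")


def q_values(state: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Linear stand-in for a Q-network: one weight row per action."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def select_action(state, weights, epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy: explore with probability epsilon, else take the best action."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    q = q_values(state, weights)
    return max(range(len(q)), key=q.__getitem__)


def td_target(reward: float, next_state, weights, gamma: float = 0.99,
              done: bool = False) -> float:
    """Q-learning target: reward plus discounted best next-state value."""
    if done:
        return reward
    return reward + gamma * max(q_values(next_state, weights))


# State vector assumed to be (x, y, yaw, speed).
state = (1.0, 2.0, 0.1, 5.0)
weights = [
    [0.0, 0.0, 0.0, 0.1],
    [0.0, 0.0, 0.0, -0.1],
    [0.0, 0.0, 1.0, 0.2],
    [0.0, 0.0, -1.0, 0.0],
]
greedy = select_action(state, weights, epsilon=0.0, rng=random.Random(0))
```

Nothing in this state vector describes nearby vehicles, pedestrians or signals, which illustrates why a low-dimensional representation limits how much driving context the agent can use.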
The authors argue that future RL controllers may need better state representations, visual or semantic inputs, continuous action spaces and more advanced actor-critic algorithms such as PPO or SAC. In its current form, however, the learning-based controller showed potential but did not match the stability of the classical controller.
Extreme weather and dense traffic reveal the system's weak points
The researchers tested the system in heavy rain, dense fog, night driving, dense traffic and combined extreme conditions. Each scenario used 50 independent simulation runs, providing a structured comparison of how the integrated system behaved under different urban stressors.
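Per-scenario figures such as success rate, collision ticks and route efficiency can be aggregated from individual runs along the lines of the sketch below. The record fields and the four sample runs are invented placeholders, not data from the study, and the study's exact metric definitions may differ.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class RunResult:
    """Outcome of one simulation run (field names are assumptions)."""
    reached_goal: bool
    collision_ticks: int
    route_efficiency: float


def summarize(runs: List[RunResult]) -> Dict[str, float]:
    """Aggregate the runs of one scenario into summary metrics."""
    return {
        "success_rate": sum(r.reached_goal for r in runs) / len(runs),
        "collision_ticks": sum(r.collision_ticks for r in runs),
        "route_efficiency": mean(r.route_efficiency for r in runs),
    }


# Placeholder runs for illustration only; a real scenario would have 50.
runs = [
    RunResult(True, 0, 0.96),
    RunResult(False, 3, 0.40),
    RunResult(True, 0, 0.95),
    RunResult(False, 5, 0.45),
]
summary = summarize(runs)
```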
Under isolated weather disturbances, the system performed well. Heavy rain, dense fog and night driving each produced a 98 percent success rate, zero collisions and route efficiency of about 0.96. Lateral error remained in a narrow range, suggesting that the vehicle could still follow routes under individual visibility or weather challenges in simulation.
The picture was different when risks became dynamic or compounded. In dense traffic, the success rate dropped to 52 percent, route efficiency fell to 0.52, and 24 collision ticks were recorded. The scenario involved tight inter-vehicle spacing, requiring the vehicle to respond to other road users. The results point to a critical weakness: the system could follow a planned path, but struggled when multi-agent interactions demanded anticipation, negotiation and rapid adaptation.
The combined extreme-conditions scenario was also severe. It mixed heavy rain, dense fog, nighttime conditions, wind and dynamic traffic. The success rate fell to 50 percent, route efficiency dropped to 0.50, and 23 collision ticks were recorded. The system became more conservative, but that did not prevent failures under compounded uncertainty.
The study's statistical analysis confirmed that environmental complexity had a measurable effect on system performance. Route efficiency, overtravel ratio, steering smoothness and acceleration smoothness differed significantly across scenarios. Dense traffic and extreme conditions produced the largest degradation.
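The article does not say which statistical tests the authors used. As a stand-in, the sketch below shows one generic way to check whether a metric differs between two scenarios: a two-sided permutation test on the difference in means. The sample values are invented placeholders, not measurements from the study.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_p_value(a: Sequence[float], b: Sequence[float],
                        n_permutations: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (extreme + 1) / (n_permutations + 1)


# Placeholder route-efficiency samples for two hypothetical scenarios.
calm = [0.96, 0.95, 0.97, 0.96, 0.94, 0.97]
busy = [0.52, 0.50, 0.55, 0.49, 0.53, 0.51]
p_value = permutation_p_value(calm, busy)
```

A small p-value indicates that a gap this large would rarely arise from random relabeling of the runs, which is the sense in which a difference across scenarios is called significant.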
This matters because real urban driving rarely presents one clean challenge at a time. A vehicle may face rain, poor lighting, construction, pedestrians, glare, congestion and unpredictable drivers simultaneously. The study suggests that testing systems only under isolated conditions may overstate their readiness.
The authors also acknowledge important limits. The perception module was not separately evaluated under each adverse scenario, so the study cannot isolate exactly how much each weather condition degraded visual classification. The work also did not include direct ablation experiments to measure the individual contribution of each component. It did not test real hardware, real roads or real-world datasets. Therefore, the findings should be read as a controlled simulation benchmark, not deployment evidence.
First published in: Devdiscourse