Explainable AI offers cities new tool to target GBV prevention

CO-EDP, VisionRI | Updated: 26-12-2025 09:18 IST | Created: 26-12-2025 09:18 IST

Urban gender-based violence (GBV) is not randomly distributed across city space, and new research shows that advanced geospatial machine learning can identify where risks concentrate with striking accuracy. By combining open urban data with explainable AI models, researchers demonstrate that everyday factors such as traffic intensity, socioeconomic conditions, and nightlife density play a far greater role in shaping gender-based violence risk than the presence of police stations or hospitals. 

The study, titled "Developing a Predictive Model for Gender-Based Violence in Urban Areas Using Open Data" and published in the journal Geomatics, applies geospatial machine learning to the city of Valencia, Spain. Using only open data sources, the authors build a predictive model that identifies high-risk urban areas with a high degree of precision.

Turning open urban data into predictive insight

GBV in urban contexts is shaped by a complex mix of social, economic, and environmental factors. Traditional statistical approaches often fail to capture this complexity, relying on linear relationships that struggle to reflect how multiple variables interact across space. The study addresses this limitation by applying machine learning techniques specifically designed to model nonlinear patterns in large, heterogeneous datasets.

The researchers focused on Valencia, a dense Mediterranean city with clear socioeconomic contrasts across neighborhoods. They assembled a large dataset of emergency call records related to gender-based violence, domestic violence, assault, and sexual crimes spanning several years. After careful anonymization and georeferencing, more than 42,000 incidents were mapped across the urban area.

To ensure comparability between crime data and urban indicators, the team divided the entire city into a uniform grid with cells measuring 25 meters on each side. This fine-grained spatial framework allowed diverse datasets, including income levels, unemployment rates, education levels, real estate prices, traffic intensity, nightlife venues, green spaces, and access to services, to be analyzed within the same spatial unit.
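The gridding step described above can be sketched in a few lines. This is a minimal illustration, not the study's code: it assumes incident locations have already been projected to metre-based coordinates, and the function names, grid origin, and sample points are hypothetical. Only the 25-meter cell size comes from the article.

```python
import math
from collections import Counter

CELL_SIZE_M = 25.0  # cell edge length reported in the article


def cell_index(x_m, y_m, origin_x=0.0, origin_y=0.0, cell_size=CELL_SIZE_M):
    """Return the (column, row) of the grid cell containing a projected point."""
    col = math.floor((x_m - origin_x) / cell_size)
    row = math.floor((y_m - origin_y) / cell_size)
    return col, row


def count_per_cell(points, **grid):
    """Count how many points fall inside each grid cell."""
    return Counter(cell_index(x, y, **grid) for x, y in points)


# Hypothetical incident coordinates in metres (made up, not real data).
incidents = [(3.0, 4.0), (24.9, 24.9), (25.0, 0.0), (60.0, 110.0)]
counts = count_per_cell(incidents)
```

Once every dataset is keyed by the same (column, row) pair, indicators such as income or traffic intensity can be joined to incident counts cell by cell.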

This approach marks a significant departure from neighborhood-level analysis, which often masks street-level variation. By working at a scale closer to how people actually experience urban space, the model captures subtle but meaningful differences in risk between adjacent areas.

Three modeling approaches were tested. A traditional linear regression model was used as a baseline, while two tree-based machine learning models, Decision Trees and Random Forests, were applied to capture complex interactions between variables. The results showed a clear hierarchy in performance. Linear regression explained less than half of the observed spatial variation, confirming its limitations for this type of phenomenon. In contrast, tree-based models performed far better, with Random Forest delivering the strongest and most stable predictions.
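A toy example helps show why a linear baseline can lag behind tree-based models when the relationship is nonlinear. The sketch below is an assumption-laden illustration, not the study's pipeline: the data are synthetic, there is a single input variable, and one small regression tree stands in for the tree-based family (no Random Forest is built). All function names are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def sse(ys):
    """Sum of squared deviations from the mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def fit_tree(xs, ys, max_depth):
    """Greedy regression tree on one feature; leaves predict the mean."""
    mean_y = sum(ys) / len(ys)
    if max_depth == 0 or len(set(ys)) == 1:
        return lambda x: mean_y
    best = None
    # Skipping the smallest value keeps both sides of every split non-empty.
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t)
    if best is None:
        return lambda x: mean_y
    t = best[1]
    lo = [(x, y) for x, y in zip(xs, ys) if x < t]
    hi = [(x, y) for x, y in zip(xs, ys) if x >= t]
    left_model = fit_tree([x for x, _ in lo], [y for _, y in lo], max_depth - 1)
    right_model = fit_tree([x for x, _ in hi], [y for _, y in hi], max_depth - 1)
    return lambda x: left_model(x) if x < t else right_model(x)


# Synthetic data: the outcome is elevated only in a middle band of x.
xs = [float(x) for x in range(20)]
ys = [10.0 if 7 <= x < 13 else 1.0 for x in xs]

line = fit_line(xs, ys)
tree = fit_tree(xs, ys, max_depth=2)
r2_linear = r_squared(ys, [line(x) for x in xs])
r2_tree = r_squared(ys, [tree(x) for x in xs])
```

On this made-up data the straight line explains almost none of the variation, while the depth-2 tree recovers the band. The scores are computed on the training points only, so they illustrate model flexibility rather than out-of-sample accuracy.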

The Random Forest model was able to explain nearly all observed spatial variation in gender-based violence risk, accurately reproducing known hotspots while also identifying areas of elevated risk that do not always appear in raw crime counts. This ability to generalize beyond observed incidents is central to the study’s contribution, as it enables proactive planning rather than reactive policing.

What drives urban gender-based violence risk

The research also places strong emphasis on interpretability. Predictive accuracy alone is not sufficient for public policy applications, particularly in sensitive areas such as gender-based violence. To address this, the researchers applied explainable AI techniques that quantify how much each variable contributes to the model's predictions and in which direction.
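The article does not name the specific explainability method used, so the sketch below swaps in permutation importance, one common way to measure how much a model relies on each variable. It is a simplified illustration under stated assumptions: the "model" is a hand-written stand-in with made-up weights, the data are random, and the feature names are hypothetical. Unlike the techniques described in the study, this method reports only the size of each variable's contribution, not its direction.

```python
import random

FEATURES = ["traffic", "unemployment", "police_distance"]


def toy_model(row):
    """Hypothetical stand-in for a trained model; the weights are invented."""
    return (3.0 * row["traffic"]
            + 1.0 * row["unemployment"]
            + 0.0 * row["police_distance"])


def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature, rng, repeats=5):
    """Average increase in error after shuffling one feature's values."""
    baseline = mean_squared_error(model, rows, targets)
    increases = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        # Copy each row with only the chosen feature replaced.
        permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
        increases.append(mean_squared_error(model, permuted, targets) - baseline)
    return sum(increases) / repeats


rng = random.Random(0)  # fixed seed so the run is repeatable
rows = [{name: rng.random() for name in FEATURES} for _ in range(200)]
targets = [toy_model(r) for r in rows]  # synthetic targets, not real data

importance = {
    f: permutation_importance(toy_model, rows, targets, f, rng) for f in FEATURES
}
ranking = sorted(FEATURES, key=importance.get, reverse=True)
```

A variable the model ignores, such as `police_distance` in this toy setup, scores zero because shuffling it cannot change any prediction. The ranking here reflects only the invented weights and should not be read as the study's results.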

The results challenge several common assumptions. Traffic intensity emerged as the most influential predictor by a wide margin. Areas with higher traffic volumes consistently showed higher predicted risk, suggesting that mobility patterns and transient populations play a major role in shaping where violence occurs. Busy streets, transport corridors, and highly accessible areas may create conditions that increase exposure and opportunity for violence.

Socioeconomic variables formed the next tier of importance. Higher unemployment rates and lower average monthly income were strongly associated with increased risk, while higher income levels showed a protective effect. Educational attainment also mattered, with areas characterized by lower levels of education showing higher predicted risk. These findings reinforce the link between structural inequality and gender-based violence, placing socioeconomic vulnerability at the center of spatial risk patterns.

Nightlife density also emerged as a significant contributor. Proximity to pubs and clubs increased predicted risk, reflecting the interaction between alcohol consumption, nighttime activity, and social environments where harassment and assault are more likely to occur. Importantly, nightlife did not act in isolation. Its influence was strongest in areas where it intersected with high traffic intensity and socioeconomic vulnerability.

On the other hand, several variables often assumed to enhance safety showed little to no predictive power. Proximity to police stations, hospitals, 24-hour medical services, and green areas had minimal influence on predicted risk once other factors were taken into account. This does not imply that such institutions are unimportant, but rather that their presence alone does not determine where gender-based violence concentrates.

Real estate prices played a moderate role, with higher property values associated with lower predicted risk. This finding aligns with broader patterns linking economic advantage to reduced exposure to certain forms of urban violence, though the authors caution against interpreting this as a simple causal relationship.

The results suggest that gender-based violence risk is shaped less by formal institutional coverage and more by everyday urban conditions, mobility flows, and social inequality. This perspective reframes prevention as a spatial and structural challenge rather than solely a policing issue.

Implications for urban safety policy and prevention

By demonstrating that risk can be predicted with high accuracy using open data and explainable models, the research opens the door to more targeted, evidence-based interventions.

Rather than distributing resources evenly or responding only after incidents occur, cities could use predictive risk maps to focus prevention efforts where they are most needed. These could include targeted lighting improvements, safer transport planning, urban design changes, outreach services, and situational prevention measures in high-risk streets and districts.

The emphasis on traffic intensity and mobility highlights the importance of integrating transport and urban planning into violence prevention strategies. Managing crowd flows, improving surveillance in high-traffic areas, and designing safer pedestrian environments could have a direct impact on reducing risk.

The strong role of socioeconomic factors underscores the limits of purely spatial or technical solutions. While predictive models can identify where risk concentrates, addressing why it does so requires broader social and economic policies. Employment opportunities, education access, and community support remain central to long-term prevention.

The study also illustrates the value of explainable AI in public policy contexts. By clearly showing which factors drive predictions, the model avoids the black-box problem that often undermines trust in AI systems. Decision-makers can see not only where risk is high, but also which conditions contribute most strongly in each area, supporting tailored interventions rather than one-size-fits-all responses.

At the same time, the authors are careful to outline the study’s limitations. The model relies on police and emergency call data, which underrepresent many forms of harassment and violence that go unreported. As a result, predicted risk reflects recorded incidents rather than the full lived experience of urban safety. The research also focuses on static spatial variables and does not yet incorporate temporal dynamics such as time of day, seasonality, or special events, which are known to influence violence patterns.

Future work, the authors suggest, should integrate participatory data sources, such as citizen reports and crowd-sourced safety mapping, to complement official records. Incorporating temporal patterns and street-level analysis could further refine predictions and support more precise interventions.

Despite these limitations, the study demonstrates that geospatial machine learning has reached a level of maturity that makes it a viable tool for urban safety planning. By grounding predictions in open data and prioritizing interpretability, the approach balances technical rigor with ethical and practical considerations.

First published in: Devdiscourse