AI is distorting history through biased heritage datasets

CO-EDP, VisionRI | Updated: 01-05-2025 17:11 IST | Created: 01-05-2025 17:11 IST

As artificial intelligence increasingly transforms how museums, archives, and institutions catalog cultural artifacts, a new study uncovers the hidden cycle of bias that is distorting digital heritage collections. The research, “Tracing the Bias Loop: AI, Cultural Heritage and Bias-Mitigating in Practice”, published in AI & Society, presents a comprehensive analysis of how bias infiltrates artificial intelligence systems used in the curation and classification of cultural heritage data. From algorithmic design to dataset selection, the study shows that bias is not a glitch but an embedded structural loop, requiring both humanistic and technical interventions to dismantle.

How does bias enter the AI pipeline in cultural heritage collections?

The study maps out a bias loop composed of three interconnected stages: data-to-algorithm, algorithm-to-user, and user-to-data. It illustrates that bias originates during the earliest phases of dataset creation, where choices about what to include or exclude already reflect entrenched historical and social inequalities. For example, museum records often over-represent dominant cultural narratives while under-representing marginalized voices such as Indigenous, LGBTQIA+, or non-Western communities. These imbalances are compounded when machine learning models are trained on such data without awareness or mitigation strategies.

Image classification algorithms used in the cultural sector inherit and often amplify these distortions. A seemingly neutral model trained on gendered images may misclassify men as women simply because it associates long hair or jewelry with femininity. Digitization practices themselves introduce bias through selective preservation: museums may digitize popular artifacts while ignoring less prominent yet culturally significant items. Once the AI model is deployed, these patterns are not only maintained but potentially worsened, reinforcing existing societal narratives through algorithmic outputs.

The researchers argue that while AI is often presented as a tool for democratizing heritage access, the opposite may occur when systems are built upon uncritical datasets. Even as Europe promotes AI in cultural heritage through policies and funding frameworks, the study critiques the lack of attention to how AI may reinforce exclusion, especially when datasets originate from colonial or biased archives.

What are the technical and non-technical methods to mitigate bias?

To address the structural nature of AI bias, the study outlines both technical solutions and collaborative design principles. Among the technical methods, three are especially effective: data augmentation, adversarial debiasing, and active monitoring plans.

Data augmentation involves modifying existing images through techniques such as color jittering, flipping, and noise injection to balance underrepresented groups within a dataset. This helps create more equitable models by reducing overfitting to dominant categories. In tests using two datasets, the Architectural Heritage Elements dataset and the Swedish World Culture Museum’s photographic archive, noise injection significantly improved model performance and reduced classification bias. Color jittering and flipping also yielded strong results, confirming their effectiveness as standard debiasing strategies.
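To make the approach concrete, the sketch below shows one way such augmentations might be combined in a standard PyTorch/torchvision pipeline. The parameter values, noise level, and dataset path are illustrative assumptions for demonstration, not settings reported in the study.

```python
# A minimal sketch of the augmentation strategies described above:
# color jittering, flipping, and noise injection. All numeric settings
# and paths are assumptions, not values from the paper.
import torch
from torchvision import transforms

def add_gaussian_noise(img_tensor, std=0.05):
    """Inject Gaussian noise into a [0, 1] image tensor (noise injection)."""
    return (img_tensor + torch.randn_like(img_tensor) * std).clamp(0.0, 1.0)

augment = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(p=0.5),               # flipping
    transforms.ColorJitter(brightness=0.2, contrast=0.2,  # color jittering
                           saturation=0.2, hue=0.05),
    transforms.ToTensor(),
    transforms.Lambda(add_gaussian_noise),                # noise injection
])

# Applied selectively (e.g. by oversampling under-represented classes with
# these transforms), augmented copies can help rebalance a skewed collection.
# `heritage_images/` is a placeholder path for an ImageFolder-style dataset:
# from torchvision.datasets import ImageFolder
# dataset = ImageFolder("heritage_images/", transform=augment)
```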

Adversarial debiasing involves training two models simultaneously: one for classification and one that attempts to predict a sensitive attribute, such as gender or ethnicity. If the second model succeeds, the system is retrained until it cannot accurately detect the biased attribute, signaling a reduced dependency on that attribute for classification. While common in industry and social AI applications, this method is still rarely applied in cultural heritage contexts, though the study demonstrates its promising potential for correcting embedded bias in visual collections.
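As a rough illustration of the mechanism, the sketch below pairs a task classifier with an adversary that tries to recover a sensitive attribute from the shared features; the main model is then penalized whenever the adversary succeeds, pushing the features toward a point where the attribute can no longer be read off. The network sizes, loss weighting, and attribute labels are assumptions for demonstration, not details taken from the paper.

```python
# A simplified adversarial debiasing loop: the classifier learns the main
# task while an adversary tries to predict a sensitive attribute from the
# same features. All architecture and hyperparameter choices are illustrative.
import torch
import torch.nn as nn

feature_extractor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 256), nn.ReLU())
classifier = nn.Linear(256, 10)   # main task, e.g. artifact category
adversary = nn.Linear(256, 2)     # tries to predict the sensitive attribute

task_loss_fn = nn.CrossEntropyLoss()
adv_loss_fn = nn.CrossEntropyLoss()
opt_main = torch.optim.Adam(
    list(feature_extractor.parameters()) + list(classifier.parameters()), lr=1e-4)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-4)
lambda_adv = 1.0                  # how strongly attribute leakage is penalized

def training_step(images, labels, sensitive_attr):
    feats = feature_extractor(images)

    # 1) Train the adversary to detect the sensitive attribute from features.
    adv_loss = adv_loss_fn(adversary(feats.detach()), sensitive_attr)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # 2) Train the main model: perform the task while fooling the adversary,
    #    i.e. reward features from which the attribute cannot be recovered.
    task_loss = task_loss_fn(classifier(feats), labels)
    leakage = adv_loss_fn(adversary(feats), sensitive_attr)
    main_loss = task_loss - lambda_adv * leakage
    opt_main.zero_grad(); main_loss.backward(); opt_main.step()
    return task_loss.item(), adv_loss.item()
```

Training alternates between the two updates until the adversary's accuracy on the sensitive attribute drops toward chance, which is the signal that the classifier no longer relies on it.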

Equally critical are non-technical interventions. The authors advocate for hybrid teams of subject matter experts, data scientists, curators, and social scientists who can collaboratively flag skewed annotations, culturally insensitive descriptors, or gaps in representation during dataset construction. Human monitoring should continue post-deployment, with ongoing evaluations to ensure AI models do not drift toward biased outputs as datasets evolve. Participatory methodologies, like those used in the DE-BIAS project, involve affected communities in curating and auditing metadata, offering a model for inclusive digital stewardship.

The paper highlights that many cultural institutions lack these safeguards. New technologies are often outsourced, with minimal involvement from internal staff or affected communities. Once technical support ends, these systems become outdated and underutilized. Embedding bias mitigation into both the design and deployment phases is necessary to maintain ethical integrity and cultural relevance.

What role should interdisciplinary collaboration play in ethical AI for heritage?

The authors argue that solving the bias loop requires more than better algorithms; it demands rethinking the relationships between heritage professionals, technologists, and the public. Bias is not merely a computational flaw but a societal echo, carried into digital systems through historical structures of knowledge and exclusion. Addressing it requires interdisciplinary teams where responsibilities are role-based, clearly distributed, and mutually accountable.

The study emphasizes that many museums and heritage organizations operate under tight budgets and bureaucratic constraints, particularly in low- and middle-income countries. Therefore, scalable bias mitigation strategies, like open-source augmentation libraries and training modules for curators, must be prioritized. The role of policymakers is also highlighted, especially in enforcing guidelines that go beyond access and digitization to include diversity, equity, and cultural sensitivity.

Cultural heritage datasets are not ahistorical collections of facts; they are shaped by geopolitical forces, colonial histories, and curatorial decisions made over centuries. Applying AI to these archives without interrogating their origins risks producing models that appear accurate but reflect a distorted view of culture. From the mislabeling of historical gender representations to the underrepresentation of non-European artifacts, bias takes many forms that cannot be corrected through code alone.

FIRST PUBLISHED IN: Devdiscourse