AI failures stem from human misuse: It’s not the machine’s fault

CO-EDP, VisionRI | Updated: 21-04-2025 09:06 IST | Created: 21-04-2025 09:06 IST

Artificial intelligence failures are being misunderstood as technical faults, when many are actually the consequence of misusing computational machines for problems they are not built to solve, says a provocative new study published in AI & Society. Titled "Not the machine’s fault: taxonomising AI failure as computational (mis)use," the research reframes common narratives around AI gone wrong by introducing a taxonomy that reassigns blame from faulty machines to flawed human purposes and expectations.

Drawing on decades of computing history and philosophical foundations, the paper categorizes AI failures into four types: technically sound but socially inappropriate outputs; machine-world misconfiguration; motivational failures rooted in corporate or ideological goals; and epistemic failures where computing is inappropriately applied to problems it cannot meaningfully resolve. Rather than focusing solely on data quality or algorithmic bias, the study argues for a fundamental reappraisal of how, when, and why we use AI at all.

What kinds of failures are not the machine’s fault?

The study's author, Shen, challenges the narrative that all AI failures stem from faulty coding or model breakdown. In fact, some of the most controversial incidents - like hallucinations in chatbots or offensive content generation - emerge precisely because the AI is working as designed. This category, called "technically sound failure," involves models that produce outputs in line with their architecture and training, but those outputs are socially or ethically unacceptable.

The now-infamous example of Google Gemini producing images of Black Nazis illustrates this type of failure. The system did not err in its technical operation; rather, its connectionist logic, where all inputs are treated as numerical patterns devoid of social meaning, led to a scenario where skin color and fascist symbology were conflated without any moral evaluation. The same logic underlies data leaks caused by large language models trained on sensitive corporate information: the models retained statistically probable information patterns, but lacked the contextual awareness to distinguish proprietary material from public data.

These failures, Shen argues, arise from a deeper limitation in connectionist AI. Unlike symbolic AI, which encodes rules and logic trees, connectionist models operate purely through probabilistic pattern recognition. As such, they are agnostic to meaning and normative judgment. The machine doesn't "know" what's offensive or private; it simply matches and generates according to numerical relationships. Thus, hallucinations are not flaws but features of how neural networks operate.
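
To make the contrast concrete, here is a minimal Python sketch, not drawn from the paper, with invented tokens and probabilities. It shows how a purely statistical generator picks whatever is numerically likeliest, while any notion of "off-limits" only exists if a human writes it in as an explicit, symbolic rule.

```python
import random

# Toy "connectionist" step: pick the statistically likeliest continuation.
# Nothing in this arithmetic encodes whether the result is offensive,
# private, or false. (Tokens and probabilities are made up for illustration.)
next_token_probs = {
    "uniform": 0.45,
    "officer": 0.30,
    "classified memo": 0.25,
}

def sample_next_token(probs):
    """Sample purely by probability mass -- no semantic or moral check."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# Toy "symbolic" step: an explicit, human-written rule about what is allowed.
BLOCKED_TERMS = {"classified memo"}

def symbolic_filter(token):
    """Reject outputs a human has labelled as off-limits."""
    return token not in BLOCKED_TERMS

token = sample_next_token(next_token_probs)
print(token, "allowed" if symbolic_filter(token) else "blocked")
```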

What happens when machines are placed in the wrong environments?

The second major question the author addresses is how misalignment between machine configurations and real-world environments leads to failure, not because of malfunction, but due to inappropriate fit. These "machine-world failures" occur when AI is deployed in scenarios that exceed the boundaries of its training data, design assumptions, or sensory capabilities.

High-profile cases like Tesla Autopilot’s inability to detect a white truck against a bright sky, or food delivery robots entering active crime scenes, illustrate failures of environmental anticipation. The machine does not err in terms of logic, but it lacks the representational flexibility that humans bring to ambiguous, shifting contexts. Another example is facial recognition software failing to detect people of color, a reflection of training data biases rather than inherent hardware flaws.

Shen emphasizes that these failures stem from what philosophers of computing call the "dimensionality problem." Computers can only recognize what they are trained to see, and that training often omits unspoken cultural or contextual cues. A robot vacuum photographing bathroom users is not broken; it simply lacks any concept of social boundaries. To the machine, all rooms are coordinates; to a human, some are private by social convention, not by any physical marker.
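
A hypothetical sketch (not the vacuum's actual software) illustrates the point: the machine's map is pure geometry, and "privacy" only exists for it once someone explicitly adds that dimension to the representation.

```python
# The machine's map is just coordinates and obstacle flags. "Bathroom" and
# "private" are human concepts that exist only if explicitly encoded.
machine_map = {
    (0, 0): {"obstacle": False},
    (3, 2): {"obstacle": True},
    (5, 4): {"obstacle": False},  # to a person, the bathroom; to the machine,
                                  # just another coordinate
}

def should_capture_image(cell):
    # The decision rule can only reference features the representation contains.
    # "Is this room private?" is unanswerable until it is numerically encoded.
    return not cell.get("obstacle", False) and not cell.get("private", False)

print(should_capture_image(machine_map[(5, 4)]))  # True: no notion of privacy yet

machine_map[(5, 4)]["private"] = True             # only now does the concept exist
print(should_capture_image(machine_map[(5, 4)]))  # False
```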

Such cases reveal the limits of computational epistemology: AI cannot intuit privacy, discretion, or contextual appropriateness unless those concepts are somehow numerically encoded, an impossible task in many social domains.

Why are we using machines for problems they cannot solve?

At the heart of Shen’s argument is a third and more damning critique: that AI failure often results from human decisions to apply computation where it does not belong. These “motivational” and “misapplication” failures expose how economic incentives, political ambitions, or misplaced faith in technology drive misuse.

In motivational failures, AI tools work exactly as designed but cause harm because the design itself reflects unethical or shortsighted priorities. Amazon’s driver surveillance system, which allegedly incentivized unsafe driving practices, and scheduling software that destabilized shift work at Starbucks, exemplify this pattern. The systems functioned, but their design exacerbated labor precarity and safety risks in the name of efficiency.

In misapplication failures, AI is simply the wrong tool for the job. One stark example is Hackney Council’s failed attempt to predict child abuse using algorithmic risk scoring. The model used administrative data, like housing repair requests or school attendance, as proxies for abuse risk. But these inputs lacked the granularity and context required for meaningful child protection interventions. Worse, the attempt redefined abuse as a quantifiable problem, stripping it of relational and moral complexity.
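
A toy risk-scoring sketch shows why such proxies flatten the problem. The feature names and weights below are invented for illustration and have nothing to do with Hackney Council's actual system; the point is that whatever the weights, the output is a single number built from administrative proxies, with no representation of family context, relationships, or intent.

```python
# Hypothetical proxy-based risk score (all features and weights invented).
def risk_score(household):
    weights = {
        "housing_repair_requests": 0.4,
        "school_absence_rate": 0.5,
        "benefit_claims": 0.1,
    }
    return sum(weights[k] * household.get(k, 0.0) for k in weights)

household = {
    "housing_repair_requests": 3,   # could signal a diligent tenant, a poor
    "school_absence_rate": 0.2,     # landlord, or nothing at all -- the score
    "benefit_claims": 1,            # cannot tell these apart
}

print(round(risk_score(household), 2))  # one number stands in for a family's life
```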

Shen warns that such misapplications are especially dangerous because they rest on a “computational ontology” that falsely assumes all problems are solvable through calculation and categorization. This mindset, dubbed "techno-solutionism," drives the deployment of AI in domains like education, healthcare, law enforcement, and social services, often with damaging results. Computers were not built to negotiate moral nuance, human suffering, or institutional injustice. They process, categorize, and tabulate - not reason, reflect, or reconcile.

Are we asking too much of AI and the wrong things entirely?

The paper calls for a radical rethink of AI’s role in society. Shen urges developers, policymakers, and the public to resist the narrative that AI is a universally applicable solution. Instead, she proposes a framework for evaluating when computational tools are appropriate, and when they should be set aside in favor of non-digital, human-centered approaches.

This includes questioning whether a problem is reducible to numerical modeling, whether a machine’s outputs align with real-world goals, and whether the deployment of AI introduces cascading harms or hides structural issues. In areas like cancer screening, predictive policing, or welfare allocation, AI’s precision may be technically impressive, but socially counterproductive.

FIRST PUBLISHED IN: Devdiscourse