Medical AI fails key ethical test without inclusive, epistemic oversight

CO-EDP, VisionRI | Updated: 27-05-2025 09:15 IST | Created: 27-05-2025 09:15 IST

In the rapidly evolving field of artificial intelligence (AI) in healthcare, ethical guidelines alone are proving insufficient to capture the complexity of its societal impact. A new framework - the Ethical-Epistemic Matrix (EEM) - seeks to fill that gap by systematically evaluating both moral and knowledge-based dimensions of AI use in medicine. This dual-pronged assessment model is presented in the study titled "Ethical and Epistemic Implications of Artificial Intelligence in Medicine: A Stakeholder-Based Assessment", published in AI & Society in May 2025.

Developed by Jonathan Adams at the University of Oslo, the EEM is applied for the first time in this paper to a high-stakes domain: AI-driven sleep apnea detection. The research positions the EEM as a tool not just for ethical compliance but for epistemic accountability, encouraging a reevaluation of AI’s real-world value across different stakeholder groups: patients, clinicians, developers, the public, and health policy-makers.

How does the EEM reshape the assessment of medical AI?

The EEM integrates two sets of principles: ethical ones that include well-being, autonomy, justice, and explicability, and epistemic ones that include accuracy, consistency, relevance, and instrumental efficacy. Unlike traditional frameworks that often emphasize abstract ethical ideals, the EEM centers its evaluation on stakeholder-specific goals and perceptions, offering a more granular understanding of AI's impact.
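To make that structure concrete, the sketch below (not drawn from the paper itself) shows one way the two sets of principles could be crossed with the five stakeholder groups as a simple grid of free-text notes; only the principle and stakeholder names come from the article, and every other name and entry is hypothetical.

```python
# Illustrative sketch only: one possible representation of the EEM's two axes.
# The principle and stakeholder lists come from the article; the rest is hypothetical.

ETHICAL_PRINCIPLES = ["well-being", "autonomy", "justice", "explicability"]
EPISTEMIC_PRINCIPLES = ["accuracy", "consistency", "relevance", "instrumental efficacy"]
STAKEHOLDERS = ["patients", "clinicians", "developers", "the public", "policymakers"]


def empty_matrix() -> dict:
    """Create an empty stakeholder-by-principle grid to hold
    stakeholder-specific considerations as free-text notes."""
    principles = ETHICAL_PRINCIPLES + EPISTEMIC_PRINCIPLES
    return {s: {p: "" for p in principles} for s in STAKEHOLDERS}


# Example entries paraphrasing concerns discussed in the article.
eem = empty_matrix()
eem["patients"]["well-being"] = "Earlier diagnosis vs. eroded clinician-patient relationship"
eem["clinicians"]["autonomy"] = "Dilution of decision-making authority"
```

The point of such a grid is only that the matrix works as evaluative scaffolding, a set of stakeholder-specific questions and trade-offs, rather than a scoring algorithm.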

For instance, in evaluating well-being, the matrix contrasts the potential of AI to extend lifespan through faster diagnoses with the risk of eroding meaningful clinician-patient relationships. While patients may benefit from early intervention, clinicians may suffer increased moral distress if AI systems override their judgment. Developers face psychological strain from labeling sensitive training data, while the public confronts broader issues like sustainability and societal well-being. Policymakers, meanwhile, may find their professional goals misaligned if AI tools prioritize efficiency over holistic health outcomes.

Autonomy, a cornerstone of medical ethics, is also reevaluated. Patients may lose agency in AI-dominated diagnostic processes, while physicians face a dilution of their decision-making authority. Developers’ autonomy intersects with regulatory frameworks that could either stifle innovation or promote ethical creativity. The public’s autonomy is challenged by opaque data repurposing, and policymakers must navigate between independent decision-making and pressure from private tech firms.

Justice is reinterpreted through the lens of both distributive and epistemic fairness. From patient-facing algorithmic bias to the structural inequities faced by developers in the Global South, justice is not just about access but also about whose knowledge and needs are privileged. Explicability, often seen as a technical hurdle, becomes an ethical necessity, ensuring transparency not just for regulatory compliance but for meaningful accountability to all stakeholders.

What epistemic challenges does AI in medicine present?

Beyond ethics, Adams’ framework uniquely captures how stakeholders value knowledge differently. For patients, accuracy serves emotional reassurance, while clinicians require diagnostic reliability for professional confidence. Developers seek validation tailored to end-user needs, and the public’s concern spans both factual correctness and the economic implications of AI errors in publicly funded systems. For policymakers, accuracy is foundational but not always sufficient; standards must also address misinformation and health outcome efficacy.

Consistency, another epistemic pillar, is linked to trust. Patients need algorithms that behave uniformly across populations to avoid bias, while clinicians demand reliability for integrating AI into practice. Developers benefit from consistent outputs that make iterative improvements feasible. The public relies on generalizability, and policymakers cannot govern unpredictable technologies.

Relevance, the degree to which information meets specific informational needs, is heavily stakeholder-dependent. AI that ignores patient demographics or clinician workflows risks becoming epistemically irrelevant. Developers face the danger of information overload through ‘hyper-relevance’ that overwhelms users. Public exposure to only personalized, AI-filtered health data raises concerns about informational silos. Policymakers require data that aligns with macro-level health goals, not just statistical significance.

Instrumental efficacy, or how well AI serves its intended epistemic purpose, also varies widely. AI may offer early diagnosis but fail to improve outcomes if integration with human expertise is poor. For clinicians, epistemic tools must reinforce, not replace, their reasoning. Developers require systems adaptable to new data, while public trust hinges on AI’s visible, actionable benefits. Policymakers need assurance that AI not only works in theory but improves treatment effectiveness, cost efficiency, and population health in practice.

What does the case of AI-driven sleep apnea detection reveal?

To operationalize the EEM, Adams applies it to a real-world use case: AI-enabled detection of obstructive sleep apnea (OSA). OSA is a common, underdiagnosed disorder associated with serious health risks. Traditional detection methods are costly and inaccessible to many. AI offers promise in expanding access via wearables and mobile apps that analyze physiological data during sleep. However, when run through the EEM framework, the results underscore the complexity of such a technological solution.

For patients, AI improves screening accessibility but also introduces new risks: false positives cause anxiety and false negatives delay care. Clinicians may appreciate automated triaging but fear a reduction in their autonomy and an increase in unnecessary workload. Developers must grapple with balancing model performance and ethical design, particularly regarding data bias. Public engagement hinges on understanding how AI makes decisions and whether these tools democratize or entrench health inequities. Policymakers face tough questions around regulation, especially in determining acceptable uncertainty thresholds and ensuring equitable rollout.

From an epistemic standpoint, OSA detection tools are often judged by their accuracy alone. Yet the EEM reveals additional concerns: Are the results consistent across diverse users and conditions? Are the insights relevant to actual patient needs or just technically impressive? Can AI outputs be meaningfully integrated with human clinical judgment? And critically, does the tool contribute to better health outcomes or merely shift the burden of diagnosis?
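Purely as an illustration, and continuing the hypothetical sketch above rather than anything specified in the paper, those questions could be recorded in the same grid, with any cell left blank read as an issue a deployment decision would still have to resolve.

```python
# Hypothetical continuation of the earlier sketch, reusing empty_matrix(),
# with a few OSA-specific concerns paraphrased from the article.
osa = empty_matrix()
osa["patients"]["accuracy"] = "False positives cause anxiety; false negatives delay care"
osa["clinicians"]["instrumental efficacy"] = "Outputs must integrate with clinical judgment"
osa["policymakers"]["consistency"] = "Acceptable uncertainty thresholds for regulation"

# Cells still blank mark stakeholder-principle questions left unanswered.
open_cells = [(s, p) for s, notes in osa.items() for p, note in notes.items() if not note]
print(f"{len(open_cells)} stakeholder-principle cells remain open")
```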

These questions elevate the EEM from a theoretical exercise to a practical necessity, encouraging deeper reflection before AI systems are deployed in healthcare.

The EEM emerges not merely as a tool for ethical introspection but as a stakeholder-sensitive instrument for anticipating real-world trade-offs in AI deployment.

FIRST PUBLISHED IN: Devdiscourse