AI vs misinformation: How large language models are verifying biomedical claims
In the era of rapid scientific advancements and increasing misinformation, verifying biomedical claims has become essential for healthcare decision-making, public health policies, and scientific research. False or misleading medical claims can lead to misinformed treatments, policy failures, and general distrust in medical institutions. To tackle this challenge, researchers are leveraging Large Language Models (LLMs) to create explainable, AI-driven biomedical claim verification systems that offer greater transparency, accountability, and reliability.
A recent study titled “Explainable Biomedical Claim Verification with Large Language Models” by Siting Liang and Daniel Sonntag, published in the Joint Proceedings of the ACM IUI Workshops (2025), presents a system that integrates natural language inference (NLI), transparent model explanations, and user-guided justifications to improve biomedical claim verification. The system lets users retrieve relevant scientific studies, examine how LLMs process the evidence, and verify claims through interactive, explainable AI-powered reasoning.
Role of AI in biomedical claim verification
Traditional biomedical claim verification relies on expert review of the scientific literature, a time-consuming and resource-intensive process. AI models, particularly LLMs fine-tuned for NLI, offer a more efficient way to assess claims. These models compare a biomedical assertion against retrieved scientific studies and classify it into one of three categories: “Support,” “Contradict,” or “Not Enough Information.”
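To make the classification step concrete, here is a minimal sketch of evidence-conditioned claim verification using an off-the-shelf NLI model from Hugging Face. The model choice (microsoft/deberta-large-mnli), the label mapping, and the example claim are illustrative assumptions, not the system described in the paper.

```python
# Minimal sketch: classify a claim against retrieved evidence with an
# off-the-shelf NLI model. Model and example are illustrative only.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "microsoft/deberta-large-mnli"  # any MNLI-style checkpoint works similarly
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

# Map the model's MNLI labels onto the article's three verification categories.
TO_VERDICT = {
    "ENTAILMENT": "Support",
    "CONTRADICTION": "Contradict",
    "NEUTRAL": "Not Enough Information",
}

def verify(claim: str, evidence: str) -> str:
    """Treat the evidence as the premise and the claim as the hypothesis."""
    inputs = tokenizer(evidence, claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    label = model.config.id2label[logits.argmax(dim=-1).item()]
    return TO_VERDICT[label.upper()]

evidence = "In the trial, drug X reduced systolic blood pressure by 12 mmHg."
print(verify("Drug X lowers blood pressure.", evidence))  # expected: Support
```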
The study introduces a Chain of Evidential Natural Language Inference (CoENLI) framework, which guides LLMs to generate structured, evidence-based explanations before committing to a final classification. The framework is designed to make AI-driven decisions both accurate and interpretable, allowing users to trace the reasoning behind each claim verification.
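The sketch below captures the general shape of such two-step, evidence-first prompting: one call produces per-sentence evidential judgments, and a second call aggregates them into a verdict. The prompt wording, the aggregation step, and the use of the OpenAI client with gpt-4o-mini are assumptions for illustration, not the paper's exact templates.

```python
# A two-step, evidence-first prompting sketch in the spirit of CoENLI.
# Prompt wording, aggregation, and model choice are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

STEP1 = (
    "Claim: {claim}\n"
    "Evidence sentences:\n{evidence}\n"
    "For each evidence sentence, state whether it supports, contradicts, or is "
    "irrelevant to the claim, with a one-sentence reason."
)
STEP2 = (
    "Per-sentence judgments:\n{judgments}\n"
    "Based only on these judgments, answer with exactly one of: "
    "Support, Contradict, Not Enough Information."
)

def ask(prompt: str) -> str:
    """Single chat completion call; gpt-4o-mini is one of the models the article names."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

def verify_with_trace(claim: str, evidence_sentences: list[str]) -> tuple[str, str]:
    """Return the final label plus the intermediate judgments that justify it."""
    evidence = "\n".join(f"- {s}" for s in evidence_sentences)
    judgments = ask(STEP1.format(claim=claim, evidence=evidence))
    label = ask(STEP2.format(judgments=judgments))
    return label, judgments
```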
To make the model’s behavior more transparent, the system also integrates SHAP (SHapley Additive exPlanations) values, which quantify how much each word in a claim contributes to the model’s final decision. This transparency helps users understand why the AI reached a particular conclusion, making the verification process more trustworthy and accountable.
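As a rough illustration of word-level attributions (not the paper's implementation), SHAP can be pointed at a Hugging Face text-classification pipeline. The model name, the single-string “evidence [SEP] claim” input packing, and the plotting call are all assumptions made for this sketch.

```python
# Illustrative only: token-level SHAP attributions for an NLI-style classifier.
# Model choice and input formatting are assumptions, not the authors' setup.
import shap
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="microsoft/deberta-large-mnli",
    top_k=None,  # return scores for every label, which SHAP expects
)

explainer = shap.Explainer(clf)

# Crude packing of evidence and claim into one string for the text masker.
texts = ["Drug X reduced systolic blood pressure by 12 mmHg. [SEP] Drug X lowers blood pressure."]
shap_values = explainer(texts)

# Positive values push the prediction toward a label, negative values away from it.
print(shap_values.values.shape)   # (n_examples, n_tokens, n_labels)
shap.plots.text(shap_values)      # highlighted-text view of word contributions
```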
Evaluation and performance of the explainable AI system
The study evaluates the CoENLI framework using two biomedical benchmarks: NLI4CT (Natural Language Inference for Clinical Trials) and SciFact (Scientific Fact Verification). These datasets require AI models to process complex biomedical claims and assess their validity based on real-world clinical trials and scientific studies.
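For intuition about how such benchmark-style scoring works, a minimal sketch follows; the record format and the placeholder verifier are assumptions, not the paper's evaluation code.

```python
# Sketch of benchmark-style scoring over (claim, evidence, label) records.
# The record format and the trivial baseline verifier are assumptions.
from sklearn.metrics import accuracy_score, f1_score

def evaluate(records, verify):
    """records: iterable of dicts with 'claim', 'evidence', and gold 'label'."""
    gold = [r["label"] for r in records]
    pred = [verify(r["claim"], r["evidence"]) for r in records]
    return {
        "accuracy": accuracy_score(gold, pred),
        "macro_f1": f1_score(gold, pred, average="macro"),
    }

sample = [{
    "claim": "Drug X lowers blood pressure.",
    "evidence": "Drug X reduced systolic blood pressure by 12 mmHg.",
    "label": "Support",
}]
print(evaluate(sample, lambda claim, evidence: "Support"))  # trivial baseline for demo
```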
The researchers compared different prompting approaches (sketched as prompt templates after the list):
- Simple Prompting: basic claim classification with no intermediate reasoning.
- Zero-Shot Chain of Thought (CoT): step-by-step reasoning without structured guidance.
- CoENLI (the proposed method): structured, evidence-based inference with detailed explanations.
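The templates below illustrate how the three strategies differ in what they ask of the model; the wording is a hedged approximation, not the prompts used in the study.

```python
# Illustrative prompt templates for the three strategies; the wording is an
# approximation, not the study's actual prompts.
SIMPLE_PROMPT = (
    "Claim: {claim}\nEvidence: {evidence}\n"
    "Answer with Support, Contradict, or Not Enough Information."
)

ZERO_SHOT_COT_PROMPT = (
    "Claim: {claim}\nEvidence: {evidence}\n"
    "Let's think step by step, then answer with Support, Contradict, "
    "or Not Enough Information."
)

COENLI_STYLE_PROMPT = (
    "Claim: {claim}\nEvidence sentences:\n{evidence}\n"
    "First judge each evidence sentence separately (supports / contradicts / "
    "irrelevant) with a one-sentence reason, then combine the judgments into a "
    "final answer: Support, Contradict, or Not Enough Information."
)
```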
Results showed that CoENLI outperforms the baseline prompting strategies, achieving:
- Higher accuracy in claim verification across both biomedical datasets.
- More consistent and interpretable justifications for AI-driven decisions.
- Improved user agreement and trust in AI-assisted claim verification.
The integration of multiple LLMs, including GPT-4o-mini, Llama 3.1, and Mistral-12B, further demonstrated the potential for balancing accuracy, efficiency, and transparency in biomedical AI systems.
Challenges and future improvements in AI-driven medical verification
While the study highlights the strengths of LLMs in claim verification, it also acknowledges key challenges:
One major issue is model interpretability and trust. While CoENLI improves explainability, some AI-generated justifications may still lack domain-specific nuance. Future research should explore fine-tuned AI models trained specifically on biomedical literature to enhance reasoning capabilities.
Another challenge is computational efficiency. The use of large-scale LLMs, such as GPT-4o-mini, demands high computational power, making deployment difficult in resource-constrained environments. To address this, researchers propose hybrid models that combine lightweight LLMs with fine-tuned medical knowledge bases.
Additionally, ensuring AI fairness and bias reduction remains a priority. The study highlights that small variations in claim wording can sometimes lead to inconsistencies in verification results. By refining training data diversity and integrating human feedback loops, AI systems can become more robust and reliable.
Future of AI-powered biomedical fact-checking
The integration of LLMs, explainable AI, and evidence-based verification represents a major leap forward in biomedical claim validation. This research not only advances AI-assisted decision-making in healthcare and research but also sets the foundation for more accountable, human-AI collaboration frameworks.
Looking ahead, the researchers propose further optimization of LLM reasoning strategies, enhanced feedback mechanisms, and integration with broader evidence synthesis frameworks. As AI technology continues to evolve, its role in fighting medical misinformation, supporting clinical research, and aiding policy decisions will become increasingly vital.
By making biomedical claim verification more transparent, explainable, and reliable, this study paves the way for trustworthy AI applications in the healthcare sector, ensuring that scientific knowledge remains accurate, credible, and accessible to all.
First published in: Devdiscourse

