Bridging the gap: How AI and humans can make better decisions together

CO-EDP, VisionRI | Updated: 14-02-2025 17:08 IST | Created: 14-02-2025 17:08 IST

As AI continues to permeate critical decision-making domains such as healthcare, finance, and law enforcement, the challenge of integrating human expertise with AI-driven predictions has become a major area of research. While AI models frequently outperform humans in pattern recognition and statistical accuracy, human intuition, contextual knowledge, and ethical considerations remain invaluable in complex decision-making tasks. However, studies suggest that human-AI teams often underperform AI models alone, raising concerns about how best to optimize this collaboration.

A recent study, "The Value of Information in Human-AI Decision-Making", authored by Ziyang Guo, Yifan Wu, Jason Hartline, and Jessica Hullman and posted to arXiv in 2025, introduces a decision-theoretic framework for assessing the value of information in human-AI workflows. The research aims to identify how human and AI contributions can be leveraged most effectively to enhance decision accuracy, improve explainability, and optimize AI-assisted workflows.

Quantifying information value in human-AI teams

The study introduces a new framework for evaluating the role of information in human-AI decision-making. Rather than simply comparing human and AI accuracy, the framework measures the "complementary information value" - the extent to which new information can improve decision outcomes when incorporated into an AI-assisted workflow.

The researchers propose two key metrics:

  • Global Human-Complementary Information Value: This metric captures how much an additional piece of information improves a decision-maker's expected performance overall, averaged across cases.
  • Instance-Level Human-Complementary Information Value: This evaluates how much a piece of information improves the decision in a specific case.

By applying Bayesian decision theory, the study models how a rational decision-maker would optimally integrate information from both humans and AI to maximize decision quality. If an AI model provides critical insights not utilized by humans, then enhancing the interpretability and presentation of AI outputs could improve overall performance. Conversely, if human experts possess valuable contextual knowledge that AI systems do not capture, AI training data could be expanded to incorporate these insights.
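
To make this concrete, here is a minimal sketch in Python. It assumes a toy binary decision task scored by accuracy and a small, made-up joint distribution over a human signal, an AI signal, and the true outcome; all names and numbers are illustrative, and the paper's formal framework covers general signals, scoring rules, and decision spaces. The complementary value of the AI signal is the gain in the best achievable expected score once that signal is added to what the human already observes.

```python
# Illustrative sketch (not the authors' code): the complementary information
# value of the AI signal is the improvement in the best achievable expected
# score when the AI's prediction is added to the human's information.
from collections import defaultdict

# Toy joint distribution over (human_signal, ai_signal, true_outcome),
# given as counts. Outcome 1 = positive case, 0 = negative case.
counts = {
    ("suspect", "positive", 1): 30,
    ("suspect", "positive", 0): 5,
    ("suspect", "negative", 1): 10,
    ("suspect", "negative", 0): 15,
    ("clear",   "positive", 1): 8,
    ("clear",   "positive", 0): 7,
    ("clear",   "negative", 1): 2,
    ("clear",   "negative", 0): 23,
}
total = sum(counts.values())

def best_expected_accuracy(signals_seen):
    """Expected accuracy of a Bayes-optimal decision-maker who observes only
    the signals selected by `signals_seen` and predicts the likelier outcome."""
    grouped = defaultdict(lambda: defaultdict(float))
    for (h, a, y), n in counts.items():
        grouped[signals_seen(h, a)][y] += n / total
    # For each signal realization, the optimal decision matches the more
    # probable outcome, contributing the larger share of probability mass.
    return sum(max(outcome_mass.values()) for outcome_mass in grouped.values())

score_human_only = best_expected_accuracy(lambda h, a: h)        # human signal alone
score_human_plus_ai = best_expected_accuracy(lambda h, a: (h, a))  # human + AI signal

print(f"human signal only: {score_human_only:.2f}")      # 0.70 on this toy data
print(f"human + AI signal: {score_human_plus_ai:.2f}")   # 0.76 on this toy data
print(f"complementary value of the AI signal: {score_human_plus_ai - score_human_only:.2f}")
```

Estimated on a real joint distribution of human judgments, AI predictions, and outcomes, this difference corresponds to the global value described above; conditioning on a single case rather than averaging mirrors the instance-level version.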

The framework was tested on three key decision-making tasks: chest X-ray diagnosis, deepfake detection, and recidivism prediction. Across all tasks, the findings suggest that improving information exchange between humans and AI models could significantly enhance team performance.

Evaluating AI assistance in medical diagnosis and deepfake detection

To demonstrate the effectiveness of the proposed framework, the researchers conducted experiments in AI-assisted chest X-ray diagnosis and deepfake detection.

In the medical diagnosis experiment, AI models were evaluated for their ability to assist radiologists in diagnosing cardiac dysfunction using chest X-ray images from the MIMIC dataset. The study analyzed five AI models: VisionTransformer, SwinTransformer, ResNet, Inception-v3, and DenseNet. The results revealed that all five provided complementary information beyond human judgment, capturing at least 20% of the decision-relevant information that human radiologists did not utilize. Among the models, VisionTransformer offered the highest complementary information value, making it the most effective model for assisting human experts.

The second experiment focused on deepfake detection, where participants were tasked with identifying whether a video was AI-generated. Human participants initially made independent judgments before being provided with AI-generated predictions. Interestingly, while AI predictions were highly informative, human participants struggled to effectively calibrate their confidence in AI recommendations.

The findings revealed that human-AI teams only utilized 30% of the available decision-relevant information, largely because humans failed to differentiate between correct and incorrect AI recommendations. This suggests that simply providing AI assistance is insufficient - humans need better tools to interpret and act on AI-generated insights.
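
One way to read a utilization figure like this is as the team's progress from a no-information baseline toward the rational benchmark that optimal use of every signal would achieve. The numbers below are assumed purely for illustration and are not taken from the study.

```python
# Hypothetical numbers, for illustration only -- not results from the study.
baseline_score = 0.50       # best expected score deciding from the prior alone
rational_benchmark = 0.90   # best expected score with optimal use of human + AI signals
observed_team_score = 0.62  # the human-AI team's actual expected score

# Fraction of the available decision-relevant information the team captured.
utilization = (observed_team_score - baseline_score) / (rational_benchmark - baseline_score)
print(f"information utilized: {utilization:.0%}")  # -> 30%
```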

Enhancing explainability in high-stakes AI decisions

The third experiment explored recidivism prediction, where decision-makers assessed whether a defendant was likely to be rearrested. AI systems like COMPAS are widely used in criminal justice settings, but they have been criticized for bias, lack of transparency, and questionable fairness. To address these concerns, the study introduced a novel instance-level explanation technique - ILIV-SHAP - which extends SHAP explanations to highlight the portion of an AI’s prediction that uniquely complements human judgment.

Traditional AI explanations focus on why an AI made a specific decision, but ILIV-SHAP explains how AI-generated insights improve upon human judgment. The study found that certain features, such as prior convictions, race, and charge severity, were disproportionately weighted by either AI models or human decision-makers. The ILIV-SHAP framework provided a more transparent method for evaluating how AI decisions are made and how they interact with human biases, paving the way for more ethical and interpretable AI-assisted decision-making.
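
As a rough illustration of the flavor of such an explanation, and not the paper's actual ILIV-SHAP definition, one could compare SHAP attributions for the AI model against attributions for a simple proxy model of human judgment; the gap points to features driving the part of the AI's prediction that the human side does not capture. The synthetic data, both models, and the difference-of-attributions heuristic below are assumptions made for this sketch.

```python
# Rough, assumption-laden sketch -- not the paper's ILIV-SHAP method.
# Compare SHAP attributions of an AI model with those of a proxy model of
# human judgment; large gaps flag features behind the complementary part
# of the AI's prediction. Data and models here are synthetic stand-ins.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))   # three synthetic features
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n) > 0).astype(int)

ai_model = GradientBoostingClassifier().fit(X, y)      # stand-in for the AI risk model
human_proxy = LogisticRegression().fit(X[:, :2], y)    # stand-in for human judgment,
                                                       # assumed to ignore the third feature

background = X[:100]  # background data for the SHAP maskers
ai_shap = shap.Explainer(ai_model.predict_proba, background)(X[:5])
human_shap = shap.Explainer(lambda Z: human_proxy.predict_proba(Z[:, :2]), background)(X[:5])

# Attributions for the positive class; rows are cases, columns are features.
# Large entries mark features the AI uses that the human-side proxy does not.
complementary_attribution = ai_shap.values[..., 1] - human_shap.values[..., 1]
print(np.round(complementary_attribution, 3))
```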

By enhancing explainability, the study suggests that human-AI teams could make better-informed decisions and mitigate risks associated with algorithmic bias. This is particularly important in fields like criminal justice, hiring, and medical diagnosis, where poor AI recommendations can lead to significant real-world consequences.

The future of human-AI collaboration: Optimizing decision synergy

This research provides a groundbreaking approach to improving AI-assisted decision-making by identifying information gaps, optimizing human-AI interactions, and refining AI-generated explanations. The findings underscore several key takeaways:

  • AI should be designed to complement human expertise, rather than simply replace it.
  • Better tools are needed to help humans interpret and integrate AI recommendations.
  • Transparency and explainability are critical in high-stakes AI applications.
  • Future AI training should incorporate human-generated insights to enhance complementary decision-making.

Looking ahead, the authors advocate for further real-world studies and randomized controlled trials (RCTs) to refine AI-assisted decision workflows. By developing AI systems that understand and adapt to human cognitive processes, we can create decision pipelines that maximize both efficiency and accuracy.

As AI becomes more embedded in critical decision-making, the ability to effectively balance human intuition with machine intelligence will be a defining factor in the future of AI-human collaboration.

FIRST PUBLISHED IN: Devdiscourse