Explainable AI in high-stakes decisions: A roadmap for better human-AI collaboration

CO-EDP, VisionRI | Updated: 01-02-2025 17:00 IST | Created: 01-02-2025 17:00 IST

As AI becomes increasingly embedded in high-stakes decision-making domains like healthcare, finance, and security, understanding how users interact with AI-generated recommendations is crucial. While AI systems offer advanced analytical capabilities, studies suggest that human-AI teams often underperform compared to AI alone due to a phenomenon known as automation bias - where users over-rely on AI, even when it is incorrect. On the other hand, under-reliance can lead to missed opportunities, limiting the benefits AI can provide.

A recent study titled “Engaging with AI: How Interface Design Shapes Human-AI Collaboration in High-Stakes Decision-Making,” by Zichen Chen, Yunhao Luo, and Misha Sra from the University of California, Santa Barbara, published in January 2025, investigates how different decision-support mechanisms influence user trust, engagement, and decision accuracy. The study introduces six AI interface designs, including explainable AI (XAI) mechanisms and cognitive forcing functions (CFFs), to explore their impact on human-AI collaboration. The findings reveal which AI interface elements improve trust calibration and decision-making, and which inadvertently lead to cognitive overload or reduced performance.

Balancing AI trust and human judgment

In domains like diabetes management, where frequent and informed decision-making is required, AI assistance can help users select optimal dietary choices for blood sugar control. However, blindly trusting AI suggestions can be dangerous, as incorrect recommendations may lead to negative health outcomes. Conversely, dismissing AI advice altogether can result in users missing valuable insights. This study aims to identify strategies that encourage users to engage critically with AI outputs, leading to better decision accuracy and trust calibration.

To explore these dynamics, the researchers conducted a controlled experiment with 108 participants, testing six distinct AI interface mechanisms categorized under explainable AI (XAI) and cognitive forcing functions (CFFs).

Evaluating AI decision-support mechanisms

The study involved a meal-planning task for diabetes management, where participants had to select the most suitable meal for blood sugar control. The decision-making process was structured in three phases. In the baseline phase, participants made meal choices independently, without AI assistance. In the second phase, they received AI recommendations but without any explanations. In the third phase, AI recommendations were presented along with one of six decision-support mechanisms aimed at improving trust and decision accuracy.

The six mechanisms tested in this phase included textual explanations, where the AI provided written justifications for its recommendations, and visual explanations, which highlighted nutritional information within meal images. Another mechanism involved AI confidence levels, where the AI displayed its certainty in its recommendation as a percentage score. The human feedback mechanism required users to input their own confidence level in their decisions, encouraging self-reflection. Another mechanism, AI-driven questions, prompted users to answer a question before finalizing their choice, forcing them to engage more deeply with the AI’s recommendation. Finally, performance visualization compared the user’s past decisions with the AI’s, offering insights into decision patterns over time.
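To make the study design concrete, the sketch below lays out the three phases and six mechanisms as a simple Python configuration. It is an illustrative reconstruction under assumed names; the Phase and Mechanism enums and the render_trial helper are not code from the study.

```python
from enum import Enum, auto
from typing import Optional

class Phase(Enum):
    BASELINE = auto()          # meal choice made with no AI assistance
    AI_ONLY = auto()           # AI recommendation shown without explanation
    AI_WITH_SUPPORT = auto()   # AI recommendation plus one support mechanism

class Mechanism(Enum):
    TEXT_EXPLANATION = auto()           # written justification for the recommendation
    VISUAL_EXPLANATION = auto()         # nutritional highlights on meal images
    AI_CONFIDENCE = auto()              # AI certainty shown as a percentage
    HUMAN_FEEDBACK = auto()             # user reports their own confidence
    AI_QUESTION = auto()                # user answers a question before deciding
    PERFORMANCE_VISUALIZATION = auto()  # past user vs. AI decisions compared

def render_trial(phase: Phase, mechanism: Optional[Mechanism] = None) -> str:
    """Describe, at a high level, what a single trial screen would contain."""
    if phase is Phase.BASELINE:
        return "meal options only"
    if phase is Phase.AI_ONLY:
        return "meal options + AI recommendation"
    assert mechanism is not None, "the third phase pairs the AI with one mechanism"
    return f"meal options + AI recommendation + {mechanism.name.lower()}"
```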

How AI interface design affects decision-making

Users Over-Rely on AI, Even When It’s Wrong

One of the most significant findings of the study was that users tend to trust AI recommendations blindly, even when they are incorrect. This phenomenon, known as automation bias, was especially prevalent when AI provided recommendations without explanations. When AI made an incorrect prediction, users still accepted the AI’s advice in a significant number of cases, indicating a lack of critical engagement with the decision-support system.
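Automation bias of this kind is typically quantified as the rate at which participants follow AI recommendations that are in fact wrong. The sketch below shows one way such a measure could be computed from trial logs; the Trial fields and function names are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    ai_correct: bool        # was the AI recommendation actually the best choice?
    user_followed_ai: bool  # did the participant accept the recommendation?

def over_reliance_rate(trials: list[Trial]) -> float:
    """Share of trials with a wrong AI recommendation that the user still followed."""
    wrong = [t for t in trials if not t.ai_correct]
    return sum(t.user_followed_ai for t in wrong) / len(wrong) if wrong else 0.0

def under_reliance_rate(trials: list[Trial]) -> float:
    """Share of trials with a correct AI recommendation that the user rejected."""
    right = [t for t in trials if t.ai_correct]
    return sum(not t.user_followed_ai for t in right) / len(right) if right else 0.0
```

Together, the two rates separate over-reliance from under-reliance, the two failure modes the article describes.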

The study highlights the importance of designing AI interfaces that promote active thinking rather than passive compliance. If users are not encouraged to question AI recommendations, they may fail to catch errors, ultimately reducing the effectiveness of human-AI collaboration.

AI Explanation Mechanisms Improve Engagement, But Can Also Increase Cognitive Load

Adding explanation mechanisms to AI systems generally increased user engagement, but in some cases, it also led to higher cognitive load. Text-based explanations helped users understand why AI made a particular recommendation, which improved trust and perceived reliability of the system. However, this also made the interface more complex, increasing mental effort.

Visual explanations, where AI highlighted key nutritional details in meal images, did not significantly improve engagement, suggesting that many users still accepted AI recommendations without closely analyzing them. Confidence levels and performance visualizations were more effective in getting users to assess AI reliability, helping them make better-informed decisions.

Cognitive Forcing Functions (CFFs) Have a Trade-Off Between Trust and Cognitive Effort

Mechanisms designed to force users to think critically about AI recommendations - such as AI-driven questions or requiring users to input their confidence levels - were effective in reducing automation bias, but they also had drawbacks. These mechanisms increased decision-making effort, which, in some cases, led to reduced trust in AI.

For example, AI-driven questions prompted users to reflect on AI recommendations, making them more aware of potential errors. However, this also led some users to question the AI’s accuracy too much, making them hesitant to rely on AI even when it was correct. Performance visualization, which compared user decisions with AI performance over time, was helpful in identifying decision patterns, but it did not significantly increase trust.
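As a rough illustration of the data a performance-visualization panel could be built on, the sketch below computes running accuracy curves for the user and the AI over a sequence of trials; the function and the sample outcomes are assumptions for illustration only.

```python
def cumulative_accuracy(outcomes: list[bool]) -> list[float]:
    """Running accuracy after each trial, e.g. one curve for the user and one for the AI."""
    correct, curve = 0, []
    for i, ok in enumerate(outcomes, start=1):
        correct += ok
        curve.append(correct / i)
    return curve

# Hypothetical trial outcomes: a visualization panel would plot both curves together.
user_curve = cumulative_accuracy([True, False, True, True, False])
ai_curve = cumulative_accuracy([True, True, False, True, True])
```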

Trust Calibration Requires a Balance Between Explainability and Simplicity

The study found that AI explanation mechanisms must strike a balance - providing enough transparency to encourage engagement, but not so much information that it overwhelms users. AI confidence levels and performance visualization mechanisms were particularly effective in helping users calibrate their trust, allowing them to make informed decisions without feeling overloaded with information.
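One simple way to check whether a displayed confidence score can support trust calibration is to compare stated confidence against observed accuracy. The sketch below groups trials into confidence bins and reports accuracy per bin; it is a generic calibration check under assumed data, not the study's analysis.

```python
from collections import defaultdict

def calibration_table(confidences: list[float], correct: list[bool],
                      bin_width: float = 0.2) -> dict:
    """Group trials by the AI's stated confidence and report observed accuracy per bin.

    If stated confidence tracks observed accuracy, the displayed percentage gives
    users a sound basis for deciding how much to trust each recommendation.
    """
    n_bins = round(1 / bin_width)
    bins = defaultdict(list)
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf / bin_width), n_bins - 1)  # clamp conf == 1.0 into the top bin
        bins[idx].append(ok)
    return {
        (round(b * bin_width, 2), round((b + 1) * bin_width, 2)): sum(v) / len(v)
        for b, v in sorted(bins.items())
    }
```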

However, some explanation mechanisms - particularly AI-driven questions - led to distrust, as users began to second-guess AI recommendations, even when they were correct. This highlights the importance of tailoring AI design to encourage informed skepticism rather than outright distrust.

Implications for AI design in high-stakes decision-making

The findings of this study have significant implications for the design of AI-powered decision-support systems in critical domains like healthcare, finance, and security. To reduce automation bias, AI interfaces must be designed to actively engage users in decision-making rather than simply presenting recommendations. The level of explainability must be carefully balanced, ensuring that AI provides enough information to justify its recommendations without overwhelming users.

AI confidence levels and performance visualization techniques appear to be the most effective trust calibration mechanisms, as they help users assess AI reliability without significantly increasing cognitive effort. However, excessive decision friction - such as AI-driven questions - may lead to unnecessary skepticism, reducing the effectiveness of AI-assisted decision-making.
