Rational but wrong: How AI misinterprets choices and quietly skews decisions
New research suggests that even when an AI appears logically sound, it can still steer users toward outcomes that do not reflect their true preferences or even the actual set of available options. The problem, according to economists Christopher Kops and Elias Tsakas of Maastricht University, lies not in the AI’s preferences but in how it interprets the choices it is asked to evaluate.
Their study, Choice via AI, posted as an economic theory preprint on arXiv, introduces a formal framework to analyze AI systems that act as decision-making advisors. The paper shows that misinterpretation of choice environments is a fundamental and often invisible risk in AI-assisted decision making, one that cannot be detected by standard tests of rationality alone.
When rational AI still gets the choice environment wrong
The study starts from a simple but powerful observation. In real-world settings, humans rarely choose in isolation; increasingly, they rely on AI systems to recommend an option from a menu of possibilities. Standard economic theory assumes that the decision maker correctly understands the menu and chooses the best available option according to stable preferences. Kops and Tsakas challenge this assumption by shifting the focus to the AI itself.
In their model, the AI is treated as an agent that makes a recommendation for every possible set of alternatives. The crucial difference is that the AI may misinterpret the menu before making its choice. This misinterpretation can take several forms. The AI might ignore some options that are actually available, consider options that are not available, or distort how different sets of options relate to one another. These distortions are not observable to the human decision maker.
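The setup can be sketched informally in code. The names below (make_advisor, interpret, recommend) are illustrative and not the paper's notation; the sketch only shows how an advisor that maximizes a fixed preference over a silently distorted menu can end up recommending something the user never offered.

```python
# Illustrative sketch (not the paper's notation): an AI advisor that first
# passes the user's menu through an internal "interpretation" before choosing.

from typing import Callable, FrozenSet

Alternative = str
Menu = FrozenSet[Alternative]

def make_advisor(preference: list,
                 interpret: Callable[[Menu], Menu]) -> Callable[[Menu], Alternative]:
    """Return a recommender that maximizes `preference` on the *interpreted* menu."""
    rank = {x: i for i, x in enumerate(preference)}  # lower index = more preferred
    def recommend(menu: Menu) -> Alternative:
        perceived = interpret(menu)          # may drop or add options silently
        return min(perceived, key=lambda x: rank.get(x, len(rank)))
    return recommend

# A distorted interpretation: the AI quietly adds an unavailable option "z"
# and drops "y" whenever it sees "x".
def distorted(menu: Menu) -> Menu:
    perceived = set(menu) | {"z"}
    if "x" in perceived:
        perceived.discard("y")
    return frozenset(perceived)

advisor = make_advisor(preference=["z", "x", "y"], interpret=distorted)
print(advisor(frozenset({"x", "y"})))  # recommends "z", which was never available
```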
This assumption closely mirrors how modern AI systems operate. Large language model–based agents interpret prompts based on training data and internal representations that users cannot see. When a user asks an AI to choose the best option from a category, the AI’s understanding of that category may differ subtly or significantly from the user’s intent.
The paper shows that such distorted behavior can still look rational. The authors prove that a single acyclicity condition on observed choices is enough to guarantee that the AI’s recommendations can be explained by some strict preference ordering combined with a monotonic interpretation of the choice set. In practical terms, this means the AI is not contradicting itself over time, even if it is misunderstanding the choice environment.
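One way to see what an acyclicity requirement checks in practice is the standard revealed-preference test below. This is a generic sketch rather than the paper's exact condition: x is revealed preferred to y whenever x is recommended from a menu that also contains y, and the resulting directed relation must contain no cycles.

```python
# Hedged sketch: an acyclicity check on the revealed-preference relation built
# from observed (menu, recommendation) pairs. The paper's formal condition may
# differ; this only illustrates what "no cycles" means operationally.

def revealed_relation(observations):
    """x R y if x was recommended from some menu that also contained y."""
    R = set()
    for menu, choice in observations:
        for y in menu:
            if y != choice:
                R.add((choice, y))
    return R

def is_acyclic(R):
    """Depth-first search for a cycle in the directed graph given by R."""
    graph = {}
    for a, b in R:
        graph.setdefault(a, set()).add(b)
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    def visit(node):
        color[node] = GREY
        for nxt in graph.get(node, ()):
            c = color.get(nxt, WHITE)
            if c == GREY:                     # back edge: cycle found
                return False
            if c == WHITE and not visit(nxt):
                return False
        color[node] = BLACK
        return True
    return all(visit(n) for n in graph if color.get(n, WHITE) == WHITE)

observations = [({"a", "b"}, "a"), ({"b", "c"}, "b"), ({"a", "c"}, "a")]
print(is_acyclic(revealed_relation(observations)))  # True: no contradiction so far
```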
This result has important implications. It shows that apparent errors in AI recommendations do not necessarily signal irrationality or instability. Instead, they may reflect systematic misinterpretation. From the user’s perspective, however, these two sources of error are indistinguishable. The AI may appear calm, consistent, and confident while still recommending options that do not match the user’s intentions.
Notably, the study finds that satisfying rationality conditions does not guarantee preference alignment. Multiple preference structures can explain the same observed AI behavior. As a result, even if every recommendation the AI makes is consistent with the user’s past choices, there is no assurance that the AI’s underlying priorities match those of the human decision maker.
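A stripped-down illustration of this non-uniqueness: when two options never compete head to head in the observed menus, several strict orderings rationalize exactly the same recommendations. The example below is a simplified version of the point; the article's broader argument is that the unobserved interpretation map adds further slack on top of this.

```python
# Hedged sketch of non-identifiability: enumerate all strict orderings of
# {x, y, z} that rationalize the same observed recommendations. More than one
# survives, so consistency alone does not pin down the AI's priorities.

from itertools import permutations

observations = [({"x", "y"}, "x"), ({"x", "z"}, "x")]  # y vs z never observed directly

def rationalizes(order, observations):
    rank = {alt: i for i, alt in enumerate(order)}
    return all(choice == min(menu, key=rank.__getitem__)
               for menu, choice in observations)

consistent = [order for order in permutations(["x", "y", "z"])
              if rationalizes(order, observations)]
print(consistent)  # [('x', 'y', 'z'), ('x', 'z', 'y')]
```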
Identifying alignment requires stronger conditions
To address the alignment problem, the authors ask a deeper question. Under what conditions can an observer determine whether an AI’s recommendations truly reflect a coherent and identifiable preference structure? The answer, according to the study, requires moving beyond basic rationality.
The paper introduces a stronger requirement called double monotonicity. Under this condition, the AI's interpretation of choice sets must preserve containment relationships in both directions: the interpretation of a larger menu must contain the interpretation of any menu nested inside it, and whenever the AI's interpretations treat one menu as contained in another, that containment must actually hold among the real menus.
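Read as a property of an interpretation map phi over menus (our formalization, which may differ in detail from the paper's), the two directions can be checked mechanically:

```python
# Hedged reading of "double monotonicity": the interpretation map phi must
# preserve set inclusion in both directions across every pair of menus.

from itertools import combinations

def powerset(universe):
    items = list(universe)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def doubly_monotone(phi, universe):
    menus = powerset(universe)
    for A in menus:
        for B in menus:
            if A <= B and not phi(A) <= phi(B):
                return False                  # forward direction fails
            if phi(A) <= phi(B) and not A <= B:
                return False                  # reverse direction fails
    return True

# The identity interpretation trivially satisfies both directions.
print(doubly_monotone(lambda menu: menu, {"x", "y", "z"}))  # True
```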
When this condition is satisfied, the authors show that both the AI’s preferences and its interpretation process become fully identifiable from observed behavior. This is a significant theoretical result. It means that, in principle, regulators, auditors, or users could test whether an AI advisor is aligned with their preferences by analyzing its recommendations across different menus.
However, the study makes clear that identifiability does not equal correctness. Even when preferences are fully identifiable and internally consistent, the AI may still recommend options that do not belong to the original choice set. This happens because the AI’s interpretation, while logically consistent, may still be detached from the actual feasible alternatives.
This insight exposes a subtle but critical vulnerability. An AI can satisfy advanced rationality and alignment tests while still drifting away from the real-world constraints faced by the user. In such cases, the AI behaves like a rational decision maker operating in the wrong environment.
The authors state that this is not a hypothetical concern. AI systems trained on broad datasets may generalize categories too aggressively, blur boundaries between domains, or import options from related but inappropriate contexts. Without safeguards, these interpretation errors can persist unnoticed.
Grounding AI choices in reality is a separate challenge
Groundedness, in the authors’ framework, requires that the AI always recommends an option that actually belongs to the set of feasible choices presented by the user.
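Groundedness itself is easy to state as a check: the recommendation must be drawn from the menu that was actually presented. The helper below is a minimal sketch of that check, not an API from the paper.

```python
# Minimal sketch of the groundedness requirement described above: every
# recommendation must come from the menu the user actually presented.

def is_grounded(recommend, menus):
    """Check that recommend(menu) is an element of menu for every observed menu."""
    return all(recommend(menu) in menu for menu in menus)

menus = [frozenset({"a", "b"}), frozenset({"b", "c"})]
always_a = lambda menu: "a"            # ignores the menu entirely
print(is_grounded(always_a, menus))    # False: "a" is not in {"b", "c"}
```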
The paper shows that grounding is independent of both rationality and alignment. An AI can be grounded but irrational, rational but ungrounded, or aligned but detached from reality. Ensuring grounded behavior requires additional structural constraints on how the AI processes and revisits its interpretations.
The authors introduce an idempotence condition to address this issue. This condition prevents the AI from repeatedly reinterpreting already interpreted choice sets, a process that can lead to conceptual drift. When idempotence is imposed alongside double monotonicity, the AI’s interpretation collapses back onto the original choice environment.
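In symbols (again our rendering rather than a quote from the paper), idempotence says that applying the interpretation map twice changes nothing: phi(phi(A)) equals phi(A) for every menu A. A quick sketch:

```python
# Hedged sketch of the idempotence condition: reinterpreting an already
# interpreted menu must change nothing.

def is_idempotent(phi, menus):
    return all(phi(phi(menu)) == phi(menu) for menu in menus)

# An interpretation that keeps expanding the menu on every pass is not
# idempotent and models the kind of conceptual drift described above.
drifting = lambda menu: frozenset(menu) | {f"extra_{len(menu)}"}
print(is_idempotent(drifting, [frozenset({"x", "y"})]))  # False
```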
Under these conditions, the AI’s behavior satisfies the classic weak axiom of revealed preference, a cornerstone of economic rationality. At this point, the AI not only behaves consistently and aligns with identifiable preferences, but also remains anchored in the actual set of available options.
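For reference, the weak axiom for a single-valued choice function can be tested directly on observed (menu, choice) pairs. The formulation below is a textbook one, assumed here rather than quoted from the paper: if x is ever chosen over y, then y must never be chosen from a menu in which x is also available.

```python
# Hedged sketch of the weak axiom of revealed preference (WARP) for a
# single-valued choice function over observed (menu, choice) pairs.

def satisfies_warp(observations):
    """Each observation is a (menu, chosen) pair with chosen in menu."""
    for menu_a, x in observations:
        for menu_b, y in observations:
            if x != y and x in menu_b and y in menu_a:
                return False      # x beat y once, yet y beats x elsewhere
    return True

obs_ok = [({"a", "b"}, "a"), ({"a", "b", "c"}, "a")]
obs_bad = [({"a", "b"}, "a"), ({"a", "b", "c"}, "b")]
print(satisfies_warp(obs_ok), satisfies_warp(obs_bad))  # True False
```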
First published in: Devdiscourse

