Fluency isn’t enough: Why AI conversation still feels unnatural
A team of researchers from Pompeu Fabra University and ICREA calls for redefining what counts as “natural” in conversations between humans and artificial intelligence. The study argues that current AI systems may appear fluent but often lack the semantic and pragmatic depth that makes interaction truly effective.
Published in AI & Society, the research, titled Do chatbots dream of AI sheep? A semantic–pragmatic investigation of "naturalness" in human–AI interaction, analyzes the conversational behavior of two widely used AI chatbots, ChatGPT and Claude, revealing contrasting approaches to accuracy, interactional grounding, and persona design.
What does “natural” mean in human–AI communication?
The study explores whether naturalness in human–AI interaction should be measured by how closely a system imitates human conversation. The authors argue that naturalness has often been conflated with human-likeness, but this assumption is misleading.
Through semantic and pragmatic analysis, the researchers found that while ChatGPT and Claude could produce grammatically fluent and contextually plausible sentences, their responses often fell short of true communicative naturalness. ChatGPT was prone to generating confident but fabricated answers, reflecting a design emphasis on fluency and user satisfaction. Claude, in contrast, leaned toward cautious self-limitation, explicitly acknowledging uncertainty instead of filling gaps with invented details.
Both systems showed difficulty in handling deictic expressions (words such as “here” and “now” that require shared context) and in navigating indirect speech acts, irony, or figurative language. The study highlights that these weaknesses reveal a lack of grounding in the shared reality that underpins natural human dialogue.
How do AI systems handle context and interaction?
Another key focus of the paper is the question of interactional grounding, the process by which speakers build mutual understanding in real-time conversation. Humans engage in clarification, feedback, and repair strategies to maintain shared meaning, but the analysis found that current chatbots only partially replicate this ability.
Claude demonstrated greater capacity to detect presuppositions and ask for clarifications, thereby easing the user’s cognitive load. ChatGPT, however, tended to bypass presuppositional content, often responding as if ambiguities or assumptions did not exist. This placed more responsibility on the human interlocutor to maintain coherence.
Both systems displayed limitations in pragmatic competence, especially in interpreting utterances that go beyond literal meaning. Sarcasm, metaphor, and subtle shifts in social roles were often processed in purely literal terms, resulting in stilted or utilitarian responses. The researchers argue that this highlights a critical gap: AI systems are not yet equipped to manage the collaborative, dynamic nature of human conversation.
What kinds of personas do chatbots perform?
The study also asks what kinds of “personas” AI systems project in their interactions with users. Persona design plays a significant role in shaping user perceptions of naturalness and trustworthiness.
ChatGPT frequently adopted what the authors describe as an “empathetic imitator” role, presenting itself as a companion-like agent and sometimes blurring boundaries between machine and human. This tendency to embody social roles can create an impression of naturalness but risks misleading users into attributing more competence or emotional understanding than the system possesses.
Claude, by contrast, took on an “honest tool” persona, openly emphasizing its non-human status and limitations. This approach prioritized transparency over imitation, reducing the likelihood of users mistaking it for a human partner. While this may reduce perceptions of warmth, it aligns more closely with principles of ethical AI design that emphasize honesty and clarity.
These divergent strategies, the authors argue, reveal two competing philosophies in AI development: one that seeks to enhance engagement through mimicry of human interaction, and another that emphasizes functional utility and trust through transparency.
First published in: Devdiscourse

