Why calling personalised AI a ‘digital twin’ misleads users and policymakers


CO-EDP, VisionRI | Updated: 02-02-2026 09:23 IST | Created: 02-02-2026 09:23 IST

The idea that artificial intelligence (AI) can function as a digital replica of a human being is gaining momentum across both industry and academia. Researchers now warn, however, that this framing is not only misleading but potentially harmful.

The warning comes from a peer-reviewed study titled “Personalised LLMs and the Risks of the Digital Twin Metaphor”, published in AI & Society. The authors argue that personalised large language models (LLMs) fail to meet even minimal standards required for genuine replication of human identity, while the metaphor itself amplifies ethical risks.

From engineering precision to human imitation claims

The concept of a digital twin has a clear and narrow origin. In engineering and industrial design, digital twins are computational models that mirror physical systems such as aircraft engines, manufacturing plants, or medical devices. These models rely on continuous streams of sensor data and measurable parameters, allowing engineers to predict failures, test scenarios, and optimize performance. Accuracy, real‑time synchronization, and empirical validation are central to the concept.
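To make the contrast concrete, the kind of digital twin the study treats as legitimate can be caricatured in a few lines of code. The sketch below is illustrative only and is not drawn from the paper; the pump, its nominal temperature, and the thresholds are invented. What matters is the structure the engineering concept depends on: a measurable state, a continuous sensor stream, and a prediction that can be checked against reality.

```python
# Minimal, hypothetical sketch of an engineering-style digital twin:
# a model of a pump that stays loosely synchronized with live sensor
# readings and flags likely failure when measurements drift from the
# model's expectation. All values here are invented for illustration.

from dataclasses import dataclass

@dataclass
class PumpTwin:
    expected_temp_c: float = 60.0   # nominal operating temperature
    warn_threshold_c: float = 8.0   # allowed deviation before a warning

    def update(self, measured_temp_c: float) -> str:
        """Compare the live measurement against the model's expectation."""
        deviation = abs(measured_temp_c - self.expected_temp_c)
        if deviation > self.warn_threshold_c:
            return f"ALERT: deviation {deviation:.1f} C exceeds threshold"
        # keep the twin synchronized with the physical asset
        self.expected_temp_c = 0.9 * self.expected_temp_c + 0.1 * measured_temp_c
        return "OK"

twin = PumpTwin()
for reading in [61.2, 62.5, 63.1, 74.0]:   # simulated sensor stream
    print(reading, twin.update(reading))
```

Everything in this toy model is observable and correctable against the physical system, which is precisely the condition the study says human “twins” cannot satisfy.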

According to the study, problems emerge when this metaphor is applied to people. In recent years, personalised AI systems trained on emails, messages, social media posts, and other digital traces have been described as digital twins of individuals. Some companies promote these systems as tools that preserve a person’s essence, personality, or values. In healthcare research, similar models are proposed as psychological or ethical twins that could speak on a patient’s behalf when they are incapacitated. In bereavement technology, so‑called griefbots promise continued interaction with deceased loved ones.

The authors argue that this shift represents a fundamental break from the original meaning of a digital twin. Unlike machines, humans are not systems defined by stable, measurable parameters. Identity is shaped by memory, context, emotion, embodiment, and lived experience, much of which is never captured in data. By borrowing the authority of an engineering metaphor without meeting its conditions, the study finds that personalised AI systems are being oversold as replicas rather than approximations.

The research traces how this metaphor gained momentum with little critical examination. Once established, it has been repeated across academic papers, startup marketing, and public discourse, reinforcing the idea that human identity can be faithfully modelled through data alone. The authors stress that this rhetorical shift is not neutral. It frames expectations, guides design choices, and influences how users relate to these systems.

Why current AI systems fail the digital twin test

The paper systematically evaluates whether personalised LLMs can plausibly qualify as digital twins. To do this, the authors outline three increasingly demanding interpretations of what such a twin would require.

The first is a behaviour‑based interpretation. Under this view, an AI system might count as a twin if it consistently produces responses that closely resemble what a specific person would say in similar situations. Even at this minimal level, the study finds that current systems fall short. Empirical evidence shows that fine‑tuned language models trained on an individual’s writing style still produce generic answers, factual errors, or responses that diverge noticeably from the real person. Gaps in training data, hallucinated details, and reliance on generalized patterns prevent reliable imitation.

The second interpretation is representational. Here, a digital twin would not only speak like a person but would also share access to the underlying information that shapes their identity. This includes memories, beliefs, values, and the ability to apply them consistently across different domains of life. The authors argue that personalised LLMs cannot meet this standard because the data used to train them captures only fragments of a person’s experience. Text data excludes tone, gesture, emotional state, and unspoken reasoning. More importantly, it cannot encode the causal structure that explains why someone holds certain views or makes specific choices.

Human agency, the study notes, is unified across contexts. People do not operate as separate versions of themselves for work, family, and moral decision‑making. By contrast, current AI systems struggle to maintain coherence even within a single conversational thread, let alone across domains. Attempts to simulate unified agency often rely on multiple model instances stitched together, creating the appearance of continuity without its substance.

The third and most demanding interpretation is phenomenal. Under this view, a true digital twin would need to share aspects of a person’s inner experience, including conscious awareness and emotional understanding. The authors are unequivocal in rejecting this possibility. Large language models generate text through statistical prediction, not experience or intention. They do not feel, understand, or possess a first‑person perspective. Any suggestion otherwise, the study argues, reflects confusion between appearance and reality, often reinforced by science fiction narratives rather than technical facts.

Across all three interpretations, the conclusion is the same. Personalised LLMs may mimic patterns in language, but they do not replicate identity. The digital twin metaphor, the authors write, collapses critical distinctions between simulation and personhood.

Ethical risks for healthcare, law, and human relationships

Much of the study focuses on the ethical consequences of mislabelling AI systems as digital twins. Metaphors shape understanding, and when they misrepresent technology, they can lead to serious harm.

In healthcare, proposals for AI systems that predict patient preferences are often framed as creating psychological twins capable of guiding treatment decisions. The authors warn that this framing may encourage clinicians or family members to defer to algorithmic outputs as if they carried the authority of the patient themselves. When systems are presented as speaking for a person, rather than offering probabilistic insights, the risk of misplaced trust increases sharply.

Legal contexts present similar dangers. Ideas around posthumous digital testimony or AI‑based representation of deceased individuals rely heavily on the twin metaphor. The study argues that this risks blurring the line between evidence and simulation, undermining principles of accountability and due process.

The most acute concerns arise in the context of grief and companionship technologies. Systems marketed as preserving the essence of loved ones may foster emotional dependence, delay acceptance of loss, or expose users to renewed trauma if services change or shut down. By encouraging users to perceive AI systems as conscious or emotionally present, the digital twin metaphor exploits deeply human tendencies toward attachment and anthropomorphism.

At a societal level, the authors caution that misleading metaphors can distort public debate and policy formation. Regulatory discussions around digital personhood, consent, and posthumous rights may proceed from false assumptions about AI capabilities. Once embedded in law or institutional practice, these assumptions are difficult to reverse.

The study frames this as a form of epistemic harm. When people lack accurate concepts to understand technology, their ability to make informed decisions is undermined. This harm is likely to fall most heavily on individuals with limited technical literacy, widening existing digital inequalities.

Rethinking how personalised AI should be described

Rather than calling for a ban on personalised AI systems, the authors argue for conceptual clarity. They acknowledge that such systems can be useful tools for organizing information, supporting reflection, or enhancing access to services. The problem lies not in their existence but in how they are framed.

The study suggests alternative ways of describing personalised LLMs that better reflect their nature. One approach is to view them as role‑playing systems that perform linguistic imitation without claiming identity or consciousness. Another is to describe them as statistical pattern engines that generate plausible text based on learned regularities. While less emotionally compelling, these labels reduce the risk of deception and encourage more realistic expectations.
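What a “statistical pattern engine” means can be shown with a deliberately tiny example. The toy bigram model below is not how the paper’s authors build anything, and it is orders of magnitude simpler than a modern LLM, but it illustrates what generating plausible text from learned regularities amounts to: sampling the next word from frequencies observed in training text. The corpus and function names are invented for illustration.

```python
# Illustrative toy "statistical pattern engine": a word-level bigram model
# trained on a handful of sentences. The next word is sampled from learned
# frequencies, not retrieved from anyone's memories, beliefs, or experiences.

import random
from collections import defaultdict

corpus = (
    "i really enjoy long walks by the sea . "
    "i really enjoy quiet evenings with a book . "
    "long walks by the river clear my head ."
)

# learn which words tend to follow which
transitions = defaultdict(list)
words = corpus.split()
for current, nxt in zip(words, words[1:]):
    transitions[current].append(nxt)

def generate(start: str, length: int = 10) -> str:
    out = [start]
    for _ in range(length):
        options = transitions.get(out[-1])
        if not options:
            break
        out.append(random.choice(options))  # statistical prediction, nothing more
    return " ".join(out)

print(generate("i"))
```

However fluent the output of a far larger model may be, the underlying operation is of this kind, which is why the authors prefer labels that name the mechanism rather than imply a replicated person.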

The paper notes that language choices are not cosmetic. They shape design priorities, user relationships, and governance frameworks. As AI systems move deeper into personal and institutional life, the demand for accuracy over allure becomes an ethical obligation.

FIRST PUBLISHED IN: Devdiscourse