New AI model mimics human thinking across domains, outperforms cognitive theories

Researchers have developed a new AI model called Centaur that can predict human decisions across dozens of psychological tasks, a breakthrough that brings the field closer to a long-standing goal in psychology.
In a study published in Nature titled “A foundation model to predict and capture human cognition,” researchers introduced Centaur, a foundation model fine-tuned on large-scale human behavioral data. Built on Meta AI’s Llama 3.1 70B large language model, the new model sets a new standard in predicting human decision-making across diverse experimental domains.
Developed using data from more than 10 million human choices in 160 psychological experiments involving over 60,000 participants, Centaur not only outperformed domain-specific cognitive models but also demonstrated robust generalization across novel scenarios.
Can a single model predict human behavior across domains?
The study aimed to build a model that can predict human decisions in any experimental setting expressed in natural language. For this, researchers curated a massive dataset called Psych-101, encompassing trial-by-trial data from experiments ranging from memory tests and multi-armed bandits to complex decision-making and learning tasks.
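To make this concrete, here is a minimal sketch of how trial-by-trial data from one such experiment (a two-armed bandit) could be rendered as a natural-language transcript a language model can be trained on. The function name, field names, and exact phrasing are illustrative assumptions, not the actual Psych-101 schema:

```python
# Hypothetical sketch: transcribing trial-by-trial bandit data into natural
# language, in the spirit of Psych-101. Field names and phrasing are
# illustrative, not the dataset's actual format.

def transcribe_bandit_session(trials):
    """Render a two-armed bandit session as a natural-language transcript.

    Each trial is a dict holding the participant's choice ('J' or 'F') and
    the reward observed; the choice tokens are what a model like Centaur
    would be trained to predict.
    """
    lines = ["You repeatedly choose between two slot machines, J and F."]
    for t in trials:
        lines.append(f"You press <<{t['choice']}>> and win {t['reward']} points.")
    return "\n".join(lines)

session = [
    {"choice": "J", "reward": 5},
    {"choice": "F", "reward": 0},
    {"choice": "J", "reward": 7},
]
transcript = transcribe_bandit_session(session)
```

Because every experiment is expressed as text in this way, memory tests, bandits, and learning tasks can all be fed to a single model in one common format.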
Centaur was trained using quantized low-rank adaptation (QLoRA), a parameter-efficient fine-tuning method that adjusts only a small fraction of the model’s parameters. Despite modifying just 0.15% of Llama’s weights, Centaur demonstrated remarkable gains in accuracy.
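A back-of-the-envelope calculation shows why low-rank adaptation touches so few weights: for each frozen weight matrix of size d_out x d_in, LoRA trains only two small matrices of rank r, adding r*(d_in + d_out) parameters instead of d_out*d_in. The layer dimensions and rank below are illustrative assumptions, not Llama 3.1 70B's actual shapes:

```python
# Back-of-the-envelope sketch of LoRA's trainable-parameter fraction.
# For each frozen weight matrix W (d_out x d_in), LoRA trains A (r x d_in)
# and B (d_out x r), i.e. r*(d_in + d_out) parameters instead of
# d_out * d_in. Dimensions below are toy values for illustration.

def lora_fraction(layers, rank):
    """Fraction of trainable parameters when rank-`rank` LoRA adapters
    are attached to each (d_out, d_in) weight matrix in `layers`."""
    base = sum(d_out * d_in for d_out, d_in in layers)
    adapter = sum(rank * (d_out + d_in) for d_out, d_in in layers)
    return adapter / base

# Toy model: 80 blocks, each with one square 8192 x 8192 projection.
toy_layers = [(8192, 8192)] * 80
frac = lora_fraction(toy_layers, rank=8)  # a fraction well under 1%
```

With these toy numbers the trainable fraction comes out around 0.2%, the same order of magnitude as the 0.15% reported for Centaur.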
In rigorous tests, Centaur consistently outperformed traditional cognitive models, including Prospect Theory and reinforcement learning frameworks. On held-out participant data, it yielded significantly lower negative log-likelihoods, indicating tighter alignment with actual human behavior. In nearly every test across the 160 experiments, Centaur emerged as the superior predictor.
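The negative log-likelihood metric used in these comparisons can be sketched in a few lines: a model assigns a probability to each option on every trial, and its score is the average negative log of the probability it gave to what the participant actually did (lower is better). The data here are made up purely for illustration:

```python
# Minimal sketch of the evaluation metric: mean negative log-likelihood
# of held-out human choices under a model's predicted probabilities.
# Lower NLL = the model puts more probability on what people actually did.
import math

def mean_nll(probs, choices):
    """probs: per-trial probability distributions over options;
    choices: index of the option the participant actually chose."""
    return -sum(math.log(p[c]) for p, c in zip(probs, choices)) / len(choices)

human_choices = [0, 1, 0, 0]
model_a = [[0.8, 0.2], [0.3, 0.7], [0.9, 0.1], [0.7, 0.3]]  # sharper fit
model_b = [[0.5, 0.5]] * 4                                  # chance baseline
nll_a = mean_nll(model_a, human_choices)
nll_b = mean_nll(model_b, human_choices)  # higher (worse) than nll_a
```

A model at chance scores log(2) ≈ 0.69 per binary choice, so any model that concentrates probability on participants' actual responses, as Centaur did, drives the NLL below that baseline.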
Importantly, Centaur didn’t just match averages; it reproduced the full distribution of participant trajectories. In tasks such as the two-step paradigm, it reflected not only model-free and model-based strategies but also their combinations, just as observed in real human populations.
How well does Centaur generalize to unseen scenarios?
One of the most critical benchmarks for a foundation model is its ability to generalize beyond its training data. Centaur met this challenge head-on. Researchers tested Centaur in three progressively difficult out-of-distribution conditions: modified cover stories, altered task structures, and entirely new domains. In each case, it retained predictive superiority:
- In a reframed version of the two-step decision task, where the original story of “spaceships” was replaced with “magic carpets,” Centaur maintained high accuracy, even though the new narrative was absent from training.
- In a structurally modified experiment called Maggie’s Farm, which introduced a third choice option in a multi-armed bandit setting, Centaur again outperformed both the base model and cognitive models, showing resilience to changes in task complexity.
- In a logical reasoning test based on LSAT-style items, a domain entirely excluded from its training set, Centaur achieved strong predictive performance, confirming its ability to generalize far beyond familiar patterns.
Further evaluations on moral decision-making, economic games, and naturalistic learning tasks reinforced this finding. Across six additional out-of-distribution settings, Centaur retained robust performance while smaller or non-fine-tuned models faltered.
Can this model also reveal how the human brain thinks?
Centaur’s alignment with human behavior is not only statistical but also neurological. The researchers conducted a novel set of tests to examine how well Centaur’s internal representations correlate with brain activity.
In two neuroimaging studies, participants completed decision-making and sentence-reading tasks while undergoing fMRI scans. Centaur’s hidden layer activations, without being explicitly trained on neural data, showed significantly stronger correlations with brain activity than the base Llama model’s. This included accurate decoding of activation in brain regions such as the left motor cortex, nucleus accumbens, and medial prefrontal cortex, areas known for decision-making and reward processing.
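The basic logic of such an alignment test can be sketched as a correlation between a model's per-trial activations and a measured brain signal. Everything below (the toy data, the single-feature setup) is an illustrative assumption; the study worked with full fMRI voxel patterns and more sophisticated regression analyses:

```python
# Hedged sketch of representational alignment: correlate a model's hidden
# activations with a brain signal across trials. The numbers are toy data;
# the actual study used full fMRI voxel patterns, not one scalar per trial.

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

brain_signal = [0.1, 0.4, 0.9, 0.3, 0.7]    # toy per-trial BOLD amplitude
finetuned_act = [0.2, 0.5, 0.8, 0.3, 0.6]   # tracks the signal closely
base_act = [0.9, 0.1, 0.2, 0.8, 0.1]        # weak relationship
r_finetuned = pearson(brain_signal, finetuned_act)
r_base = pearson(brain_signal, base_act)    # lower correlation
```

A higher correlation for the fine-tuned model's activations, as in this toy setup, is the pattern the researchers reported for Centaur versus base Llama.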
Centaur also conformed to behavioral regularities such as Hick’s Law, which links decision time to response entropy. By modeling nearly 4 million human response times, Centaur outperformed both Llama and cognitive baselines in predicting reaction speed, showing that it captures not just what people choose, but how long they take to decide.
The research extended further, using Centaur to guide scientific discovery. When paired with DeepSeek-R1, another AI system, Centaur facilitated the creation of a new model for multi-attribute decision-making that better matched human choices than traditional strategies. Even when DeepSeek-R1 proposed a novel heuristic, Centaur’s unmatched predictive power helped refine that model into a hybrid strategy, ultimately delivering a solution that balanced accuracy and interpretability.
Implications: Toward a Unified Theory of Mind
This breakthrough opens pathways for automated cognitive modeling, allowing scientists to simulate human behavior at scale without crafting problem-specific models. It can be used for in silico experiments, hypothesis generation, and personalized behavioral prediction.
The researchers also emphasize that Centaur could guide the next generation of neuroscience-informed AI, potentially helping to determine which architectural principles, such as attention mechanisms or vector-based memory, best capture the human mind.
Although the Psych-101 dataset currently focuses on decision-making and learning, plans are in place to expand into psycholinguistics, social cognition, and cross-cultural psychology, addressing existing limitations such as its bias toward WEIRD (Western, Educated, Industrialized, Rich, and Democratic) populations.
FIRST PUBLISHED IN: Devdiscourse