AGI progress is an illusion without clear definitions and benchmarks
A global arms race is unfolding - not in weaponry, but in intelligence. At the heart of this contest lies the ambition to develop artificial general intelligence (AGI), a form of machine intelligence that can perform any intellectual task a human can, or perhaps even better. Yet for all the sweeping claims and exponential growth in computational power, the conceptual and operational path toward AGI remains as nebulous as ever.
A recent peer-reviewed study titled “Modern Prometheus: Tracing the Ill-Defined Path to AGI”, published in AI & Society, delivers a sobering examination of the historical, philosophical, and technological complexities surrounding the term AGI and urges a fundamental rethinking of how we define, measure, and pursue it.
The paper, authored by Matthew Landers of Western Norway University of Applied Sciences, argues that the central challenge in AGI research is not computational but definitional. Through a deep dive into three historical paradigms - strong AI, human-level AI, and AGI - the paper reveals how conceptual confusion and anthropocentric assumptions continue to haunt AI development. Rather than converging on a shared understanding, the field has fractured into competing visions, each with its own ontological commitments and operational goals.
What does AGI really mean?
The study begins with a forensic analysis of how AGI entered public discourse. While early figures like Alan Turing, John McCarthy, and John Searle shaped initial discussions around machine intelligence, the modern concept of AGI was significantly popularized by Ben Goertzel and others in the mid-2000s. Goertzel’s framing emphasized a form of generalized intelligence that could learn and adapt across domains - far beyond narrow AI systems designed for specific tasks like playing chess or diagnosing medical conditions.
Yet, as the paper details, there is no consensus on what AGI should look like, how it should operate, or what benchmarks should be used to evaluate it. For some, like Goertzel, AGI means emulating human generalization abilities. For others, such as François Chollet and Pei Wang, generality and adaptability are the defining criteria, and they need to be untethered from human intelligence entirely. These differing assumptions lead to contradictory research trajectories, undermining any clear framework for measuring progress toward AGI.
The confusion has real-world implications. In February 2024, Elon Musk filed a lawsuit against OpenAI, alleging the company had breached its founding agreement by commercializing what he believed to be AGI-level systems, specifically GPT-4. Microsoft researchers had described GPT-4 as a “significant step” toward AGI, suggesting that its abilities might cross a critical threshold. If true, Musk argued, the model should not have been licensed under the original agreement, which explicitly excluded AGI from commercial use. The lawsuit, since refiled in California, signals the beginning of a legal battle over how AGI should be defined and regulated - a conflict that underscores the stakes of remaining conceptually vague.
How should AGI be measured?
The central tension in defining AGI lies in its dependence on human intelligence as both a target and a benchmark. As the paper notes, this dual role is problematic because it collapses the distinction between what AI is and what it ought to be. When the human mind becomes the gold standard, intelligence is judged by how well machines conform to human-like behavior. But this anthropocentric view, according to Landers, limits our capacity to recognize diverse forms of intelligence that might emerge from non-human systems.
To address this issue, the study introduces a taxonomy that distinguishes between two ontological orientations: Promethean and Noctuidean. The Promethean path seeks to replicate human cognition, adhering to the idea that human-like intelligence is the only meaningful kind. This approach dominates much of today’s AI research and aligns with the ambitions of OpenAI, DeepMind, and similar firms. In contrast, the Noctuidean orientation emphasizes intelligence as an adaptive, non-anthropomorphic capability that could take radically different forms. Companies like Anthropic, for instance, focus on building steerable, interpretable, and safe systems without necessarily replicating the human mind.
Landers argues that the field must move beyond the binary framework of “human-like” vs. “not intelligent.” Instead, intelligence should be seen as a gradient encompassing diverse cognitive architectures. In this light, AGI becomes not a monolithic destination but a continuum of capabilities. The study supports this view with the work of scholars like Michael Levin, who promotes a framework of “diverse intelligences,” and researchers at DeepMind, who have proposed a two-dimensional benchmark matrix measuring performance and generality levels separately.
Such frameworks attempt to decouple skill from intelligence. A machine that performs a task with superhuman proficiency through brute computational power may not be intelligent in any meaningful sense if it lacks the capacity to generalize or adapt to new environments. This distinction becomes especially relevant when evaluating large language models (LLMs) like GPT-4, which demonstrate emergent behaviors but operate largely as opaque “black boxes.” Interpretability remains elusive, further complicating the case for labeling such systems as AGI.
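To make the two-axis idea concrete, the sketch below shows one way to record a system's rating on separate performance and generality scales rather than a single "intelligence" score. It is a minimal illustration only: the level names, the SystemProfile class, and the example entries are assumptions chosen for clarity, not the actual rubric proposed by the DeepMind researchers or analyzed in Landers' paper.

```python
# Illustrative sketch: rate a system on two independent axes instead of one.
# All names and example entries here are hypothetical, for clarity only.
from dataclasses import dataclass
from enum import Enum


class Performance(Enum):
    EMERGING = 1      # at or slightly above an unskilled human
    COMPETENT = 2     # around the median skilled human
    EXPERT = 3        # well above most skilled humans
    SUPERHUMAN = 4    # beyond any human

class Generality(Enum):
    NARROW = "narrow"    # strong only on a specific task or domain
    GENERAL = "general"  # broad competence across unrelated domains


@dataclass
class SystemProfile:
    name: str
    performance: Performance
    generality: Generality

    def describe(self) -> str:
        # Skill and breadth are reported separately rather than collapsed
        # into a single score.
        return (f"{self.name}: {self.performance.name.lower()} performance, "
                f"{self.generality.value} scope")


# Hypothetical examples of why one axis alone is misleading: a chess engine
# can be superhuman yet narrow, while a general-purpose assistant may be
# broad but only emerging in most domains.
profiles = [
    SystemProfile("chess engine", Performance.SUPERHUMAN, Generality.NARROW),
    SystemProfile("general-purpose assistant", Performance.EMERGING, Generality.GENERAL),
]

for profile in profiles:
    print(profile.describe())
```

Separating the two axes makes it possible to say that a system is superhuman yet narrow, or broadly capable yet only modestly skilled - precisely the nuance that a single pass/fail AGI label erases.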
What future are we choosing by default?
Ultimately, the paper warns that the lack of definitional clarity around AGI could be weaponized. In the absence of agreed-upon standards, companies and governments are free to define AGI in ways that suit immediate commercial or political interests. This flexibility could be used to justify risky deployments, avoid regulation, or attract investment under the guise of frontier achievement. Such ambiguity raises ethical red flags, particularly as these technologies become embedded in critical infrastructure, healthcare, and education.
The financial stakes are staggering. OpenAI’s “Stargate” project, estimated to cost over $500 billion, is only one example of the massive capital flowing into AGI research. A recent McKinsey report projects that AI-driven demand for data centers will triple by 2030, with AGI-like models consuming most of the resources. Yet, with so much money on the line, few are stopping to ask: to what end?
First published in: Devdiscourse

