Google DeepMind has proposed a new cognitive framework for measuring progress toward Artificial General Intelligence (AGI). It evaluates AI systems across ten distinct cognitive faculties and compares the results directly to human performance. The approach, supported by a $200,000 Kaggle hackathon, aims to provide a more objective and comprehensive assessment of AI capabilities, addressing limitations of previous benchmarks and advancing the understanding of what constitutes AGI.
On March 16th, 2026, Google DeepMind published a paper proposing a new way to measure progress toward Artificial General Intelligence (AGI). Instead of relying on loosely defined benchmarks or a single aggregate score, the paper proposes a cognitive framework that evaluates AI systems across ten distinct cognitive faculties and compares their performance directly to that of humans. The approach aims to resolve the ongoing debate about what AGI means by breaking intelligence down into measurable components: perception, generation, attention, learning, memory, reasoning, meta-cognition, executive functions, problem-solving, and social cognition.
The paper emphasizes that these ten faculties are grounded in decades of research from psychology, neuroscience, and cognitive science, reflecting how human intelligence is studied. Each faculty represents a critical aspect of cognition, from basic sensory perception to complex social understanding. Importantly, the framework focuses on what AI systems can accomplish rather than the specific technologies they use, whether transformers, diffusion models, or other architectures. This shift allows for a more objective and comprehensive assessment of AI capabilities.
To test AI systems against these faculties, the paper outlines a three-stage evaluation process. First, AI undergoes a cognitive assessment through targeted tasks designed to isolate each cognitive ability, with strict controls to prevent data contamination. Second, human baselines are established by administering the same tasks to diverse groups of adults, providing a real-world performance distribution for comparison. Finally, cognitive profiles are created by plotting AI performance against human results, producing radar charts that visually highlight strengths and weaknesses across the ten faculties.
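The third stage can be illustrated with a minimal Python sketch. It assumes that "plotting AI performance against human results" means placing the AI's score for each faculty as a percentile within the human baseline distribution; the summary does not state the paper's exact scoring method, so that choice, the function names `percentile_vs_humans` and `cognitive_profile`, and all numbers in the usage example are hypothetical.

```python
from bisect import bisect_right

# The ten faculties named in the paper.
FACULTIES = [
    "perception", "generation", "attention", "learning", "memory",
    "reasoning", "meta-cognition", "executive functions",
    "problem-solving", "social cognition",
]


def percentile_vs_humans(ai_score, human_scores):
    """Percent of human baseline scores that are at or below ai_score.

    Assumption: a percentile rank is used as the comparison statistic.
    """
    if not human_scores:
        raise ValueError("human baseline is empty")
    ranked = sorted(human_scores)
    return 100.0 * bisect_right(ranked, ai_score) / len(ranked)


def cognitive_profile(ai_scores, human_baselines):
    """Map each faculty to the AI's percentile within the human distribution.

    Faculties missing from either input are skipped rather than guessed.
    """
    return {
        faculty: percentile_vs_humans(ai_scores[faculty], human_baselines[faculty])
        for faculty in FACULTIES
        if faculty in ai_scores and faculty in human_baselines
    }


if __name__ == "__main__":
    # Illustrative numbers only, not results from the paper.
    ai = {"memory": 0.95, "attention": 0.30}
    humans = {
        "memory": [0.2, 0.4, 0.6, 0.8],
        "attention": [0.5, 0.6, 0.7, 0.9],
    }
    for faculty, pct in cognitive_profile(ai, humans).items():
        print(f"{faculty}: {pct:.0f}th percentile of human baseline")
```

The resulting dictionary holds one value per faculty, which is the shape of data a radar chart needs: each faculty becomes an axis and the percentile becomes the distance from the center.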
While the framework marks a significant advance, the paper acknowledges limitations. It does not measure response speed, which is crucial for real-world applications like self-driving cars or coding assistants. It also does not capture behavioral tendencies such as risk aversion or alignment with human values, nor does it directly assess creativity, which is instead inferred through related cognitive processes. Additionally, evaluating AI systems as integrated entities, including their use of tools, makes it difficult to distinguish raw intelligence from tool-assisted performance.
Google is supporting this initiative with a $200,000 Kaggle hackathon to develop evaluations targeting the most challenging faculties: learning, meta-cognition, attention, executive functions, and social cognition. The effort aims to create robust, standardized tests that can track AI progress more scientifically. As AI development accelerates, the cognitive framework offers a clearer, more nuanced way to understand and measure the jagged capabilities of current AI systems, which can be strong in some faculties and weak in others, moving the field closer to defining and recognizing AGI.