Why is AI so smart but also so dumb?

Andrej Karpathy explains that AI’s impressive yet inconsistent intelligence stems from its reliance on verifiable tasks and training incentives, enabling breakthroughs in areas like coding while it struggles with simpler, less verifiable problems. He envisions a future where AI agents autonomously manage digital environments, transforming software development and human interaction, but emphasizes that human oversight, judgment, and creativity remain essential.

The video features Andrej Karpathy, a leading AI researcher and former director of AI at Tesla, discussing why AI can be both remarkably smart and surprisingly dumb. He highlights a pivotal moment in December when AI models, especially large language models (LLMs), dramatically improved, enabling end-to-end coding and application building without constant human correction. This shift marks a new computing paradigm—Software 3.0—where programming transitions from writing explicit code to training models with data and steering them through prompts and context windows. Karpathy emphasizes that AI now acts like a programmable computer, interpreting instructions to achieve outcomes rather than following rigid step-by-step commands.
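The shift from explicit code to prompts-as-programs can be sketched roughly as follows. This is an illustrative Python sketch, not from the talk: `sentiment_v1` and `sentiment_v3` are hypothetical names, and the LLM call is stubbed with a stand-in function since no real model is assumed.

```python
# Software 1.0: behavior is specified as explicit, hand-written rules.
def sentiment_v1(text: str) -> str:
    # Deterministic keyword rules -- precise where specifiable, but brittle.
    negative_words = {"bad", "terrible", "awful"}
    return "negative" if set(text.lower().split()) & negative_words else "positive"

# Software 3.0: behavior is specified as a natural-language prompt that a
# model interprets. The "program" is the prompt plus the context window.
PROMPT = "Classify the sentiment of the following text as positive or negative:\n{text}"

def sentiment_v3(text: str, llm=None) -> str:
    # `llm` is a placeholder for any chat-completion call; here it is
    # stubbed with the rule-based classifier so the sketch runs offline.
    llm = llm or (lambda prompt: sentiment_v1(text))
    return llm(PROMPT.format(text=text))
```

The point of the contrast: in the first style the programmer enumerates the steps; in the second, the programmer states the desired outcome and steers the model toward it.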

Karpathy explains the concept of verifiability as key to understanding AI’s uneven performance. Traditional software excels where tasks can be explicitly specified, while LLMs thrive in domains where outputs can be verified, such as coding and math. These areas allow for automated feedback and reinforcement learning, which AI labs heavily optimize for due to their economic value. This focus explains why AI can flawlessly refactor complex codebases yet struggle with seemingly simple tasks like counting letters or making everyday decisions. The jaggedness in AI’s abilities reflects both training incentives and the nature of verifiable tasks, indicating that we have not yet reached generalized intelligence.
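The automated-feedback loop described above can be made concrete with a toy verifier: a candidate solution is scored by running it against test cases, yielding a reward signal with no human in the loop. This is a minimal illustrative sketch of the idea, not any lab's actual pipeline; the function name and the `square` task are invented for the example, and real systems would sandbox the execution.

```python
def verify_solution(candidate_code: str, tests: list[tuple[int, int]]) -> float:
    """Score a candidate program against (input, expected) test cases.

    Returns the fraction of tests passed -- an automatically verifiable
    reward that reinforcement learning can optimize against.
    """
    namespace: dict = {}
    try:
        # NOTE: exec on untrusted code is unsafe; a real verifier isolates this.
        exec(candidate_code, namespace)
    except Exception:
        return 0.0
    fn = namespace.get("square")
    if not callable(fn):
        return 0.0
    passed = 0
    for x, expected in tests:
        try:
            if fn(x) == expected:
                passed += 1
        except Exception:
            pass  # runtime errors simply score zero on that case
    return passed / len(tests)
```

Coding and math fit this mold because a checker like this exists; "everyday decisions" have no such oracle, which is one reason performance there lags.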

The discussion also contrasts “vibe coding” and “agentic engineering.” Vibe coding lowers the barrier for anyone to create software by leveraging AI to handle coding details, raising the floor of what’s possible for non-experts. Agentic engineering, on the other hand, raises the ceiling for professional developers by orchestrating multiple AI agents to maintain high-quality, scalable software development. This new discipline requires skills beyond traditional coding, including managing AI agents’ workflows and ensuring software quality, highlighting a shift in how software engineering will evolve alongside AI.
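One way to picture "orchestrating multiple AI agents" is a planner/coder/reviewer loop where each agent's output feeds the next. The sketch below is a hypothetical illustration under simplified assumptions: each agent is modeled as a plain text-to-text function (in practice each would wrap an LLM call), and the names and control flow are invented for the example.

```python
from typing import Callable

# An "agent" here is any function from task text to output text.
Agent = Callable[[str], str]

def orchestrate(task: str, planner: Agent, coder: Agent, reviewer: Agent,
                max_rounds: int = 3) -> str:
    """Run a simple plan -> code -> review loop until the reviewer approves
    or the round budget is exhausted."""
    plan = planner(task)
    draft = coder(plan)
    for _ in range(max_rounds):
        feedback = reviewer(draft)
        if feedback == "APPROVED":
            return draft
        # Feed reviewer criticism back into the coder's context and retry.
        draft = coder(plan + "\nReviewer feedback: " + feedback)
    return draft
```

The engineering skill Karpathy points at lives outside any single agent: deciding the workflow, the round budget, and the quality bar the reviewer enforces.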

Karpathy addresses concerns about AI replacing human jobs by emphasizing that humans remain essential for oversight, taste, and judgment. While AI can automate many tasks, it currently lacks intrinsic motivation, curiosity, or genuine understanding—qualities shaped by evolution in animals but absent in AI, which he describes as “ghosts” rather than sentient beings. He acknowledges that AI labs have not yet focused on training models to develop aesthetic judgment or taste, leaving room for human creativity and decision-making to remain vital, at least for the foreseeable future.

Finally, Karpathy envisions a future where AI agents autonomously interact and manage digital environments on behalf of humans, requiring a complete redesign of internet services and software infrastructure to be “agent-first.” He stresses the importance of simplifying human interaction with AI by minimizing manual steps and enabling seamless agent orchestration. Despite AI’s growing capabilities, he underscores that humans must still direct and understand AI’s outputs to avoid becoming bottlenecks, as true understanding cannot be outsourced. This balance between AI automation and human insight will shape the evolving landscape of technology and work.