LLMs can't reason

The video argues that dismissing large language models as incapable of reasoning on the basis of simplistic labels ignores the need for objective, testable criteria for evaluating their abilities, and that isolated failures do not disprove reasoning capability. It also highlights how emotional biases and inconsistent standards often lead to unfair judgments of AI, urging a rational, balanced approach to understanding AI's evolving role alongside human creativity.

The video addresses a contentious debate in AI circles: whether large language models (LLMs) can truly reason, think, or understand. The speaker critiques common dismissive arguments that LLMs “can’t reason because they’re just stochastic parrots” or “just next-token predictors.” These arguments, often rooted in fear or misunderstanding, fail to provide concrete tests or criteria for what reasoning entails. The speaker emphasizes that simply labeling something as “just” a particular mechanism does not negate its capabilities, drawing analogies to clocks and sundials that tell time despite being “just” gears or sticks.

To clarify how to assess abilities like reasoning, the speaker proposes the necessity of objective tests. Using examples such as whether humans can run a four-minute mile or whether objects can tell time, he explains that one must observe or design tests to confirm abilities rather than rely on assumptions or dismissive labels. A single successful example can prove an ability exists, while failures do not disprove it. This logic applies to LLMs as well: isolated failures or mistakes do not prove that LLMs cannot reason, just as a slow runner does not prove humans cannot run fast.
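The asymmetry the speaker relies on here can be made precise: an ability claim is existential, so a single success establishes it, while a failure refutes only the corresponding universal claim. A minimal sketch in first-order notation (my formalization, not notation from the video; `Succeeds` is a hypothetical predicate over attempts):

```latex
% "X can do task T" is an existential claim over attempts a:
\mathrm{Can}(X, T) \;:=\; \exists a.\ \mathrm{Succeeds}(X, T, a)

% One observed success a_0 suffices to prove it:
\mathrm{Succeeds}(X, T, a_0) \;\Rightarrow\; \mathrm{Can}(X, T)

% A failure a_1 refutes only the universal claim
% \forall a.\ \mathrm{Succeeds}(X, T, a),
% which is not what "can" asserts, so
\neg\,\mathrm{Succeeds}(X, T, a_1) \;\not\Rightarrow\; \neg\,\mathrm{Can}(X, T)
```

On this reading, Bannister's four-minute mile proves "humans can run a four-minute mile" once and for all, while any number of slower runners proves nothing against it; the same logic transfers to a single verified instance of an LLM reasoning correctly.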

The video also explores the emotional and psychological factors influencing people’s reactions to AI capabilities. Many responses to AI’s progress are driven by fear—fear of obsolescence, loss of uniqueness, or diminished human value. This fear leads to irrational arguments and the creation of new, often vague, criteria to deny AI’s abilities. The speaker highlights how people tend to judge AI outputs differently than human outputs, often attributing qualities like “soul” or “depth” exclusively to humans, despite evidence that many people prefer AI-generated art, music, or writing when judged blindly.

Furthermore, the speaker critiques the inconsistent standards applied to AI-generated content, such as art or music, where critics dismiss AI work as soulless or unworthy while accepting similar human outputs without question. This inconsistency stems from a protective instinct over human creativity and identity. The speaker urges viewers to recognize these biases and avoid self-deception, advocating for rational evaluation based on objective evidence rather than emotional reactions or unfounded beliefs.

In conclusion, the speaker defines reasoning as the power of the mind to think, understand, and form judgments through the process of logic. He suggests that the question of whether LLMs can reason should be approached with clear, testable criteria rather than dismissive assumptions. The video calls for a balanced, rational perspective on AI capabilities, warning against both overreaction and denial, and encourages openness to the evolving role of AI in society while acknowledging the unique qualities that human presence and creativity bring.