The video exposes OpenAI’s misleading claim that GPT-5 solved ten open Erdős problems, revealing instead that the AI merely retrieved existing solutions from academic papers, a fact highlighted by experts and competitors like Google DeepMind. It emphasizes the need for skepticism, expert validation, and honest communication in AI advancements to prevent hype and maintain trust, while acknowledging AI’s genuine usefulness in literature search and mathematical research.
The video discusses a recent controversy involving OpenAI’s claim that their GPT-5 model solved ten open Erdős problems, famous unsolved mathematical problems posed by the legendary mathematician Paul Erdős. Mark Sellke, an OpenAI researcher, announced on Twitter that he and a colleague had used GPT-5 to solve these problems and to make significant progress on others. The announcement was celebrated by OpenAI and used as a recruiting highlight, portraying GPT-5 as a revolutionary breakthrough in mathematics.
However, the excitement was quickly deflated when Demis Hassabis, the head of Google DeepMind—OpenAI’s main competitor—tweeted simply, “This is embarrassing.” The real issue came to light when Thomas Bloom, who maintains erdosproblems.com, a curated database of Erdős problems, clarified that the problems GPT-5 supposedly solved were not actually open. Instead, GPT-5 had found existing academic papers that had already solved them—papers Bloom had not yet discovered or incorporated into his database. Essentially, GPT-5 acted as a highly effective literature search tool rather than a problem-solving AI.
This revelation exposed a significant misunderstanding, or misrepresentation, on OpenAI’s part. Rather than producing new mathematical proofs or settling open problems, GPT-5 simply retrieved existing solutions from the academic literature. That is a valuable capability—it helps researchers find relevant papers quickly—but it is not the same as solving previously unsolved problems. The video criticizes OpenAI for rushing to publicize the claims on social media without verification or consultation with domain experts like Bloom, a misstep that led to public embarrassment and skepticism.
The broader issue is the hype cycle in the AI industry, in which companies exaggerate their models’ capabilities to outdo competitors and attract investors. The video stresses the importance of skepticism and critical thinking when evaluating AI claims, urging viewers to distinguish between impressive tools that aid research and genuine breakthroughs that push the boundaries of knowledge. It also underscores the role of peer review and expert validation in preventing premature or misleading announcements.
Finally, the video acknowledges that AI’s ability to perform literature searches is genuinely useful and can accelerate research by saving time. It also notes that AI has made real contributions to mathematics in other areas, such as suggesting new conjectures and automated theorem proving. The key takeaway is that honesty and clarity about what AI can and cannot do are essential to avoid misleading hype and maintain trust in AI advancements.