OpenAI’s new mystery model, “o3 Alpha,” showcased extraordinary zero-shot coding skill by securing second place at the AtCoder World Tour Finals 2025, while a separate experimental reasoning model achieved a gold-medal score at the International Mathematical Olympiad under strict human-like conditions. These results highlight OpenAI’s use of large-scale computation and of language models as judges to push AI’s coding and reasoning capabilities, exemplifying the “bitter lesson” that scaling beats hand-coded knowledge, with GPT-5 teased for future release.
The video discusses the emergence of OpenAI’s new mystery model, dubbed “o3 Alpha,” which has demonstrated remarkable coding ability. The model recently secured second place in one of the world’s most challenging coding competitions, the AtCoder World Tour Finals 2025, held in Tokyo. Although the human programmer Psyho, a former OpenAI employee, won the contest after a grueling 10-hour marathon, o3 Alpha’s performance was strong enough to claim the runner-up spot. Examples of the model’s coding prowess include polished zero-shot creations such as Space Invaders, a basketball shooting game set in space, a 3D Pokedex, and even a version of Doom.
In addition to o3 Alpha’s coding achievements, OpenAI has developed another experimental reasoning model that recently achieved a gold-medal score at the International Mathematical Olympiad (IMO), a prestigious and notoriously difficult math competition. The model was evaluated under strict conditions mimicking those faced by human contestants: two 4.5-hour exam sessions with no external tools or internet access. The significance of this achievement lies in the model’s ability to solve complex, long-horizon math problems that demand sustained creative thinking and intricate, multi-page natural-language proofs, marking a new frontier in AI reasoning.
The video highlights the challenges of verifying IMO solutions, which are not easily checked programmatically due to their complexity and length. OpenAI’s approach to overcoming this involves moving beyond traditional reinforcement learning paradigms that rely on clear-cut, verifiable rewards. Instead, they explore using large language models themselves as judges to evaluate the quality of proofs generated by other models. This innovative method allows scaling up training even when direct verification is difficult, pushing the boundaries of AI’s reasoning and problem-solving abilities.
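The LLM-as-judge idea described above can be sketched in a few lines. This is a minimal illustration, not OpenAI’s actual pipeline: the `judge` function below is a toy stub standing in for a call to a grading model, and the rubric text, the 0–7 scale, and all names are assumptions introduced for the example.

```python
from dataclasses import dataclass

# Hypothetical grading rubric sent to the judge model (an assumption for
# illustration; the real prompt and scale are not public).
RUBRIC = (
    "Score the following proof from 0 to 7, as an IMO grader would.\n"
    "Reply with a single integer.\n\nProof:\n{proof}"
)

@dataclass
class JudgedSample:
    proof: str
    score: float  # normalized reward in [0, 1]

def judge(prompt: str) -> str:
    # Toy stub standing in for a call to a grading LLM: it crudely rewards
    # proofs with more explicit deduction steps, capped at 7 points.
    proof = prompt.split("Proof:\n", 1)[1]
    steps = proof.count("Therefore") + proof.count("Hence")
    return str(min(7, steps + 1))

def score_proof(proof: str) -> JudgedSample:
    """Turn a free-form proof into a scalar reward via the judge model."""
    raw = judge(RUBRIC.format(proof=proof))
    points = max(0, min(7, int(raw)))  # clamp to the assumed 0-7 scale
    return JudgedSample(proof=proof, score=points / 7)
```

The key design point is the last line: by collapsing a judge’s verdict on a multi-page proof into a single scalar, reinforcement learning can proceed even when no program can verify the answer directly.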
A key theme in the video is the “bitter lesson,” an AI research concept emphasizing that the most significant advances come from scaling up computation and removing humans from the loop rather than relying on hand-coded knowledge or rules. Examples include the evolution of chess AI from rule-based systems to self-play learning and Tesla’s shift from hardcoded driving rules to end-to-end neural networks. OpenAI’s recent breakthroughs in coding and math models exemplify this lesson, as they leverage general-purpose reinforcement learning and massive compute scaling to achieve superhuman performance.
Finally, the video teases the upcoming release of GPT-5, while clarifying that the experimental IMO gold-medal model is a separate system that will not be publicly available for several months. The rapid progress demonstrated by these models, excelling in some of the toughest coding and math competitions in the world, suggests an accelerating pace of AI development. The presenter expresses excitement about these advancements and encourages viewers to like and subscribe for more updates on this evolving frontier.