The current state of GPT-5

The video explains that GPT-5 is a powerful and capable model, especially in coding and complex reasoning, but inconsistent user experiences caused by poor interfaces and rollout issues have led to misunderstandings and disappointment. Despite these challenges, GPT-5 excels as a collaborative tool for sustained problem-solving, outperforming other models in precision and reliability, with promising potential for future improvements and broader accessibility.

The video provides an in-depth and transparent analysis of the current state of GPT-5, addressing widespread questions and misconceptions about the model. The creator explains that while GPT-5 is an incredible and powerful model that they use daily, the experience of using it has been inconsistent and sometimes frustrating. They clarify that the version they tested early on, often referred to as the “reasoning alpha” snapshot with a custom system prompt used in the Cursor app, is essentially the same underlying model available to the public via the API. However, differences in implementation, system prompts, and user interfaces across platforms have led to varying user experiences, contributing to confusion and disappointment.

A significant part of the problem stems from the layers through which users access GPT-5, such as chat.openai.com, Cursor, and other apps, rather than directly via the API. These layers often introduce bugs, poor UI feedback, and throttling issues that degrade the perceived quality of the model. For example, the auto-router system initially routed many users to a less capable version of GPT-5, causing widespread dissatisfaction. Additionally, Cursor’s implementation has UX flaws such as a lack of progress indicators during long reasoning tasks, making the model feel slow or frozen. Despite these issues, the core GPT-5 model remains highly capable, especially in coding and complex reasoning tasks.

The video also contrasts GPT-5 with other models like Grok 4 and Claude, highlighting GPT-5’s strengths in deliberate, step-by-step problem solving and its tendency to follow instructions precisely while hallucinating less. While Grok 4 is very smart, it tends to overthink and hallucinate tool calls, making it less practical for some tasks. Claude, on the other hand, is better for conversational and empathetic interactions but less reliable for complex problem-solving. GPT-5 strikes a balance by being a strong “coworker” model that can assist in coding and reasoning without veering off into irrelevant or fabricated answers.

The creator shares personal experiences using GPT-5 for challenging puzzles at DEF CON, demonstrating how the model helped break down complex problems and generate useful code, even if it did not always produce final answers. This practical use case underscores GPT-5’s ability to collaborate effectively with humans on difficult tasks, marking a significant step forward in AI capabilities. The video also references a detailed blog post analyzing the botched rollout of GPT-5, noting that the poor initial user experience and miscommunication have led to an underappreciation of the model’s true advancements.

In conclusion, the video emphasizes that GPT-5 is a refined and powerful model that excels in sustained, complex work rather than just raw intelligence or speed. The rollout issues and poor user interfaces have unfairly tarnished its reputation. The creator remains optimistic about GPT-5’s potential and continues to use it extensively, while also encouraging exploration of smaller, cost-effective variants like GPT-5 Mini and Nano. They call for better infrastructure and tooling around these models to unlock their full potential and warn against underestimating AI progress due to superficial early impressions.