The video reviews GPT-5 Codex, highlighting its strong coding and refactoring capabilities but criticizing its slow response times, which hinder practical use even though the model often produces higher-quality code. The presenter recommends the Codex low variant for a better balance of speed and quality while awaiting improvements to the medium and high variants.
The video provides an in-depth review of GPT-5 Codex, an AI model fine-tuned specifically for coding tasks. The presenter acknowledges that while GPT-5 Codex performs well, the practical day-to-day difference compared to other models is not always clear. They express enthusiasm for fine-tuned models like Codex, highlighting its potential especially in code refactoring, an area where many AI models struggle. The presenter also shares insights from exploring the Codex CLI codebase, noting its unusual approach to handling shell commands through PowerShell or Bash, which at first seemed broken but turned out to be a deliberate design choice.
One significant drawback discussed is the speed of GPT-5 Codex, which is described as "so slow" that it becomes almost unusable in practice. The presenter demonstrates this with a simple coding test involving a pool game, where Codex took over seven minutes to complete a task that GPT-5 medium handled in under two and a half minutes. Even though Codex produced better-quality code in some cases, the slow response time hampers iterative development and testing, making it frustrating to use in real-world scenarios. The slowdown is attributed to high demand and model complexity, with OpenAI acknowledging the issue and working on improvements.
The video also compares the different Codex model variants (low, medium, high) and their performance on coding tasks. Interestingly, the low variant sometimes outperformed the medium and high variants on physics-simulation accuracy in the pool-game example. The presenter notes that while the Codex models generally produce better code, the speed trade-off is significant. They recommend using Codex low for now, as it offers a good balance between speed and quality, while waiting for improvements to the medium and high variants.
Refactoring capabilities are explored through a case study in which the presenter tasked GPT-5 Codex and Claude (another AI model) with refactoring a monolithic file. Codex produced a reasonable refactor with four new composables and two components, and the resulting code worked without issues. Claude's plan was more aggressive, yielding a larger reduction in code size and more components, but its output was broken and required significant fixes. This highlights Codex's strength in producing functional, reliable code, even if less ambitious, compared with models that overreach but fail to deliver working results.
In conclusion, the presenter appreciates the advancements in GPT-5 Codex and its potential for coding tasks, especially refactoring, but emphasizes that the current speed limitations make it difficult to fully adopt for daily use. They encourage viewers to consider the trade-offs between speed, quality, and cost when choosing models and recommend sticking with Codex low for now. The video ends with an invitation for viewers to share their experiences and thoughts on GPT-5 Codex, acknowledging that the presenter’s perspective is one of many and that ongoing improvements could change the landscape soon.