Everything you need to know about ChatGPT 5.3 Codex

OpenAI’s GPT 5.3 Codex introduces major improvements in coding speed, accuracy, and autonomy, including the ability to adjust instructions mid-task and handle vague prompts, while also excelling in general computer tasks and agent management. The video compares it to Anthropic’s Opus 4.6, highlighting the rapid advancements and competition in AI coding models, and invites viewers to share their preferences.

OpenAI and Anthropic are in fierce competition, releasing major updates to their AI models within minutes of each other: Anthropic launched Opus 4.6, and OpenAI released GPT 5.3 Codex. Both companies are investing heavily in advanced coding capabilities, particularly “agentic coding,” which involves long-horizon tasks, agents, sub-agents, and agent teams. The industry is moving rapidly toward more autonomous and capable AI agents, with both labs pushing the boundaries of what these models can achieve.

One of the main criticisms of previous Codex versions was their slow performance, even though they were considered among the best coding models available. GPT 5.3 Codex addresses this with a 25% speed increase, achieved not through faster inference but by reaching the same results with significantly fewer tokens. For example, on the SWE-Bench Pro benchmark, GPT 5.3 Codex used only 43,000 output tokens compared to 91,000 for version 5.2, resulting in much faster completion times. Accuracy on Terminal-Bench also improved by more than 10 points.
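To make the token comparison concrete, the short sketch below works through the arithmetic on the two benchmark token counts quoted above. The function names are illustrative, and the speedup figure rests on an assumption not stated in the video: that generation time scales linearly with output tokens at a fixed tokens-per-second rate. Real completion times also depend on tool calls, test runs, and other overhead, so this is a rough estimate rather than a measured result.

```python
def token_reduction(old_tokens: int, new_tokens: int) -> float:
    """Fraction of output tokens saved relative to the old count."""
    if old_tokens <= 0:
        raise ValueError("old_tokens must be positive")
    return (old_tokens - new_tokens) / old_tokens


def generation_speedup(old_tokens: int, new_tokens: int) -> float:
    """Speedup in pure generation time.

    ASSUMPTION: throughput (tokens per second) is identical for both
    models, so time is proportional to the number of output tokens.
    """
    if new_tokens <= 0:
        raise ValueError("new_tokens must be positive")
    return old_tokens / new_tokens


# Output-token counts quoted in the summary (version 5.2 vs 5.3).
OLD_TOKENS = 91_000
NEW_TOKENS = 43_000

reduction = token_reduction(OLD_TOKENS, NEW_TOKENS)
speedup = generation_speedup(OLD_TOKENS, NEW_TOKENS)

print(f"Output tokens saved: {reduction:.1%}")
print(f"Generation speedup at equal throughput: {speedup:.2f}x")
```

On this one benchmark the token count falls by roughly half, which is a larger saving than the 25% headline figure; the two numbers measure different things (tokens on a single benchmark versus overall speed), so they need not match.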

A standout feature of GPT 5.3 Codex is that it can be steered mid-task: users can adjust instructions while the model is working, which the video presents as something not seen in other models. Remarkably, GPT 5.3 Codex was instrumental in its own development, with earlier versions helping to debug, manage deployment, and evaluate test results. This loop hints at a future in which AI models play a significant role in their own advancement.

The model also excels at handling under-specified prompts, choosing sensible defaults when users provide vague instructions. In practical tests, GPT 5.3 Codex autonomously built web games and landing pages with minimal human intervention, demonstrating better intent understanding and more polished visual output than previous versions. The video also highlights Greptile, a code review tool that helps manage the increased volume of code generated by these advanced models, checking quality and providing confidence scores for code submissions.

Beyond coding, GPT 5.3 Codex is making strides in general computer use and knowledge work, rivaling Anthropic’s Claude in tasks like document creation, spreadsheet analysis, and file manipulation. On the OSWorld benchmark, which measures a model’s ability to control a computer interface, GPT 5.3 Codex nearly doubled the score of its predecessor. The new Codex app, now available as a downloadable application, makes it easier to manage multiple agents and tasks. The video concludes by inviting viewers to share which model excites them more, Opus 4.6 or Codex 5.3, and encourages engagement with the channel.