The video compares three cloud-based AI coding agents—Claude Code, Cursor, and OpenAI’s Codex—by testing their workflows, user interfaces, and performance on a real-world coding task, highlighting differences in speed, integration, and pull request handling. The presenter ultimately favors Cursor for its faster execution, seamless local integration, and polished output, while noting that each tool offers unique advantages depending on the user’s needs.
The video explores the use of cloud-based AI coding agents for running complex coding tasks remotely, focusing on three major players: Claude Code, Cursor background agents, and OpenAI's Codex. While all three tools share the same basic mechanics of spinning up a cloud environment, connecting to a GitHub repository, creating a branch, and submitting a pull request, the key differences lie in their workflows and user interfaces. The presenter emphasizes that the workflow experience (how tasks are initiated, monitored, and reviewed, especially on mobile devices) is crucial in deciding whether to adopt these cloud agents or stick with a local setup.
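For context, the branch-and-pull-request mechanics all three agents wrap in their own UI are the same ones you could drive yourself against the GitHub API. The sketch below is illustrative only, assuming the @octokit/rest client, a token in GITHUB_TOKEN, and hypothetical repository names; it is not how any of the three products is implemented internally.

```typescript
// Illustrative sketch of the generic "create a branch, do the work, open a PR" flow
// that Claude Code, Cursor background agents, and Codex each automate.
// Assumes @octokit/rest and a GITHUB_TOKEN with repo scope; names are hypothetical.
import { Octokit } from "@octokit/rest";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
const owner = "example-user";          // hypothetical repository owner
const repo = "builder-methods-site";   // hypothetical repository name

async function openAgentPullRequest(branch: string, title: string, body: string) {
  // 1. Find the tip of the default branch to fork from.
  const base = await octokit.rest.git.getRef({ owner, repo, ref: "heads/main" });

  // 2. Create the working branch the agent will commit to.
  await octokit.rest.git.createRef({
    owner,
    repo,
    ref: `refs/heads/${branch}`,
    sha: base.data.object.sha,
  });

  // ...the agent's commits land on `branch` here...

  // 3. Open the pull request for human review (Cursor does this step
  //    automatically; Claude Code and Codex leave it as a manual action).
  const pr = await octokit.rest.pulls.create({
    owner,
    repo,
    title,
    head: branch,
    base: "main",
    body,
  });
  return pr.data.html_url;
}
```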
To test these tools, the presenter uses a real-world example: updating the workshops section of his Builder Methods website. The goal is to improve the layout by adding thumbnails, filtering options for upcoming and past workshops, and reusing an existing grid and list view toggle interface. He drafts a detailed prompt outlining the requirements and runs the task on all three platforms to compare their performance and user experience. This practical approach highlights how each tool handles a moderately complex feature update in a real project.
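The presenter's actual stack isn't shown in detail, but the core of the request, splitting workshops into upcoming and past buckets and feeding them to a grid or list view, reduces to straightforward data handling. A minimal sketch in plain TypeScript, with a hypothetical data model not taken from the Builder Methods codebase:

```typescript
// Hypothetical data model for the workshops page described in the prompt;
// field names are assumptions, not taken from the actual site.
interface Workshop {
  title: string;
  date: Date;
  thumbnailUrl: string;
  tags: string[];
}

type ViewMode = "grid" | "list";

// Split workshops into the two filter states the prompt asks for.
function partitionWorkshops(workshops: Workshop[], now: Date = new Date()) {
  const upcoming = workshops
    .filter((w) => w.date >= now)
    .sort((a, b) => a.date.getTime() - b.date.getTime()); // soonest first
  const past = workshops
    .filter((w) => w.date < now)
    .sort((a, b) => b.date.getTime() - a.date.getTime()); // most recent first
  return { upcoming, past };
}

// Example usage: return whichever bucket the active filter selects,
// alongside the layout reported by the existing grid/list toggle.
function visibleWorkshops(
  workshops: Workshop[],
  filter: "upcoming" | "past",
  view: ViewMode
): { items: Workshop[]; view: ViewMode } {
  const { upcoming, past } = partitionWorkshops(workshops);
  return { items: filter === "upcoming" ? upcoming : past, view };
}
```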
Starting with Claude Code on the web, the experience closely mirrors the Claude Code CLI, providing a consistent and familiar interface. The presenter appreciates the live updates visible on both desktop and mobile via the Claude mobile app, which syncs seamlessly. Claude Code allows model selection, and the presenter opts for the Sonnet 4.5 model for this larger task. However, Claude Code does not automatically create a pull request, which the presenter finds a bit inconvenient. A notable feature is the "open in CLI" button, which copies a command that pulls the entire session history and checks out the branch locally, making it easy to review and test the changes.
Cursor's background agents impress with their speed and integration. Running on the Composer 1 model, Cursor completes the task significantly faster than the others. Cursor's web interface is fully responsive and usable in mobile browsers, though it lacks a dedicated mobile app. A standout feature is the seamless integration between the web interface and the Cursor IDE, allowing users to open the cloud agent's work directly in their local environment. Cursor also automatically creates a pull request, streamlining the review process. The presenter finds the UI components generated by Cursor, such as the toggle switch and tag filtering, more polished and functional than Claude Code's output.
Codex, linked through the ChatGPT account, requires connecting both a GitHub repository and an environment, which initially caused some confusion. Unlike Cursor, Codex does not automatically create a pull request, requiring a manual action to do so. The presenter assumes Codex uses a GPT-5 based model but notes the absence of a model selector. Like Cursor, Codex supports running multiple versions of the agent on the same task. After completion, the presenter encounters a minor issue in the generated code, which he quickly fixes using Cursor's Composer model. Overall, while all three tools successfully completed the task, the presenter favors Cursor for its speed, integration, and quality of output, though he acknowledges that each tool has its strengths depending on the use case.