The video compares Gemini 3.1 Pro and Claude Opus 4.6 on real coding tasks, finding that Claude is significantly faster, more reliable, and easier to use, especially for building features and handling iterative improvements. The creator concludes that Claude Opus 4.6 is the clear winner due to its superior performance and cost-effectiveness, while Gemini 3.1 is described as slow, unreliable, and disappointing.
To test the two models on real coding tasks, the creator sets up a side-by-side experiment using OpenRouter and Roo Code, since Gemini 3.1 isn't available via the Gemini CLI. The test involves implementing a new feature for the creator's SaaS, a system that automates YouTube workflows. The feature aims to make competitor analysis more transparent by displaying competitor thumbnails and titles before generating suggestions.
The creator runs the same prompt through both Claude Opus 4.6 and Gemini 3.1, instructing each to build the feature and open the result on different ports. Additionally, a simpler coding challenge—building a one-page flight simulator—is given to both models in their respective browser interfaces. The creator notes the high cost of API usage, especially with Gemini, and expresses a preference for Claude’s more cost-effective subscription model.
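The side-by-side setup described above amounts to sending one identical prompt to two models through OpenRouter's OpenAI-compatible chat completions endpoint, varying only the model field and the target port. A minimal sketch follows; the model slugs and port numbers are illustrative assumptions, not details confirmed in the video:

```python
# Sketch of the side-by-side test: the same prompt sent to two models via
# OpenRouter, differing only in the "model" field and the assigned port.
# Model slugs and ports below are assumptions for illustration.
import json

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

PROMPT = (
    "Implement the competitor-analysis feature: display competitor "
    "thumbnails and titles before generating suggestions."
)

def build_request(model: str, port: int) -> dict:
    """Build one OpenRouter chat-completion payload for a given model."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": f"{PROMPT} Serve the result on port {port}."}
        ],
    }

# Same prompt, two models, different ports.
requests_by_model = {
    "claude": build_request("anthropic/claude-opus-4.6", 3000),
    "gemini": build_request("google/gemini-3.1-pro", 3001),
}

for name, payload in requests_by_model.items():
    print(name, json.dumps(payload)[:80])
```

Each payload would then be POSTed to `OPENROUTER_URL` with an API key, which is where the per-request API costs the creator complains about accrue.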
During the tests, Claude Opus 4.6 consistently outperforms Gemini 3.1 in both speed and reliability. Claude completes the flight simulator task quickly and allows for iterative improvements, while Gemini is much slower and struggles with execution. The creator highlights the superior graphics and responsiveness of Claude’s output, and finds it easier to interact with and debug.
When testing the SaaS feature, Claude manages to pull in competitor thumbnails and titles, allowing the user to select inspirations and generate packaging concepts, though some minor bugs are encountered. Gemini, on the other hand, repeatedly fails due to configuration errors and issues with environment variables, despite multiple attempts to fix the problems. The creator grows increasingly frustrated with Gemini’s lack of progress and reliability.
In conclusion, the creator finds Claude Opus 4.6 to be the clear winner, citing its speed, ease of use, and cost-effectiveness. Gemini 3.1 is described as underwhelming, unreliable, and potentially overwhelmed by demand. The video ends with the creator expressing disappointment in Gemini’s performance and reaffirming a preference for Claude, while inviting viewers to share their own experiences with the latest Gemini release.