Gemini 3 Pro and Opus 4.5 Initial Thoughts | I'm Spoiled!

merefield · 26 November 2025 16:30

The speaker shares detailed impressions of AI coding models Gemini 3 Pro and Opus 4.5 after extensive use, praising Opus 4.5 for its affordability, deep code analysis, and design capabilities, while finding Gemini 3 Pro reliable but less revolutionary. They contrast these with GPT 5.1 and Composer 1, highlighting Cursor as the platform of choice, and express enthusiasm for ongoing evaluations and future AI coding advancements.

merefield · 26 November 2025 16:53

The speaker shares extensive insights after spending 12 to 14 hours a day coding over nine days, working mainly with Vue3, React, Python, TypeScript, and PHP, plus data engineering work. They focus their feedback on several AI coding models, including Gemini 3 Pro, Opus 4.5, GPT 5.1, and Composer 1. Gemini 3 Pro, despite being highly hyped, is described as a good model that doesn’t revolutionize AI coding but performs well in practice. The pricing is reasonable, though it seems more token-hungry than the others. The speaker appreciates the context caching pricing, which helps reduce costs.

Opus 4.5 surprised the speaker positively, especially with its prompt caching feature that makes it more affordable to use. They find Opus 4.5 excellent at deep-diving into codebases, diagnosing root causes, and planning fixes. The speaker praises its design capabilities, highlighting smooth animations and thoughtful UI elements in the portfolio sites it generates. They contrast this with GPT 5.1, which they find underwhelming and somewhat erratic, especially in design tasks. GPT 5.1’s outputs are described as bizarre and not meeting expectations, leading the speaker to prefer other models for practical work.
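To make the caching point concrete, here is a rough back-of-the-envelope sketch of why cached input tokens lower the bill. Every number in it is a made-up placeholder, not real pricing for Opus 4.5, Gemini 3 Pro, or Cursor, and the function name `estimate_input_cost` is hypothetical. It only assumes that cached tokens are billed at some fraction of the normal input rate.

```python
def estimate_input_cost(input_tokens, cached_fraction, rate_per_mtok, cached_discount):
    """Estimate the input cost in dollars for one request.

    All arguments are hypothetical placeholders, not real provider prices:
      input_tokens     total input tokens sent with the request
      cached_fraction  share of those tokens served from the cache (0.0 to 1.0)
      rate_per_mtok    price per million uncached input tokens
      cached_discount  cached price as a fraction of the normal price
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    cached_rate = rate_per_mtok * cached_discount
    return (fresh * rate_per_mtok + cached * cached_rate) / 1_000_000


# Made-up scenario: a 200k-token context, 90% of it reused from cache,
# $10 per million input tokens, cached reads billed at 10% of that rate.
without_cache = estimate_input_cost(200_000, 0.0, 10.0, 0.1)
with_cache = estimate_input_cost(200_000, 0.9, 10.0, 0.1)
savings_ratio = with_cache / without_cache
```

Under these invented numbers the cached request costs a fraction of the uncached one, which is the kind of effect the speaker seems to be describing when an agent re-sends a large codebase context on every turn. Actual savings depend on the provider's real rates and on how much of the context is reused.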

The speaker also discusses their experience with Composer 1, which they find extremely fast and useful for quick answers, though it lacks some of the design sophistication seen in Gemini 3 Pro and Opus 4.5. They note that Composer 1 resembles some Chinese models in style but remains a favorite for speed and responsiveness. Despite some frustrations with Gemini 3 Pro occasionally quitting tool calls mid-task, the speaker enjoys using it for building complex projects and appreciates its ability to create detailed plans.

Cursor, the platform used to access these models, has improved significantly in transparency and speed, though the speaker dislikes its credit system. They mention switching between models depending on task complexity, often using Composer 1 for fast tasks and Opus 4.5 for more challenging problems. The speaker is currently on a $200/month plan with Cursor, primarily using Opus 4.5, which they consider their favorite model at the moment due to its balance of speed, cost, and capability.

Finally, the speaker is conducting evaluations of Gemini 3 Pro and Opus 4.5 for upcoming benchmarks and may include GPT 5.1 and other models. They acknowledge the challenges of balancing work with team transitions and emergency projects but appreciate the broad perspectives gained from interacting with hundreds of engineers about AI preferences. They invite feedback from viewers on their assessments and express eagerness to explore updates in other AI agents soon. Overall, the speaker feels Gemini 3 Pro and Opus 4.5 represent a significant step forward in AI coding assistance.