Actually coding with Claude 3.7 is actually insane, actually

merefield · 26 February 2025 18:08

In the video, the presenter reviews Claude 3.7 Sonnet, highlighting its impressive benchmark scores and improved coding capabilities compared to its predecessor, Claude 3.5 Sonnet. Through a hands-on coding session, they demonstrate Claude’s effectiveness in creating a Go backend API, handling Docker containers, and developing a functional front-end, ultimately concluding that Claude 3.7 is a powerful tool for developers despite some minor limitations.

merefield · 26 February 2025 18:29

In the video, the presenter discusses the recent release of Claude 3.7 Sonnet, an upgrade from Claude 3.5 Sonnet, which has been recognized as a leading AI coding assistant. The presenter highlights Claude 3.7’s benchmark scores, noting its accuracy of 62.3% on SWE-bench Verified, which rises to 70.3% with custom scaffolding. In comparison, other AI models like OpenAI’s 3.5 and o3-mini have lower accuracy rates. The presenter is curious how Claude 3.7 performs in practical coding scenarios, especially given the mixed reviews surrounding its capabilities.

The video transitions into a hands-on coding session where the presenter uses Claude 3.7 Sonnet to create an API route in a Go backend for retrieving data from a Neon PostgreSQL database. The initial attempts to test the API endpoint result in a 404 error, which the presenter resolves by explicitly providing the database URL. When the endpoint then returns null, the presenter realizes that the API is functioning correctly but the database contains no data. They proceed to insert test data and restructure the project for better organization, demonstrating Claude’s ability to assist in coding tasks.
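
For anyone wondering why an empty table would come back as null rather than an empty list: the video does not show the handler code, so this is only a minimal sketch of one likely cause, assuming the handler collects rows into a Go slice and encodes it with encoding/json. The Challenge type and the encodeRows helper are hypothetical names made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Challenge is a hypothetical row type; the real schema is not shown in the video.
type Challenge struct {
	ID    int    `json:"id"`
	Title string `json:"title"`
}

// encodeRows marshals query results the way a simple JSON handler might.
// It returns an empty string if encoding fails.
func encodeRows(rows []Challenge) string {
	b, err := json.Marshal(rows)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// A slice that is declared but never appended to stays nil,
	// which is what a scan loop over zero rows leaves behind.
	var none []Challenge

	// An explicitly initialized slice is empty but not nil.
	empty := []Challenge{}

	fmt.Println(encodeRows(none))  // nil slice encodes as JSON null
	fmt.Println(encodeRows(empty)) // empty slice encodes as JSON []
}
```

If that is what happened here, initializing the slice before the scan loop would make the endpoint return [] for an empty table, which is usually friendlier for a front-end to consume.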

As the coding session progresses, the presenter encounters some challenges, such as errors related to database connections and the AI’s tendency to hallucinate the existence of certain files. Despite these issues, Claude 3.7 Sonnet shows improvement in project structuring and error handling compared to its predecessor. The presenter tests the AI’s capabilities further by prompting it to create a Docker container for executing user-submitted code, which Claude handles effectively, generating multiple files and directories with minimal errors.
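
To give a rough idea of what running user-submitted code in a container can involve, here is a small sketch that only builds the argument list for docker run. It is not the code Claude generated in the video. The sandboxArgs helper, the resource limits, the /code mount point, and the golang:1.22 image in the example are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// sandboxArgs builds arguments for `docker run` that execute a submission
// with some common restrictions. The specific limits are illustrative
// assumptions, not values taken from the video.
func sandboxArgs(image, hostDir string, cmd ...string) []string {
	args := []string{
		"run",
		"--rm",              // remove the container when it exits
		"--network", "none", // no network access for untrusted code
		"--memory", "128m",  // cap memory usage
		"--cpus", "0.5",     // cap CPU usage
		"--pids-limit", "64", // limit process count (guards against fork bombs)
		"-v", hostDir + ":/code:ro", // mount the submission read-only
		image,
	}
	return append(args, cmd...)
}

func main() {
	// Hypothetical submission directory and command.
	args := sandboxArgs("golang:1.22", "/tmp/submission", "go", "run", "/code/main.go")
	fmt.Println("docker " + strings.Join(args, " "))
}
```

In a real service the slice would be passed to exec.CommandContext with a timeout, so a submission that never terminates still gets killed. Flags like these reduce risk but are not a complete sandbox on their own.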

The presenter also explores the front-end development aspect, creating a user interface for interactive coding challenges. Claude 3.7 Sonnet generates a functional front-end that integrates with the backend and database. Although there are minor issues with the code, such as unused imports and some styling problems, the overall output is satisfactory and demonstrates the AI’s ability to produce working code quickly. The presenter appreciates the breadth of Claude’s capabilities, noting that it can handle multiple files and tasks simultaneously.

In conclusion, the presenter reflects on their experience with Claude 3.7 Sonnet, noting that it feels significantly improved over 3.5, particularly in terms of agentic tool use and overall coding efficiency. While there are still some limitations and occasional errors, the advancements in Claude 3.7 make it a powerful tool for developers. The presenter invites viewers to share their thoughts and experiences with the new version, ultimately asserting that Claude remains the king of AI coding assistants.