Claude Code Let's Build: AI Video Editor

In this video, the creator builds a semi-autonomous AI video editor using Claude Code, integrating tools like ffmpeg, OpenAI’s Whisper, and the Gemini 3 Flash API to enable both manual and AI-driven video editing through a user-friendly Next.js interface. The project demonstrates how users can perform complex video edits with simple natural-language prompts, showcasing the potential of combining Claude Code, AI models, and modern web development.

The creator embarks on a weekend project to build a semi-autonomous AI video editor, focusing on demonstrating how Claude Code can be leveraged to generate and execute project ideas. The main components required for the project are ffmpeg for video processing, OpenAI’s Whisper model for local audio-to-text transcription, and the Gemini 3 Flash API for AI-driven features and function calling. The setup process involves creating a new project directory, gathering relevant API documentation, and preparing environment variables, such as the Gemini API key. The creator emphasizes the importance of having these tools and documentation ready before starting the actual coding.
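The video only mentions preparing environment variables, not how they are read; a minimal sketch of loading the Gemini key in the Next.js server code might look like this (the variable name `GEMINI_API_KEY` and the helper itself are assumptions):

```typescript
// Hypothetical helper: read the Gemini API key from the environment and fail
// fast with a clear message if it was never configured.
export function getGeminiApiKey(
  env: Record<string, string | undefined> = process.env
): string {
  const key = env.GEMINI_API_KEY;
  if (!key) {
    throw new Error(
      "GEMINI_API_KEY is not set - add it to .env.local before starting the dev server"
    );
  }
  return key;
}
```

In a Next.js project the value would typically live in `.env.local`, which the framework loads automatically for server-side code.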

The project is structured around building a simple, user-friendly UI using the Vercel tech stack, specifically Next.js and JavaScript. The initial planning phase is handled by Claude Code, which generates a comprehensive project plan after asking clarifying questions about model choices and desired features. The plan includes components like a file uploader, video player, timeline, clip editor, ffmpeg integration, and an AI transcription pipeline. The creator opts for the Whisper small model for efficiency and confirms the inclusion of features like drag-and-drop video upload and a basic timeline for editing.
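The components in the plan suggest a shared editor state tying the uploader, timeline, and transcription pipeline together; a sketch of what that state might look like (all names are assumptions, since the video describes the plan only at a high level):

```typescript
// Hypothetical shape of the editor's shared state, inferred from the plan:
// an uploaded video, an ordered list of timeline clips, and the transcript
// produced by the Whisper pipeline.
interface EditorState {
  videoUrl: string | null; // set by the drag-and-drop uploader
  clips: { id: string; start: number; end: number }[]; // timeline order, seconds
  transcript: { start: number; end: number; text: string }[]; // Whisper output
}

const initialState: EditorState = { videoUrl: null, clips: [], transcript: [] };
```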

Once the plan is accepted, Claude Code begins generating the application. After installing dependencies and running the development server, the creator tests the first iteration of the video editor. The UI allows for intuitive actions such as uploading a video, seeking through the timeline, cutting and deleting clips, and reordering segments. The creator identifies and resolves minor issues, such as merging clips after deletion, ensuring that the timeline updates correctly and that the exported video reflects the edits made in the UI.
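The cut-delete-merge behavior described above can be sketched as a few pure functions: deleting a clip drops it from the ordered list, playback offsets are recomputed from what remains (so the timeline "merges" without gaps), and export maps each clip to an ffmpeg trim that is then concatenated. The function names and data shapes are assumptions, not the generated app's actual code:

```typescript
interface Clip {
  id: string;
  sourceStart: number; // seconds into the original file
  sourceEnd: number;
}

// Deleting a clip simply removes it; because timeline positions are derived
// from order, the remaining clips close up into a gap-free timeline.
function deleteClip(clips: Clip[], id: string): Clip[] {
  return clips.filter((c) => c.id !== id);
}

// Timeline offset of each clip = sum of the durations of the clips before it.
function timelineOffsets(clips: Clip[]): number[] {
  const offsets: number[] = [];
  let t = 0;
  for (const c of clips) {
    offsets.push(t);
    t += c.sourceEnd - c.sourceStart;
  }
  return offsets;
}

// Build an ffmpeg filter_complex that trims each clip from the source video
// and concatenates them in timeline order (video-only for brevity).
function buildFilterComplex(clips: Clip[]): string {
  const trims = clips
    .map(
      (c, i) =>
        `[0:v]trim=${c.sourceStart}:${c.sourceEnd},setpts=PTS-STARTPTS[v${i}]`
    )
    .join(";");
  const labels = clips.map((_, i) => `[v${i}]`).join("");
  return `${trims};${labels}concat=n=${clips.length}:v=1:a=0[out]`;
}
```

The exported video reflects the UI edits precisely because the same clip list drives both the timeline rendering and the ffmpeg export command.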

With the basic editing functionality working, the focus shifts to AI-powered editing. The user uploads a video and initiates transcription using Whisper. Once the transcript is ready, the creator demonstrates how to use natural language prompts to instruct the AI editor—powered by Gemini—to keep only specific parts of the video, such as segments where “Claude goes off the rails” or where “Claude made money.” The AI successfully identifies and extracts relevant clips based on the prompts, and the exported videos are verified to match the requested content.
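One plausible way to wire this up, given the video's mention of Gemini function calling, is to send the timestamped transcript plus the user's prompt and declare a single `keep_segments` function for the model to call; the returned time ranges then become timeline clips. The function name, schema, and helpers below are all assumptions:

```typescript
interface TranscriptSegment {
  start: number; // seconds
  end: number;
  text: string;
}

// Build a Gemini-style request body: the transcript with timestamps, the
// user's instruction, and one function declaration the model can invoke.
function buildEditRequest(prompt: string, transcript: TranscriptSegment[]) {
  const lines = transcript
    .map((s) => `[${s.start.toFixed(1)}-${s.end.toFixed(1)}] ${s.text}`)
    .join("\n");
  return {
    contents: [
      {
        role: "user",
        parts: [{ text: `Transcript:\n${lines}\n\nInstruction: ${prompt}` }],
      },
    ],
    tools: [
      {
        functionDeclarations: [
          {
            name: "keep_segments",
            description: "Keep only these time ranges of the video",
            parameters: {
              type: "object",
              properties: {
                ranges: {
                  type: "array",
                  items: {
                    type: "object",
                    properties: {
                      start: { type: "number" },
                      end: { type: "number" },
                    },
                  },
                },
              },
            },
          },
        ],
      },
    ],
  };
}

// Convert the ranges from the model's function call into timeline clips.
function rangesToClips(ranges: { start: number; end: number }[]) {
  return ranges.map((r, i) => ({
    id: `ai-${i}`,
    sourceStart: r.start,
    sourceEnd: r.end,
  }));
}
```

With this shape, a prompt like "keep only the parts where Claude goes off the rails" needs no special parsing: the model reads the transcript, picks the matching ranges, and the app cuts the video exactly as it would for a manual edit.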

The project concludes with the creator expressing satisfaction with the results, highlighting the success of both the manual and AI-driven editing features. The video editor lets users perform complex edits from simple natural-language instructions, showcasing the power and flexibility of combining Claude Code, AI models, and modern web development frameworks. The creator hints at adding more advanced features, such as transitions or additional AI models, in future videos, and encourages viewers to explore building similar projects with Claude Code.