Leon demonstrates how switching from Claude Code to the open-source OpenCode, paired with Olama and local models like Kraina 3.6, provides a faster, more accurate, and efficient coding agent setup that overcomes the limitations of large system prompts and hallucinations. He walks through the full installation, configuration, and phased project implementation process, highlighting OpenCode’s ability to handle complex coding tasks locally with improved control, testing, and bug fixing capabilities.
In this video, Leon shares his experience trying to get Claude Code to work with a local model, which proved frustrating due to the model’s inability to follow instructions and hallucination of tool calls. He discovered that Claude Code consumes nearly 30,000 tokens on system prompts and tools before processing any user input, overwhelming local models. To overcome this, he switched to OpenCode, an open-source, free coding agent that works efficiently with local models and follows instructions accurately. Leon introduces OpenCode as a simpler and more effective alternative for local coding agents, especially when paired with Olama, a tool for running free models on local hardware.
Leon then walks through the installation and setup process for OpenCode and Olama. He explains how to install OpenCode via NPM and Olama from its website, followed by downloading appropriate models based on the user’s hardware capabilities. He recommends specific models like JEMMA 4 for laptops with limited VRAM and Kraina 3.6 for machines with higher VRAM, highlighting Kraina 3.6 as his preferred model for coding tasks. After downloading and testing the models with Olama, Leon demonstrates how to connect OpenCode to Olama, including editing configuration files to add new models if necessary.
Next, Leon showcases how to use OpenCode with Olama in a real coding project. He creates a new project folder and launches OpenCode with the selected model, testing simple commands to confirm functionality. He highlights OpenCode’s ability to switch between build and plan modes, using it to generate a detailed implementation plan for a local chat app that uses Olama models for inference and response streaming. Leon emphasizes the importance of breaking down large tasks into smaller, focused instructions to avoid overwhelming the model, and demonstrates how to split the implementation plan into actionable phases stored in separate folders.
Leon proceeds to implement the project in phases, instructing OpenCode to work on one phase at a time to maintain focus and control. He notes that the agent can run type checks and linting automatically, helping ensure code quality. As the project progresses, Leon tests the app and identifies issues such as the lack of response streaming and console errors. To address these, he equips the agent with a browser skill using Playwright, allowing it to test the app autonomously and fix bugs. This addition enables the agent to stream responses correctly and interact with the app more effectively.
In conclusion, Leon demonstrates that combining OpenCode with Olama and local models like Kraina 3.6 provides a powerful, free, and local coding agent solution that outperforms Claude Code in speed and accuracy. He encourages viewers to try this setup themselves and highlights the benefits of using lightweight, focused prompts and modular project phases for better results. The video ends with Leon inviting viewers to like and subscribe for more coding tutorials, motivating them to build their own projects using these tools.