Replacing 12K LoC with a 200 LoC Skill — David Gomes, Cursor

David Gomes from Cursor explains how they replaced a 12,000-line implementation of git worktree management with a streamlined 200-line markdown skill built on AI agents and sub-agents, simplifying the system and making it more flexible. The approach introduces challenges around agent isolation and discoverability, which Cursor is addressing through evaluation tests, reinforcement learning, and plans for a more integrated worktree experience in the upcoming Cursor 3.0 interface.

In this talk, David Gomes from Cursor discusses how they replaced a feature backed by 12,000 lines of code with a much simpler 200-line markdown skill. The feature revolves around git worktrees, which allow parallel work on different branches or tasks without interference. Cursor's original worktree implementation was complex, with extensive code for managing checkouts, setup scripts, isolation, judging, and cleanup. By leveraging existing Cursor primitives, specifically skills and sub-agents, they recreated the functionality as a lightweight markdown skill, significantly reducing maintenance overhead.
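The underlying git primitive is easy to see in isolation. A minimal sketch (repo and branch names invented for illustration): a worktree is a second, independent checkout of the same repository, so an agent can edit files there without colliding with the main checkout.

```shell
# Create a throwaway repo to demonstrate against.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"

# Give a hypothetical "task-a" agent its own directory and branch.
# Edits inside $repo-task-a cannot interfere with the main checkout.
git -C "$repo" worktree add -q -b task-a "$repo-task-a"
git -C "$repo" worktree list
```

Each parallel task gets its own `worktree add`; `git worktree remove` handles cleanup, which is part of what the original 12K-line implementation orchestrated in code.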

David explains how the new implementation works through simple slash commands such as /worktree and /best-of-n, which spin up isolated agents, each working in its own worktree. These commands instruct the models to stay within their assigned worktrees, run setup scripts, and even compare outputs from different models so users can pick the best solution. The approach not only simplifies the codebase but also adds flexibility: users can switch worktrees mid-chat, and multi-repo setups are supported, neither of which the previous implementation could handle.
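Since the talk describes the skill as a markdown file of instructions, it plausibly reads something like the following hypothetical sketch (file structure and wording invented here, not Cursor's actual skill text):

```markdown
<!-- Hypothetical sketch of a worktree skill; not Cursor's actual skill. -->
# Worktree skill

When the user invokes /worktree:

1. Create a new git worktree for the requested task:
   `git worktree add -b <task-branch> <path-to-new-worktree>`
2. Spawn a sub-agent whose working directory is the new worktree.
3. Instruct the sub-agent to stay inside its assigned worktree and never
   edit files in the main checkout or in sibling worktrees.
4. Run the repository's setup script inside the worktree before coding.
5. When the task is done, report back to the main chat so the user can
   review, merge, or discard the worktree.
```

Replacing imperative orchestration code with instructions like these is what shrinks the feature from 12K lines to roughly 200, at the cost of relying on the model to follow them.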

Despite these advantages, the markdown-based approach has drawbacks. The biggest is ensuring that agents consistently stay within their assigned worktrees, since the system now relies on prompting rather than enforced isolation; less capable models occasionally drift outside them. The new method also feels slower to users, because worktree creation is now visible in the chat, and it is less discoverable, since users must know and type the slash commands themselves.

To address these issues, Cursor is improving agent reliability through evaluation tests (evals) and reinforcement learning (RL). They are building evals that check whether models operate correctly within their worktrees, and using the results to refine prompts and system reminders. Cursor also plans a more native, complete worktree experience in its upcoming Cursor 3.0 agent interface, which is designed to optimize the coding workflow around agents and chat.
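One simple eval along these lines (helper name and paths hypothetical, not Cursor's actual harness) is to check whether every path an agent touched falls under its assigned worktree, and flag any that escaped:

```shell
# Hypothetical eval helper: print any touched path that lies outside
# the agent's assigned worktree directory.
outside_worktree() {
  worktree="$1"; shift
  for f in "$@"; do
    case "$f" in
      "$worktree"/*) ;;        # path is inside the worktree: fine
      *) echo "$f" ;;          # path escaped the worktree: report it
    esac
  done
}

# Example: one compliant edit, one violation.
outside_worktree /tmp/wt-task-a /tmp/wt-task-a/src/main.py /etc/passwd
# → prints "/etc/passwd"
```

Run over an agent's edit log, a check like this yields a pass/fail signal that can feed prompt refinements or RL reward, in the spirit of the evals described above.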

Looking ahead, Cursor is also exploring parallelization primitives beyond git worktrees, which have limitations such as slow creation, high disk usage, and a hard dependency on git repositories. These new primitives aim to offer faster, more versatile local parallelization for users on other version control systems or environments. Overall, the talk highlights a shift toward replacing complex features with AI-driven markdown skills while continuing to improve user experience and system robustness.