Don't Build Slop (4 Levels of AI Agent Maturity) - Ara Khan, Cline

Ara Khan outlines a four-level framework for building AI agents, progressing from quick prototyping with existing frameworks to creating custom state-machine agents, managing workflows with Kanban boards, and ultimately deploying scalable cloud-based agents. He emphasizes thoughtful design, simplicity, and effective management over rushing to deploy numerous agents, advocating for gradual refinement to achieve robust, scalable AI systems.

In this talk, Ara Khan discusses the challenges and best practices in building AI agents, emphasizing the importance of thoughtful design over rushing to deploy numerous agents. He highlights a common confusion among developers about how many agents to use and how to manage them effectively. To address this, he proposes a framework of four levels of AI agent maturity, ranging from using existing frameworks for quick prototyping to building custom state-machine-based agents, adopting Kanban boards for workflow management, and finally deploying agents at scale in the cloud.

The first level involves using existing AI agent frameworks like LangChain or LangGraph to quickly test if an AI agent can solve a problem. This approach is suitable for early experimentation or rudimentary tasks but often lacks the customization and modularity needed for production-grade systems. Ara cautions that while frameworks are useful for rapid prototyping, they may not meet the demands of serious, scalable applications.

At the second level, Ara advocates building agents from scratch using a state machine model. He explains that every agent can be conceptualized as a recursive loop with defined states and transitions, which helps maintain clarity and control over the agent’s behavior. He shares five key rules for building agents: treat agents as state machines, keep the system simple to avoid overwhelming the model, enable easy testing and iteration through CLI tools, avoid sloppy coding practices, and be wary of vendor lock-in to frontier labs' APIs, which can limit flexibility and performance.
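The state-machine idea can be made concrete with a small sketch. This is not Cline's implementation or code from the talk; the state names, the `Agent` class, the `final:` convention, and the scripted stand-in for a model are all assumptions chosen for illustration. It shows an agent as a loop over explicit states (think, act, done) with one transition per step.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable


class State(Enum):
    THINK = auto()  # ask the model what to do next
    ACT = auto()    # run the tool the model asked for
    DONE = auto()   # the model produced a final answer


@dataclass
class Agent:
    # Hypothetical interfaces: the model maps history to a text action,
    # and each tool maps an argument string to a result string.
    model: Callable[[list[str]], str]
    tools: dict[str, Callable[[str], str]]
    history: list[str] = field(default_factory=list)
    state: State = State.THINK
    pending: str = ""

    def step(self) -> None:
        """Perform exactly one state transition."""
        if self.state is State.THINK:
            self.pending = self.model(self.history)
            is_final = self.pending.startswith("final:")
            self.state = State.DONE if is_final else State.ACT
        elif self.state is State.ACT:
            name, _, arg = self.pending.partition(" ")
            tool = self.tools.get(name)
            result = tool(arg) if tool else f"unknown tool {name}"
            self.history.append(f"{self.pending} -> {result}")
            self.state = State.THINK

    def run(self, task: str, max_steps: int = 20) -> str:
        """Loop until DONE, with a step budget so the loop cannot run forever."""
        self.history = [task]
        self.state = State.THINK
        for _ in range(max_steps):
            self.step()
            if self.state is State.DONE:
                return self.pending.removeprefix("final:").strip()
        raise RuntimeError("step budget exhausted")


def scripted_model(history: list[str]) -> str:
    """Stand-in for a real model call: one tool call, then a final answer."""
    if len(history) == 1:
        return "add 2 3"
    return "final: " + history[-1].split("-> ")[1]


def add(arg: str) -> str:
    a, b = arg.split()
    return str(int(a) + int(b))
```

Because the model is just a callable, a scripted function can replace it in tests, and `Agent(scripted_model, {"add": add}).run("what is 2 + 3?")` can be driven from a small CLI wrapper. Keeping the provider call behind that one callable is also one way to limit vendor lock-in.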

The third level focuses on the user experience and workflow management of agents, where Ara recommends using Kanban boards as an effective interface. Kanban boards help manage multiple agents running in parallel by providing a clear overview of tasks, their states, and dependencies. This approach allows developers to act as engineering managers overseeing their agents, improving coordination and productivity, especially when agents are inference-bound and take significant time to complete tasks.
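As a rough illustration of what such a board tracks, the sketch below models cards with a column and a set of dependencies. The column names and the `Board` and `Card` classes are hypothetical and not taken from the talk or from any specific tool.

```python
from dataclasses import dataclass, field

# Assumed column names; a real board may use different ones.
COLUMNS = ("todo", "in_progress", "review", "done")


@dataclass
class Card:
    title: str
    column: str = "todo"
    blocked_by: set[str] = field(default_factory=set)


class Board:
    def __init__(self) -> None:
        self.cards: dict[str, Card] = {}

    def add(self, title: str, blocked_by: tuple[str, ...] = ()) -> None:
        """Add a task; every dependency must already be on the board."""
        missing = [d for d in blocked_by if d not in self.cards]
        if missing:
            raise KeyError(f"unknown dependencies: {missing}")
        self.cards[title] = Card(title, blocked_by=set(blocked_by))

    def move(self, title: str, column: str) -> None:
        if column not in COLUMNS:
            raise ValueError(f"unknown column: {column}")
        self.cards[title].column = column

    def ready(self) -> list[str]:
        """Tasks in 'todo' whose dependencies are all done, ready for an agent."""
        return [
            c.title
            for c in self.cards.values()
            if c.column == "todo"
            and all(self.cards[d].column == "done" for d in c.blocked_by)
        ]

    def overview(self) -> dict[str, list[str]]:
        """One list of task titles per column, the manager's view of the board."""
        return {
            col: [c.title for c in self.cards.values() if c.column == col]
            for col in COLUMNS
        }
```

In this sketch a developer would hand each title returned by `ready()` to a separate agent, move the card to `in_progress`, and check `overview()` while the agents wait on inference.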

Finally, the fourth level involves deploying AI agents in the cloud to achieve scalability and ease of collaboration. Cloud agents can run long tasks independently, handle complex workflows, and let multiple users share and modify the same setups. Ara envisions a future where most agent interactions happen through Kanban-style interfaces while the heavy computation and execution occur in the cloud, enabling efficient scaling to millions of tasks and users. He concludes by encouraging developers to start simple and progressively refine their agent systems, using these levels as heuristics.