The video “Build Hour: Agent RFT” explores Agent Reinforcement Fine-Tuning, a technique that improves autonomous agents by fine-tuning model weights against custom reward signals, sharpening tool usage, reasoning, and task efficiency. The technique is demonstrated on a financial question-answering benchmark and in real-world applications such as Cognition’s AI engineer, Devin. The session also highlights success stories across industries and offers practical guidance on applying Agent RFT to well-defined tasks backed by quality data and carefully designed reward functions.
The video features Christine from the startup marketing team alongside Will from engineering and Theo, a solutions architect, discussing Agent Reinforcement Fine-Tuning (Agent RFT). They begin by defining agents: models that interact autonomously with external tools to complete tasks without constant human intervention. Examples include coding agents accessing terminals and customer service agents interfacing with billing systems. The team highlights how agents feed tool outputs back into their reasoning process, enabling iterative problem-solving.
They then trace the evolution of OpenAI’s agent products, such as Codex and Deep Research, which use a variety of tools to complete tasks. While prompt engineering remains a foundational way to optimize agent performance, the speakers emphasize that fine-tuning, particularly Agent RFT, offers a more powerful approach. Agent RFT adjusts model weights based on custom reward signals, allowing the agent to explore different tool-usage strategies during training. This improves reasoning, tool efficiency, and overall task performance, often yielding reduced latency and sample-efficient learning.
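The idea of a custom reward signal can be made concrete with a small sketch. This is a hypothetical illustration, not the actual OpenAI Agent RFT API: the function names and the 0.9/0.1 weighting are assumptions. It scores a rollout on correctness first, then adds a small bonus when the agent reaches the right answer with fewer tool calls, which is one way the "tool efficiency" signal described above could be encoded.

```python
# Hypothetical reward function for an Agent RFT rollout.
# Names and weights are illustrative, not part of any real API.

def reward(predicted_answer: str,
           reference_answer: str,
           tool_calls_used: int,
           max_tool_calls: int = 10) -> float:
    """Score a rollout in [0, 1]: correctness dominates, efficiency refines."""
    # Exact-match correctness after light normalization.
    correct = 1.0 if predicted_answer.strip().lower() == reference_answer.strip().lower() else 0.0
    # Fraction of the tool-call budget left unused; only rewarded when correct,
    # so the agent is never paid for finishing fast with a wrong answer.
    efficiency = max(0.0, (max_tool_calls - tool_calls_used) / max_tool_calls)
    return 0.9 * correct + 0.1 * efficiency * correct
```

During training, rollouts that answer correctly with fewer tool calls receive slightly higher rewards, nudging the policy toward the shorter, cheaper tool-usage strategies the speakers describe.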
A detailed demonstration follows, showcasing Agent RFT applied to a challenging financial question-answering benchmark. The task requires the agent to locate relevant financial reports among thousands of documents using limited tool calls. The team explains the setup of tools like semantic search and document retrieval, as well as the use of a model-based grader to provide nuanced reward signals that account for partial correctness. They present training results showing significant improvements in accuracy, reduced tool calls, and lower latency, illustrating how the agent learns to use tools more effectively and efficiently.
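The "nuanced reward signals that account for partial correctness" can be sketched as follows. In the demo this role is played by a model-based grader; here, as a self-contained stand-in, a simple key-fact overlap check simulates the grader's judgment. The function name and the fact-list format are assumptions for illustration only.

```python
# Hedged stand-in for a model-based grader with partial credit.
# A real Agent RFT setup would query a grader model; key-fact overlap
# is used here only to make the partial-credit idea concrete.

def partial_credit(answer: str, key_facts: list[str]) -> float:
    """Return the fraction of required facts mentioned in the answer (0.0–1.0)."""
    answer_lower = answer.lower()
    hits = sum(1 for fact in key_facts if fact.lower() in answer_lower)
    return hits / len(key_facts) if key_facts else 0.0
```

Unlike a binary pass/fail check, this kind of graded score gives the training loop a gradient to climb: an answer citing two of three required figures from a financial report earns more reward than one citing none, even though neither is fully correct.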
The video also features a customer spotlight with Sam Pretty from Cognition, who shares real-world applications of Agent RFT in their autonomous AI engineer, Devin. By fine-tuning GPT-5 models with Agent RFT, Cognition improved Devin’s planning speed and accuracy in navigating codebases, reducing the back-and-forth needed before editing can begin. Sam discusses the infrastructure challenges of running isolated environments for tool calls and graders, as well as the importance of robust monitoring to handle failures during training. This case exemplifies how Agent RFT can strengthen specialized agents in production settings.
Finally, the hosts share additional success stories from domains including healthcare, slide creation, GPU kernel development, and financial reasoning, highlighting how Agent RFT delivered significant performance gains, reduced latency, and improved model behavior across these varied applications. The session concludes with practical advice on when to use Agent RFT, emphasizing the need for well-defined tasks, quality datasets, and carefully designed reward functions. The hosts encourage viewers to explore the platform, engage with the team, and look out for upcoming build hours focused on agent memory patterns.