Your skills.md has an expiration date. Here's what to do about it

The video argues that software engineers should focus on "harness engineering," a layered framework of feed-forward and feedback mechanisms that guide and improve agent behavior, rather than worrying about the leaked Claude Code source. It also highlights the importance of adapting harnesses over time as models evolve, recommending evaluation methods akin to unit testing to maintain and optimize these systems continuously.

In this video, the speaker addresses the recent leak of Claude Code source code, arguing that it is not a critical concern since Claude Code is merely a wrapper around large language models (LLMs) like Sonnet or Opus. Instead of focusing on the underlying code, software engineers should concentrate on the higher-level abstraction known as "harness engineering." The speaker introduces a framework from an article by Birgitta Böckeler of ThoughtWorks, which categorizes the layers involved in coding agents: the core LLM, the agentic harness (such as Claude Code), and the outermost user harness layer, where developers have direct control.

The user harness layer is divided into two main mechanisms: feed-forward and feedback. Feed-forward mechanisms include everything provided to the agent before it starts working, such as skill files, custom rules, language servers, and static type systems, which help improve output quality. Feedback mechanisms, on the other hand, involve processes like static code analysis, code reviews, unit tests, and logs that allow the agent to self-correct and improve iteratively. This dual-mechanism approach helps developers steer and refine the agent's behavior both synchronously and asynchronously.

The speaker appreciates this framework but points out a missing dimension: time. The harness that works well today may not be suitable tomorrow as models evolve and improve. While some elements like documenting architectural decisions and test automation will remain evergreen, others such as specific skills or MCP servers might become obsolete as models become more capable. The speaker advises building for today’s models but anticipating future improvements, highlighting the importance of adaptability in harness engineering.

To manage changes in the harness effectively, the speaker recommends adopting evaluation methods similar to unit testing in software engineering. By designing experiments and running tests with different harness configurations, developers can measure the impact of adding, tweaking, or removing components. This approach helps maintain and improve the harness over time, especially when new models are released. For larger organizations, investing in automated evaluation infrastructure can enable continuous improvement, while smaller teams can experiment informally to optimize their setups.

In conclusion, the speaker encourages conscious and structured harness engineering using the feed-forward and feedback framework, while also accounting for the temporal evolution of models. They share a personal example: a pre-commit hook that stripped code comments became unnecessary as models improved. The video ends with an invitation for viewers to share their own experiences with agentic harnesses, fostering a community discussion on best practices and evolving strategies in harness engineering.
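For context, a comment-stripping hook of the kind the speaker retired could be sketched like this (a hedged reconstruction, not the speaker's actual hook; it uses Python's `tokenize` module so that string literals containing `#` are left alone):

```python
import io
import tokenize


def strip_comments(source: str) -> str:
    """Remove '#' comments from Python source while leaving code and
    string literals (which may contain '#') untouched."""
    lines = source.splitlines(keepends=True)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start
            line = lines[row - 1]
            # Cut from the comment marker to end of line, keeping the newline.
            lines[row - 1] = line[:col].rstrip() + ("\n" if line.endswith("\n") else "")
    return "".join(lines)
```

Wired into a pre-commit hook, this would rewrite staged files before each commit; once newer models stopped over-commenting, the whole hook could simply be deleted, which is exactly the kind of harness component with an expiration date.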