The Race for AGI – Open Source vs Closed Source

The speaker from Unsloth highlights the progress and challenges in open source AI, emphasizing the organization's contributions to model performance, tooling, and transparency, which have helped narrow the gap with closed source models. They also discuss the evolving AI landscape, noting that while closed source labs lead in some areas, open source efforts foster the collaboration and innovation needed to democratize AI development and advance toward AGI.

The speaker begins by introducing their organization, Unsloth, which is heavily involved in improving open source AI models by fixing bugs and enhancing training processes. They collaborate with major labs such as Meta, OpenAI, and Google, contributing fixes to popular models such as Llama and Gemma. Unsloth has reached over 200 million downloads on Hugging Face, making it one of the largest distributors of AI models. Beyond bug fixes, the team has introduced innovations such as asynchronous gradient checkpointing and memory-efficient attention mechanisms, which have improved training efficiency and model accuracy.

The presentation then turns to AI model performance trends using the METR benchmark, illustrating how model capabilities have improved exponentially over time, especially after the introduction of reasoning models such as OpenAI’s o1. Reasoning capabilities have accelerated AI progress, reducing the doubling time of measured model performance to about four months. However, other benchmarks such as the WeirdML index suggest that reasoning models, while better, may not be drastically superior in all scientific problem domains. The speaker also highlights the importance of long-context benchmarks, showing that open source models currently lag behind closed source models in handling very long inputs but are steadily improving.
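To make the four-month doubling time concrete, the short sketch below computes how much a metric grows over a given period if the doubling time stays fixed. The function name and the chosen time spans are illustrative assumptions, not figures from the talk, and a constant doubling time is itself a simplification.

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Multiplicative growth after `months`, assuming a constant doubling time."""
    return 2.0 ** (months / doubling_time_months)


if __name__ == "__main__":
    # Assumed doubling time of four months, as quoted in the summary.
    for months in (4, 12, 24):
        print(f"{months:>2} months: x{growth_factor(months, 4.0):.0f}")
```

Under this assumption, one year corresponds to three doublings (a factor of 8) and two years to six doublings (a factor of 64).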

A significant portion of the talk focuses on the evolving gap between open source and closed source AI models. Historically, closed source models were about six months ahead, but after reasoning models emerged, this gap widened to around 16 months. Recently, open source models have closed the gap to only four to six months, largely by adopting reasoning techniques. The speaker is skeptical that closed source labs will regain a large lead without a new paradigm shift beyond reasoning. They also emphasize that running an open source model several times on the same task can compensate for lower single-run accuracy, effectively boosting performance.
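The effect of running a model several times can be sketched with basic probability. The example below assumes each run is independent and succeeds with the same probability `p`, which real model runs only approximate; the function names are hypothetical. It covers two cases: a setting where any single correct run is enough (for example, when answers can be verified), and a stricter majority vote over an odd number of runs.

```python
from math import comb


def at_least_one_correct(p: float, k: int) -> float:
    """Probability that at least one of k independent runs is correct."""
    return 1.0 - (1.0 - p) ** k


def majority_vote_correct(p: float, k: int) -> float:
    """Probability that more than half of k independent runs are correct.

    Conservative: it requires a strict majority of correct runs, as if
    all wrong answers agreed with each other.
    """
    need = k // 2 + 1
    return sum(comb(k, i) * p**i * (1.0 - p) ** (k - i) for i in range(need, k + 1))


if __name__ == "__main__":
    for k in (1, 3, 5):
        print(k, round(at_least_one_correct(0.6, k), 3), round(majority_vote_correct(0.6, k), 3))
```

Majority voting only helps when single-run accuracy is above 50 percent, while the "at least one correct" case helps at any nonzero accuracy but needs a reliable way to check answers.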

The discussion then shifts to the critical role of tooling and harnesses in AI model performance. The speaker explains that accuracy fluctuations in closed source products such as Anthropic’s Claude Code often stem from issues in the harness rather than the model itself. Open source models face similar challenges, but their transparency makes these problems more visible. Unsloth works on fixing tool call failures and improving integration with external tools such as web search, which significantly improves open source model accuracy and inference speed. The speaker also cautions about specific harness-related pitfalls, such as system prompt injections that can slow down inference, and shares best practices for optimizing performance.
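As an illustration of why the harness matters, here is a minimal sketch of defensive tool-call handling. It is not Unsloth's code or any real harness's API: the JSON shape (`name`, `arguments`), the `TOOLS` registry, and the stand-in `web_search` function are all assumptions made for the example. The idea is that a malformed tool call produces a structured error that can be fed back to the model instead of crashing the run.

```python
import json
from typing import Any, Callable, Dict


def web_search(query: str) -> str:
    """Stand-in tool; a real harness would call a search backend here."""
    return f"results for: {query}"


# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {"web_search": web_search}


def run_tool_call(raw: str) -> Dict[str, Any]:
    """Parse and execute one tool call, returning an error dict on failure."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc.msg}"}
    if not isinstance(call, dict):
        return {"ok": False, "error": "tool call must be a JSON object"}
    name = call.get("name")
    args = call.get("arguments", {})
    if not isinstance(name, str) or name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name!r}"}
    if not isinstance(args, dict):
        return {"ok": False, "error": "arguments must be a JSON object"}
    try:
        return {"ok": True, "result": TOOLS[name](**args)}
    except TypeError as exc:
        # Raised when the model supplies missing or unexpected argument names.
        return {"ok": False, "error": f"bad arguments: {exc}"}


if __name__ == "__main__":
    print(run_tool_call('{"name": "web_search", "arguments": {"query": "agi"}}'))
    print(run_tool_call('{"name": "web_search", "arguments": {"q": "agi"}}'))
```

A harness that silently drops or mis-parses such calls would lower measured accuracy even though the underlying model is unchanged.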

Finally, the speaker addresses community engagement and the future of AI development. They note that open source models appear buggier because all of their issues are visible, whereas problems in closed source models stay hidden. Unsloth prioritizes fixing open source models to democratize AI and encourages community involvement. The speaker highlights ongoing efforts to improve communication about bug fixes and model updates. The talk concludes with an overview of large-scale reinforcement learning efforts by closed source labs aiming for AGI, while open source projects leverage similar tools and techniques. Workshops and resources are available for developers interested in training and fine-tuning models on AMD hardware, reflecting the collaborative nature of the open source AI ecosystem.