Sholto Douglas discusses the future of AI models, envisioning a shift towards dynamic bundles of compute with infinite context, where distinctions between small and large models blur. He emphasizes the importance of understanding long-horizon task performance, highlighting the evolving benchmarks in AI technology and the need for human oversight to guide the development and deployment of advanced AI technologies.
In his talk, Sholto Douglas envisions a future where the distinction between small and large models diminishes and fine-tuning may become less necessary. Instead of separate tiers of model sizes, there could be a dynamic bundle of compute with infinite context that specializes models for various tasks. The inability of current models to work over long horizons, staying engaged with a task for many hours or weeks, has been identified as a bottleneck for AI progress. However, Douglas challenges the notion that this is the primary reason AI agents haven’t taken off, suggesting that reliability and task chaining are more critical factors.
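One way to see why reliability and task chaining could matter so much is a small worked calculation. The sketch below is not from the talk. It assumes, purely for illustration, that an agent's steps succeed independently with the same probability, and the function name `chain_success_probability` is hypothetical.

```python
def chain_success_probability(per_step_success: float, num_steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumption (not from the talk): steps succeed independently, each with
    the same probability, so the chain succeeds with probability p ** n.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    return per_step_success ** num_steps


if __name__ == "__main__":
    # Illustrative per-step reliabilities, not measured values.
    for p in (0.90, 0.99, 0.999):
        for n in (10, 100):
            prob = chain_success_probability(p, n)
            print(f"per-step {p:.3f}, {n:>3} steps -> chain success {prob:.3f}")
```

Under this simplified assumption, small gains in per-step reliability compound into large differences once many steps are chained, which is consistent with the emphasis on reliability over raw horizon length.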
He emphasizes the importance of understanding long-horizon task performance and the economic impact of AI models. By evaluating success rates over different time horizons, such as minutes, hours, or days, one can assess how automatable different jobs or task families are. Douglas mentions the surprise introduction of 100K context windows less than a year before the talk as an example of how quickly benchmarks in AI technology evolve. He envisions AI firms that are trained end to end on signals like profitability or client satisfaction, with specialized agents handling different aspects of the business.
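The idea of evaluating success rates at different time horizons can be sketched as a simple aggregation. The code below is a minimal illustration, not a method described in the talk. The function name `success_rate_by_horizon`, the horizon labels, and the sample records are all hypothetical placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def success_rate_by_horizon(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Group (horizon_label, succeeded) records and return the success rate per label."""
    totals: Dict[str, int] = defaultdict(int)
    successes: Dict[str, int] = defaultdict(int)
    for horizon, succeeded in results:
        totals[horizon] += 1
        if succeeded:
            successes[horizon] += 1
    return {horizon: successes[horizon] / totals[horizon] for horizon in totals}


if __name__ == "__main__":
    # Made-up records for illustration only; real evaluations would supply these.
    sample = [
        ("minutes", True),
        ("minutes", True),
        ("hours", True),
        ("hours", False),
        ("days", False),
    ]
    for horizon, rate in success_rate_by_horizon(sample).items():
        print(f"{horizon}: {rate:.0%} success")
```

A table of this kind, broken down by task family, is one plausible way to compare how automatable different kinds of work are at each horizon.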
Douglas acknowledges the potential of reinforcement learning to create models that learn from sparse signals over time. However, he cautions that such advances will require careful human oversight and guidance to ensure the models behave as intended. He discusses the challenge of training with reinforcement learning rewards when the model itself needs to generate those rewards. Douglas points out that future models may become good enough to receive rewards at least some of the time, which underscores the importance of reliability in AI systems.
In conclusion, while Douglas sees a future where AI firms may operate as a single end-to-end trained system, he believes that in the near term, AI agents will likely remain interconnected components due to the need for reliability and trust. He stresses the significance of providing models with the right signals and feedback to improve their performance and achieve desired outcomes. Despite the potential of AI models to evolve and perform complex tasks, Douglas underlines the essential role of human oversight and careful management in guiding the development and deployment of advanced AI technologies.