GPT-Realtime-2, Directionally Bad and Agent Memory

The livestream features a discussion with Richmond Delacay from Oracle on advancements in AI agent memory, highlighting innovations like Anthropic’s “dreaming” feature that consolidates memory during off-peak times to improve efficiency and accuracy. They explore the technical aspects and future challenges of memory in AI, emphasizing the importance of memory engineering over model weights and encouraging engagement with emerging tools and research in the field.

The livestream begins with the host greeting a global audience and discussing a striking image by Beeple. The conversation quickly shifts to recent news about Sam Altman’s brief firing from OpenAI, in particular the newly revealed text messages between Altman and Mira Murati, which convey how tense and uncertain that period was. The host reflects on how relatable these messages are and previews upcoming content about the collaboration between Dario and Elon Musk. The stage is then set for a deep dive into AI agent memory with a special guest, Richmond Delacay from Oracle, who is introduced as an expert in the field.

Richmond opens the discussion by explaining the new “dreaming” feature introduced by Anthropic, which enhances agent memory by reviewing past sessions to consolidate and improve memory patterns. He draws a parallel with human dreaming, which helps consolidate memories and clear out unnecessary information. Richmond also references prior research such as the MemGPT paper and work on sleep-time compute, emphasizing that these ideas have been evolving for years. The feature aims to reduce computational costs by moving memory consolidation to off-peak times, which improves response quality and efficiency during active use.
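The consolidation idea can be illustrated with a minimal sketch. This is not Anthropic's implementation: the `MemoryStore` class, its method names, and the "keep notes that recur across sessions" rule are all assumptions chosen for illustration, and a real system would likely use a model to summarize rather than exact string matching.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical store: raw session notes plus a consolidated list."""
    raw_sessions: list[list[str]] = field(default_factory=list)
    consolidated: list[str] = field(default_factory=list)

    def record_session(self, notes: list[str]) -> None:
        # Cheap write path, used while the agent is serving requests.
        self.raw_sessions.append(notes)

    def consolidate(self, min_sessions: int = 2) -> None:
        # Offline pass (run off-peak): keep notes that recur across
        # sessions and drop one-off notes.
        counts = Counter()
        for notes in self.raw_sessions:
            # Normalize, and count each note at most once per session.
            counts.update({n.strip().lower() for n in notes})
        recurring = {n for n, c in counts.items() if c >= min_sessions}
        self.consolidated = sorted(set(self.consolidated) | recurring)
        # Raw notes are no longer needed once consolidated.
        self.raw_sessions.clear()


store = MemoryStore()
store.record_session(["User prefers Python", "Asked about weather"])
store.record_session(["user prefers python ", "Deploys on Fridays"])
store.consolidate()
print(store.consolidated)
```

The point of the split is that `record_session` stays cheap during active use, while the more expensive review happens in `consolidate`, which can be scheduled for quiet periods.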

The conversation then explores the broader landscape of agent memory, including OpenAI’s recent improvements that incorporate human feedback to refine memory accuracy. Richmond highlights foundational research such as Stanford’s Generative Agents paper (“Generative Agents: Interactive Simulacra of Human Behavior”), which demonstrated emergent human-like behaviors in AI agents with memory. They discuss the challenges and future of memory in AI, including whether continual learning, in which model weights are updated in real time, is necessary, or whether current architectures with external memory systems and in-context learning are sufficient for significant progress.

Richmond provides a technical overview of agent memory, describing its types (short-term, long-term, procedural, episodic, semantic, and shared memory) and the importance of context engineering to manage the limited context window of large language models. He stresses the need for memory engineering as a discipline to optimize how AI systems remember, forget, and consolidate information. The discussion also touches on practical implementations, such as Oracle’s agent memory package, which demonstrates improved token efficiency and accuracy over naive memory approaches, and the trade-offs involved with latency and operational costs.
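To make the context-window constraint concrete, here is a minimal sketch of packing memories into a fixed token budget. The memory kinds mirror the list above, but the `MemoryItem` type, the `relevance` score, the word-count token estimate, and the greedy selection rule are all assumptions for illustration, not Oracle's package or any particular library's API.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryKind(Enum):
    SHORT_TERM = "short_term"
    LONG_TERM = "long_term"
    PROCEDURAL = "procedural"
    EPISODIC = "episodic"
    SEMANTIC = "semantic"
    SHARED = "shared"


@dataclass
class MemoryItem:
    kind: MemoryKind
    text: str
    relevance: float  # assumed to come from some retrieval step


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per word.
    return len(text.split())


def build_context(items: list[MemoryItem], budget: int) -> list[MemoryItem]:
    # Greedy packing: take the most relevant items that still fit.
    selected, used = [], 0
    for item in sorted(items, key=lambda m: m.relevance, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost <= budget:
            selected.append(item)
            used += cost
    return selected


items = [
    MemoryItem(MemoryKind.SEMANTIC, "User works at a bank", 0.9),
    MemoryItem(MemoryKind.EPISODIC,
               "Last week the user asked for a refund on order 17", 0.5),
    MemoryItem(MemoryKind.PROCEDURAL,
               "Always confirm before sending email", 0.7),
]
context = build_context(items, budget=10)
for item in context:
    print(item.kind.value, "|", item.text)
```

A naive approach would append every stored memory to the prompt; selecting under a budget is one simple way to trade a little retrieval work for fewer tokens per request.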

In closing, Richmond emphasizes that the future of AI lies in the sophisticated harnessing of memory and context rather than solely in the underlying model weights. He encourages developers to engage with the growing ecosystem of tools and research, noting that memory engineering offers a level playing field for innovation. The livestream ends with a call to explore shared resources, including a free DeepLearning.AI course on agent memory, and a promise to revisit new developments like OpenAI’s GPT-Realtime-2 voice features in future streams.