IBM Think 2025, OpenAI Windsurf acquisition, reasoning models and hallucinations

The video discusses IBM Think 2025’s focus on advancements in generative computing, efficient domain-specific models like Granite 4, and efforts to reduce hallucinations in reasoning AI through hybrid approaches. It also covers OpenAI’s potential $3 billion Windsurf acquisition as part of a trend toward vertical integration to control both AI models and applications for a comprehensive user experience.

The panel opens with highlights and key announcements from IBM Think 2025, focusing on advancements in AI and computing. Kate Soule, Director of Technical Product Management for Granite, points to the research keynotes, particularly the emergence of generative computing, which combines traditional, quantum, and new model-based computing approaches. Kaoutar El Maghraoui, Principal Research Scientist, emphasizes the launch of over 150 enterprise-ready AI agents via the watsonx Orchestrate platform, showcasing IBM’s push toward automating tasks and enhancing productivity with modular, customizable AI solutions. Skyler Speakman adds a lighter note, appreciating the fun atmosphere of the keynote, exemplified by a penguin mascot, which reflects the conference’s lively spirit.

A major focus of the conference was IBM’s introduction of generative computing, which aims to bring software engineering principles into the realm of AI models. Kate explains that current prompt engineering methods are brittle and hard to maintain, and that the future lies in creating abstractions and control flows similar to traditional programming. This approach involves building more structured, maintainable systems that leverage the power of large language models (LLMs) while reducing risks like hallucinations and inefficiencies. The discussion highlights ongoing efforts to scale inference, optimize compute, and develop tools that enable more reliable and secure AI deployment at scale.
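To make the contrast with hand-tuned prompts concrete, here is a minimal Python sketch of wrapping a model call in ordinary control flow: the output is parsed, checked against a required structure, and retried on failure. It is an illustration only, not IBM's generative computing tooling. The names `generate_structured` and `make_stub_model` are hypothetical, and the "model" is a stub that replays canned strings so the example runs without any LLM.

```python
import json
from typing import Callable, Iterable, Set


def make_stub_model(responses: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays canned responses in order."""
    it = iter(responses)

    def model(prompt: str) -> str:
        return next(it)

    return model


def generate_structured(
    model: Callable[[str], str],
    prompt: str,
    required_keys: Set[str],
    max_attempts: int = 3,
) -> dict:
    """Call the model until its output is a JSON object with the required keys.

    The validation and retry logic lives in code rather than in the prompt,
    so it can be tested and maintained like any other function.
    """
    last_error = "no attempts made"
    for _ in range(max_attempts):
        raw = model(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"invalid JSON: {exc}"
            continue
        if not isinstance(data, dict):
            last_error = "output is not a JSON object"
            continue
        missing = required_keys - set(data)
        if missing:
            last_error = f"missing keys: {sorted(missing)}"
            continue
        return data
    raise ValueError(f"no valid output after {max_attempts} attempts ({last_error})")


# Usage: the stub fails twice (bad JSON, then a missing key) before succeeding.
stub = make_stub_model([
    "not json",
    '{"summary": "ok"}',
    '{"summary": "ok", "confidence": 0.9}',
])
result = generate_structured(stub, "Summarize the keynote.", {"summary", "confidence"})
```

The point of the sketch is that the failure handling is explicit and inspectable, rather than encoded as extra instructions inside the prompt text.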

The conversation then shifts to IBM’s announcement of the Granite 4 models, a new family of mixture-of-experts hybrid models that are highly efficient and domain-specific. Kaoutar describes these models as small, fast, and energy-efficient, with sizes ranging from three to twenty billion parameters, designed to complement larger models. These models are intended to serve enterprise needs where cost, speed, and specificity are critical, offering a practical alternative to massive, general-purpose models. The emphasis on efficiency and specialization reflects a broader trend toward tailored AI solutions that can be deployed at scale in various industries.

The discussion also covers the challenge of hallucinations in reasoning models, which has become more prominent as models improve in reasoning capabilities. Skyler notes that hallucination rates appear to be increasing, despite ongoing research efforts. Kate and Kaoutar attribute this to misaligned incentives in training objectives, where models are optimized for verbosity and persuasive responses rather than factual accuracy. They suggest that hybrid approaches combining symbolic reasoning, factual checks, and layered security tools like Granite Guardian models are necessary to mitigate hallucinations and improve reliability, especially in high-stakes enterprise applications.
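The idea of a factual check layered after generation can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Granite Guardian works: it assumes the model's answer has already been broken into simple claims, and it compares them against a small trusted fact table. `Claim`, `verify_claims`, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Claim:
    """A single checkable statement extracted from a model answer (hypothetical format)."""
    subject: str
    attribute: str
    value: str


def verify_claims(
    claims: List[Claim],
    facts: Dict[Tuple[str, str], str],
) -> Dict[str, List[Claim]]:
    """Sort claims into supported, contradicted, and unverified buckets.

    A claim is supported if the fact table holds the same value,
    contradicted if it holds a different one, and unverified otherwise.
    """
    report: Dict[str, List[Claim]] = {
        "supported": [],
        "contradicted": [],
        "unverified": [],
    }
    for claim in claims:
        known = facts.get((claim.subject, claim.attribute))
        if known is None:
            report["unverified"].append(claim)
        elif known == claim.value:
            report["supported"].append(claim)
        else:
            report["contradicted"].append(claim)
    return report


# Toy fact table and claims, invented purely for illustration.
facts = {
    ("invoice-1042", "currency"): "EUR",
    ("invoice-1042", "status"): "paid",
}
claims = [
    Claim("invoice-1042", "currency", "EUR"),   # matches the table
    Claim("invoice-1042", "status", "overdue"), # conflicts with the table
    Claim("invoice-1042", "approver", "J. Doe") # not in the table
]
report = verify_claims(claims, facts)
```

In a high-stakes setting, contradicted claims could block the response and unverified ones could be routed to a human or to a further check; which policy to apply is a design decision outside this sketch.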

Finally, the conversation turns to OpenAI’s potential acquisition of Windsurf, a coding environment company, for around $3 billion, which would be its largest acquisition to date. The panel discusses whether this move indicates that AI is more about marketing than technology, or whether it reflects a strategic effort to control both the model and application layers. Skyler and Kaoutar see this as part of a broader trend where AI companies aim to vertically integrate, owning both foundational models and the ecosystems or workflows built around them. Kate compares this to Apple’s integrated approach, suggesting that future AI dominance may depend on controlling the entire user experience, from core models to developer tools and specialized applications.