A recent leak from OpenAI employees reveals that the company is developing a new “Omni” multimodal AI model and an advanced bidirectional audio interaction system (BAI), both of which are expected to be central to the upcoming GPT-6 release. OpenAI is also working on AI-powered hardware like earbuds and smart devices, aiming to create an always-present, context-aware “ambient AI ecosystem” that moves beyond traditional app-based interactions.
The leak has sparked speculation about the upcoming GPT-6 model and a new “Omni” model, which promises to be a true multimodal AI system. It began with a post from Atai Alleti, a member of OpenAI’s voice team, who hinted at the development of a new Omni model. Other OpenAI employees chimed in, confirming that a successor to GPT-4o (where the “o” stands for Omni) is in the works. GPT-4o was initially intended to be a unified model capable of processing text, images, and audio natively, but it fell short of expectations, with many features either limited or unreleased. The new Omni model aims to fulfill that original vision: a single system that handles all input types simultaneously rather than relying on separate subsystems.
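The difference between a pipelined system and a natively multimodal one can be illustrated with a toy sketch. This is purely conceptual and not OpenAI's architecture; all names here (`Token`, `unify`) are hypothetical. The idea is that a unified model consumes one interleaved, modality-tagged token stream, instead of routing each input type to a separate specialized subsystem:

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical illustration: each input chunk is tagged with its modality
# and merged into a single ordered sequence that one model would consume,
# rather than being dispatched to a separate text/image/audio subsystem.

@dataclass
class Token:
    modality: str   # "text", "image", or "audio"
    payload: str    # stand-in for the real encoded content

def unify(inputs: List[Tuple[str, str]]) -> List[Token]:
    """Interleave all modalities into one ordered token stream."""
    return [Token(modality, payload) for modality, payload in inputs]

# A single user turn mixing speech, a photo, and typed text:
stream = unify([
    ("audio", "hello"),
    ("image", "photo_of_whiteboard"),
    ("text", "what does this say?"),
])
modalities = [t.modality for t in stream]
```

In the pipelined approach, the audio and image would first be converted to text by separate models, losing information along the way; in the unified approach sketched above, ordering and modality survive into the model's input.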
A key advancement OpenAI is working on is a new audio model called BAI (short for bidirectional audio interaction). Unlike current voice AIs, which operate in a turn-based, walkie-talkie style, BAI would allow for natural, overlapping conversation—much like talking to another person. This would enable the AI to process interruptions, acknowledgments, and real-time feedback, making interactions feel more fluid and human. Although a prototype exists, it still has glitches and is not expected to launch until at least the second quarter of 2026. This technology could revolutionize customer support and make AI more accessible to people who prefer speaking over typing.
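The turn-based versus overlapping distinction can be sketched in a small simulation. This is an illustrative toy, not OpenAI's BAI implementation: the assistant streams its reply word by word while a listener thread runs concurrently, watching for a user interruption ("barge-in"). A turn-based system would only process the interruption after the reply finished; a bidirectional one cuts the reply short:

```python
import queue
import threading
import time

# Illustrative simulation (not OpenAI's BAI): speech output and listening
# run concurrently, like a full-duplex audio channel.

def speak_with_barge_in(reply_words, user_events):
    spoken = []
    interrupt = threading.Event()

    def listener():
        # Polls for user input while the assistant is still talking.
        while not interrupt.is_set():
            try:
                event = user_events.get(timeout=0.01)
            except queue.Empty:
                continue
            if event == "interrupt":
                interrupt.set()

    t = threading.Thread(target=listener)
    t.start()
    for word in reply_words:
        if interrupt.is_set():
            break          # stop mid-sentence, as a person would
        spoken.append(word)
        time.sleep(0.03)   # stand-in for audio playback time
    interrupt.set()
    t.join()
    return spoken

events = queue.Queue()
reply = ["The", "weather", "today", "is", "mostly", "sunny", "with",
         "light", "winds", "expected"]

# The user talks over the assistant partway through its reply.
threading.Timer(0.06, lambda: events.put("interrupt")).start()
spoken = speak_with_barge_in(reply, events)
```

After the run, `spoken` contains only the words delivered before the interruption landed, which is the behavior that makes overlapping conversation feel natural.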
The development of the Omni model and BAI is closely tied to the anticipated release of GPT-6. OpenAI CEO Sam Altman has confirmed that GPT-6 is already in development and progressing faster than GPT-5. OpenAI’s partnership with AMD will provide the massive computing power needed to train such advanced models, with the first gigawatt of capacity coming online in late 2026. The expected timeline suggests a developer preview of GPT-6 in late 2026, with a broader rollout in early 2027. GPT-6 is expected to introduce persistent memory (remembering users across sessions), autonomous agent capabilities (taking actions on users’ behalf), and full native multimodality, potentially using the Omni model as its backbone.
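Persistent memory, as described, means context learned in one session survives into the next. A minimal sketch of that idea follows; this is a hypothetical illustration using a per-user store on disk, not GPT-6's actual mechanism:

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical sketch of persistent, per-user memory: facts learned in one
# session are written to disk and reloaded at the start of the next.

class MemoryStore:
    def __init__(self, path):
        self.path = Path(path)
        self.facts = (json.loads(self.path.read_text())
                      if self.path.exists() else {})

    def remember(self, user, fact):
        self.facts.setdefault(user, []).append(fact)
        self.path.write_text(json.dumps(self.facts))

    def recall(self, user):
        return self.facts.get(user, [])

path = os.path.join(tempfile.mkdtemp(), "memory.json")

# Session 1: the assistant learns something about the user.
store = MemoryStore(path)
store.remember("alice", "prefers metric units")

# Session 2: a fresh instance reloads the same store and still knows it.
store2 = MemoryStore(path)
recalled = store2.recall("alice")
```

The point of the sketch is the reload step: today's chat models start each session stateless unless memory is engineered around them, whereas the reported GPT-6 goal is to make that carry-over native.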
OpenAI is also investing heavily in hardware to bring this ambient AI ecosystem into everyday life. They have a team of over 200 people working on physical devices, including AI-powered earbuds (codenamed Gumdrop), a smart speaker with a camera, smart glasses, and a mysterious pocketable device designed in collaboration with Jony Ive. These devices are intended to integrate seamlessly with the Omni model and BAI, providing real-time, context-aware AI assistance. The earbuds will feature custom processors for on-device AI, and the smart speaker will use visual context to enhance interactions. The ambitious sales targets and partnerships with major manufacturers like Foxconn indicate OpenAI’s intent to compete at the scale of established consumer tech products.
The broader vision is to move beyond the current app-based interaction with AI and create an “ambient AI ecosystem” that is always present and accessible—whether in your ear, on your counter, or eventually as wearable tech. Unlike previous failed attempts at AI hardware (such as the Humane AI Pin or Rabbit R1), OpenAI’s massive user base and collaboration with renowned designer Jony Ive give it a significant advantage. By combining a unified multimodal brain (Omni model), natural voice interaction (BAI), and persistent, intelligent assistance (GPT-6), OpenAI aims to redefine how people interact with technology in a post-smartphone world.