DeepMind’s new AI, SIMA 2, is a single agent that can learn and play multiple modern 3D video games, working from raw pixels and following complex human instructions, including multimodal input such as voice commands and sketches. It can also transfer knowledge to unseen games and adapt to procedurally generated environments, a significant step toward more general, human-like artificial intelligence.
Google DeepMind has unveiled a new AI called SIMA 2 that can play multiple modern 3D video games with a single agent. Unlike previous AI models, SIMA 2 learns directly from raw pixels and controls each game through a keyboard and mouse, as a human player would, without shortcuts such as access to the game's internal state. It not only understands the 3D environments it navigates but also follows complex human instructions, and its performance across different games improves as it gains experience. This is a significant leap from earlier AI systems, which could handle only simpler tasks or a single game.
The first version of SIMA could not plan long-term strategies and managed only short-term goals such as avoiding obstacles. SIMA 2 adds multimodal capabilities, including recognizing voice commands and interpreting rough sketches, and it can build plans from these inputs. This lets it carry out more complex tasks that were previously beyond its reach, such as gathering resources or exploring caves. It also handles nuanced instructions, including reverse psychology and commands written in emoji, which points to a deeper level of comprehension.
One of the most remarkable features of SIMA 2 is its ability to transfer knowledge learned in one game to entirely new, unseen games. For example, it played Minecraft reasonably well despite never having encountered it before, relying solely on its experience from other games. Its success rate in new games is still modest, around 14%, but the previous version scored near zero, so the jump is a critical breakthrough in generalization. The expectation is that future iterations will improve this capability dramatically, potentially reaching success rates of 80 to 90%.
DeepMind also tested SIMA 2 in procedurally generated games created by another AI, demonstrating that it can adapt to new worlds with different art styles and mechanics. The agent learns through trial and error, gradually improving its performance in these unfamiliar environments, much like a child exploring and learning from the world around them. This approach signals a shift from pre-programmed knowledge toward AI that grows through curiosity and interaction, with the aim of solving broader intelligence challenges rather than just mastering games.
Despite this progress, SIMA 2 still has limitations, including relatively low success rates and slow learning; some of the demonstration footage is sped up. Nevertheless, the project is a foundational step toward AI systems that can learn and adapt in complex, dynamic environments. DeepMind’s work on SIMA 2 offers a glimpse of a future AI that can understand, learn, and assist with novel tasks beyond predefined scenarios, moving closer to genuine intelligence.