Google just released the STABLE build of Gemini 2.5 (including a new model!)

Google has released the stable Gemini 2.5 AI models, featuring the new Gemini 2.5 Flash-Lite, which offers strong speed and efficiency along with a 1 million token context length for handling extremely long inputs across multiple modalities. These models excel in coding, multimodal understanding, advanced reasoning, and tool use, are backed by extensive training and safety work, and are now broadly available for a wide range of AI applications.

Google has officially released the stable Gemini 2.5 series of AI models, including the new Gemini 2.5 Flash-Lite, its fastest and most cost-efficient model to date. These models are now generally available for production use and come with full support from Google. The Gemini 2.5 family is notable for its strong performance on coding tasks, its speed, and its cost-effectiveness. A standout feature across the family is the 1 million token context window, which lets the models handle extremely long inputs such as entire books, codebases, and long-form audio or video.
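As a concrete illustration, here is a minimal sketch of calling one of these models through the public generateContent REST endpoint of the Generative Language API. The model identifier gemini-2.5-flash-lite, the GEMINI_API_KEY environment variable, and the request/response shapes follow that API's documented conventions, but treat them as assumptions rather than details taken from the report.

```python
# Minimal sketch (not an official example) of a Gemini 2.5 Flash-Lite call via the
# generateContent REST endpoint. Model name, env var, and payload shape are assumed.
import os
import requests

API_KEY = os.environ["GEMINI_API_KEY"]            # assumed env var name
MODEL = "gemini-2.5-flash-lite"                   # assumed model identifier

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)
payload = {
    "contents": [
        {"parts": [{"text": "Summarize the following chapter: ..."}]}
    ]
}

resp = requests.post(url, json=payload, timeout=60)
resp.raise_for_status()
# The first candidate's text, per the Generative Language API response format.
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```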

The Gemini 2.5 models are built as sparse mixture-of-experts (MoE) architectures: they contain many expert subnetworks but activate only a subset for each input, which keeps inference computationally efficient. They natively support multimodal inputs, including text, audio, images, video, and code repositories, and have native tool-use capabilities such as Google Search and code execution. The models also feature controllable thinking budgets, allowing dynamic reasoning processes that improve accuracy and factuality. The training data is diverse and large-scale, covering multiple domains and modalities, with a knowledge cutoff of January 2025 for Gemini 2.5.
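To make the routing idea concrete, the toy sketch below shows generic top-k expert routing: a gating network scores every expert, but only the k highest-scoring experts actually run for a given token. All dimensions and weights are illustrative; this is the general MoE technique, not Gemini's actual architecture or configuration.

```python
# Toy sparse mixture-of-experts routing: score all experts, run only the top-k.
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 64, 8, 2
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route a single token vector through the top-k experts only."""
    logits = x @ gate_w                          # gate score for every expert
    top = np.argsort(logits)[-top_k:]            # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                     # softmax over the selected experts
    # Only the chosen experts run, so compute scales with k, not n_experts.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)                  # (64,)
```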

Google’s technical report notes that the smaller Gemini 2.5 models are distilled from the larger ones, trained to retain as much performance as possible while cutting size and cost. The models were trained on Google’s TPUv5p chips, which contributes to their speed and efficiency. Reinforcement learning with verifiable and model-based generative rewards was used heavily in post-training to improve reasoning and factual accuracy. The resulting models can perform complex coding tasks, support multimodal interactive scenarios, and combine advanced reasoning with tool use, for example by interleaving search queries within their thought process.
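For readers unfamiliar with distillation, the sketch below shows the standard logit-matching recipe: a student model is trained to match a temperature-softened teacher distribution via a KL-divergence loss. This is only the generic technique; Google's actual distillation targets, temperatures, and loss terms are not spelled out at this level of detail, and all shapes and values here are illustrative.

```python
# Generic logit-space knowledge distillation loss (teacher -> student), toy scale.
import numpy as np

def softmax(z, t=1.0):
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened token distributions."""
    p = softmax(teacher_logits, temperature)      # soft teacher targets
    q = softmax(student_logits, temperature)      # student predictions
    kl = np.sum(p * (np.log(p + 1e-9) - np.log(q + 1e-9)), axis=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return (temperature ** 2) * kl.mean()

rng = np.random.default_rng(0)
teacher = rng.standard_normal((4, 32))            # 4 tokens, 32-way toy vocabulary
student = rng.standard_normal((4, 32))
print(distill_loss(teacher, student))
```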

Gemini 2.5 also excels at video understanding and audio generation, with improvements that let it process longer videos more efficiently. A notable demonstration was Gemini playing through and beating a Pokémon game via an agentic system, although it struggled to read the game screen directly at the pixel level and showed some limitations in long-context generative reasoning. The report also highlights Google’s focus on AI safety, including automated red teaming with AI agents to probe for vulnerabilities and efforts to minimize memorization of personal or copyrighted information.

Overall, Gemini 2.5 represents a significant leap forward in AI capabilities, combining speed, efficiency, multimodality, and advanced reasoning. The models outperform previous Gemini versions in tasks like image-to-SVG conversion and video question answering, showing much higher accuracy and contextual understanding. With general availability, developers now have access to powerful tools for a wide range of applications, from coding and translation to video analysis and interactive AI agents. The release marks a strong comeback for Google in the competitive AI landscape.