At Cloud Next, Google unveiled significant advancements in its AI infrastructure, including fifth-generation TPUs and the introduction of Gemini 1.5 Pro for improved performance and long-context understanding. These innovations position Google Cloud as an industry leader, offering developers efficient tooling, dynamic workload scheduling options, and AI capabilities extended to the edge, air-gapped environments, and other clouds through Google Distributed Cloud.
Google made significant announcements at Cloud Next regarding its advancements in AI infrastructure. The company has been investing in AI infrastructure for more than a decade, culminating in its fifth-generation TPUs, which have enabled customers to train and deploy cutting-edge language models. This platform has positioned Google Cloud as an industry leader: more than 60% of funded AI startups and nearly 90% of generative AI unicorns are Google Cloud customers. Google continues to enhance its AI models, most recently with Gemini 1.5 Pro, which offers improved performance and long-context understanding, enabling enterprises to build innovative AI solutions.
Google has designed an AI Hypercomputer to meet the growing demands of large language models, a system that is up to 2x more efficient at scale than baseline solutions, and its AI infrastructure strategy has earned industry recognition. Google Cloud is continuously enhancing this stack, including performance-optimized accelerators such as TPUs and NVIDIA GPUs. It announced the upcoming general availability of A3 Mega instances powered by NVIDIA H100 Tensor Core GPUs, as well as support for NVIDIA's newest Grace Blackwell generation of GPUs.
Google has also introduced TPU v5p, its most powerful and scalable TPU to date, with 4x the compute capacity per pod of the previous generation. On the storage side, Google has enhanced its products to accelerate inference with Hyperdisk ML, which offers significantly higher throughput than competing offerings. Google optimizes open ML frameworks such as JAX, PyTorch, and TensorFlow for use with Vertex AI and GKE, giving developers efficient tools so they can focus on training logic (see the sketch below). Google is also launching new dynamic workload scheduling options to significantly improve resource management and cost efficiency.
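To illustrate what "focusing on training logic" means in practice, here is a minimal JAX sketch: the same jitted training step runs unchanged on CPU, GPU, or TPU, with XLA handling the hardware-specific compilation. The toy linear model, loss, and data are placeholders for illustration, not anything Google announced.

```python
import jax
import jax.numpy as jnp

# Toy linear model and squared-error loss; placeholders for illustration.
def loss_fn(params, x, y):
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

# The training step is plain Python plus jax.grad; jax.jit compiles it
# via XLA for whatever accelerator is attached (CPU, GPU, or TPU).
@jax.jit
def train_step(params, x, y, lr=0.1):
    grads = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

key = jax.random.PRNGKey(0)
params = {"w": jax.random.normal(key, (3,)), "b": jnp.zeros(())}
x = jax.random.normal(key, (8, 3))
y = x @ jnp.array([1.0, -2.0, 0.5])

for _ in range(100):
    params = train_step(params, x, y)

print(jax.devices())          # e.g. [TpuDevice(...)] on a TPU VM
print(loss_fn(params, x, y))  # loss approaches zero
```

The point of the sketch is that nothing in the training step mentions the hardware; retargeting from a developer workstation to a TPU pod slice is a deployment decision rather than a code change.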
Google is bringing AI closer to where data is generated and consumed, extending it to the edge, air-gapped environments, Google sovereign clouds, and other clouds through Google Distributed Cloud (GDC). Google is adding NVIDIA GPU support and GKE capabilities for AI workloads, and enabling a range of open models to run on GDC. It is also offering vector search capabilities for search and information-retrieval applications, including fully air-gapped configurations for sensitive data (the core idea is sketched below). Google is supporting organizations such as Orange in using AI on GDC to improve network performance and deliver responsive translation capabilities.
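As a rough illustration of what vector search provides, and not the GDC API itself, the following sketch implements the core operation: representing documents as vectors and retrieving the nearest neighbors of a query by cosine similarity. The hashing "embedder" is a stand-in; a real deployment would use a trained embedding model and an approximate-nearest-neighbor index.

```python
import numpy as np

# Stand-in embedder: hashes words into a fixed-size unit vector purely for
# illustration. A real system would use a trained embedding model.
def embed(text: str, dim: int = 64) -> np.ndarray:
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

docs = [
    "network performance tuning guide",
    "translation service deployment notes",
    "air-gapped cluster security checklist",
]
index = np.stack([embed(d) for d in docs])  # one row per document

def search(query: str, k: int = 2):
    scores = index @ embed(query)           # cosine similarity of unit vectors
    top = np.argsort(scores)[::-1][:k]
    return [(docs[i], float(scores[i])) for i in top]

print(search("how do I tune network performance?"))
```

Because the index and the embedding model can both live entirely on local hardware, the same pattern works in a fully air-gapped configuration with no external calls.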
Google introduced the Google Axion processor, its first custom Arm-based CPU designed for the data center, offering up to 50% better performance and up to 60% better energy efficiency than comparable x86-based VMs. Google is deploying its own services on Arm-based instances, including Spanner, BigQuery, GKE, Google Earth Engine, and the YouTube ads platform. Finally, Google showcased Vertex AI, its fast-growing enterprise AI platform, which offers more than 130 models in the Vertex AI Model Garden. Gemini 1.5 Pro entered public preview, enabling the processing of vast amounts of information and cross-modal analysis (a minimal usage sketch follows). Gemini for Google Workspace and Gemini in Vertex AI are empowering organizations to build AI-powered solutions for a range of use cases, including employee agents and video-creation applications.
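For orientation, here is a minimal sketch of calling the Gemini 1.5 Pro preview through the Vertex AI Python SDK (the google-cloud-aiplatform package). The project ID, region, bucket path, and preview model ID are placeholders and assumptions; the exact model name may differ by release.

```python
import vertexai
from vertexai.generative_models import GenerativeModel, Part

# Placeholder project and region; replace with your own values.
vertexai.init(project="my-project", location="us-central1")

# The preview model ID here is an assumption and may differ by release.
model = GenerativeModel("gemini-1.5-pro-preview-0409")

# A long-context, cross-modal request: a video reference plus a text prompt.
video = Part.from_uri("gs://my-bucket/demo.mp4", mime_type="video/mp4")
response = model.generate_content(
    [video, "Summarize the key moments in this video."]
)
print(response.text)
```

Passing media and text in a single generate_content call is what the cross-modal analysis in the announcement refers to: the model reasons over both inputs together within its long context window.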