Amazon’s AI Roadmap With AWS CEO Garman

AWS CEO Matt Garman highlights the rapid growth and integration of AI across Amazon and AWS, with AI revenue reaching billions and most workloads focused on inference embedded in applications. He also discusses advancements in AI hardware, such as Project Rainier and custom accelerators, along with AWS’s global expansion to support increasing customer demand for AI-driven solutions.

In the video, AWS CEO Matt Garman reflects on his first year in the role, highlighting the rapid pace of innovation and customer adoption of new AI technologies. He emphasizes that many customers are migrating their entire operations to the cloud, driven by the explosion of AI and generative technologies. Garman notes that AWS’s AI business has reached a multi-billion dollar run rate, with a mix of customers running their own models, using Amazon Bedrock, and applying AI across products such as Amazon Q and Alexa. He underscores that AI is at the early stages of transforming industries and jobs, and that current revenue is just the beginning of its potential.

Garman discusses the significant revenue generated from generative AI on AWS, which he says runs into multiple billions of dollars. He explains that AI is integrated into many Amazon services, including retail, fulfillment centers, Alexa, and customer contact centers, showing how pervasive AI usage is across Amazon’s ecosystem. Customers are leveraging AI for efficiency, enhanced user experiences, and automation. He highlights that AI workloads are increasingly focused on inference, which embeds AI capabilities into applications, rather than training, which is more resource-intensive and less common in current usage.

The conversation also covers inference as a core building block of AI applications, comparable to compute or storage. Garman notes that most AI usage today is inference, with estimates suggesting that 80–90% of AI workloads will eventually be dedicated to it. This shift reflects AI’s integration into everyday applications, where it enhances functionality and user interaction. He emphasizes that AI is no longer a separate component but is embedded within applications, making it difficult to attribute revenue to AI alone as it becomes a fundamental part of the user experience.

Regarding technological advancements, Garman provides updates on Project Rainier, a collaboration with Anthropic to build a massive compute cluster for training next-generation AI models. The project involves deploying Amazon’s custom-built accelerators, Trainium 2, to support large-scale model training. He highlights the performance and cost-efficiency improvements of these systems, which are crucial for making AI more affordable and accessible. Garman stresses that innovation in silicon and software is essential to reduce costs and enable broader AI adoption across industries.

Finally, Garman discusses AWS’s global expansion plans, including new data center regions in Latin America and Europe. He mentions the launch of the Mexico region and upcoming regions in Chile and Brazil, emphasizing the importance of expanding capacity to meet growing customer demand. He also introduces the European Sovereign Cloud, designed specifically for EU-focused workloads that require strict data sovereignty and security. Overall, Garman portrays AWS as committed to providing the most capable, flexible, and secure cloud infrastructure to support the ongoing growth and evolution of AI worldwide.