Tuhin, founder and CEO of Base10, discusses how his company optimizes AI inference infrastructure for rapidly growing AI applications. Base10 provides scalable, reliable, and cost-effective compute, addresses the problems teams run into when building on traditional cloud providers, and anticipates massive future demand for GPUs. He highlights Nvidia’s dominance in AI hardware, the importance of open-source models, and the evolving AI ecosystem, emphasizing adaptability and innovation as keys to success in a fast-changing AI landscape.
The discussion begins with Tuhin describing his path from finance to technology and machine learning, which eventually led him to found Base10. The company focuses on production inference, powering some of the fastest-growing AI companies with infrastructure optimized to run AI models efficiently and reliably. Base10 works with notable clients such as WhisperFlow, a speech-to-text app, and Abridge, a healthcare ambient scribe. It helps them run multiple custom AI models with the low latency and high reliability that their user experience depends on.
Tuhin explains the strategic advantage Base10 offers over traditional cloud providers like AWS or GCP. Many companies initially try to build their inference stack directly on these clouds, but they often struggle with optimization, reliability, and multi-cloud flexibility. Base10 provides a developer platform that abstracts away these complexities while offering performance, fault tolerance, and security. The company also supports the growing trend of post-training open-source models. Although these models currently trail frontier models by about 90 days, they offer significant cost savings and customization opportunities for scaling AI applications.
The conversation delves into the economics and future of AI inference, highlighting the massive expected growth in inference demand, potentially a billion-fold increase. This surge creates a critical need for scalable, cost-effective compute resources. Base10 currently rents GPUs across multiple cloud providers but anticipates owning hardware in the future to secure access and reduce costs, given the severe scarcity and rising prices of GPUs. The company projects a need for around 150,000 GPUs in two years, which it equates to approximately $7 billion in compute spend, underscoring the scale and urgency of the infrastructure challenge.
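As a rough back-of-envelope check on those projections, the sketch below divides the quoted spend by the quoted GPU count. The summary does not say what period the $7 billion covers, so the hourly figure assumes, purely for illustration, that it represents one year of round-the-clock usage; the function names are hypothetical as well.

```python
# Figures quoted in the summary above.
PROJECTED_GPUS = 150_000
PROJECTED_SPEND_USD = 7_000_000_000

# Assumption (not from the source): the spend covers one year of 24/7 usage.
ASSUMED_HOURS = 365 * 24


def spend_per_gpu(total_spend: float, gpu_count: int) -> float:
    """Average spend attributed to each GPU."""
    return total_spend / gpu_count


def implied_hourly_rate(total_spend: float, gpu_count: int, hours: int) -> float:
    """Average cost per GPU-hour if the spend is spread evenly over `hours`."""
    return spend_per_gpu(total_spend, gpu_count) / hours


if __name__ == "__main__":
    per_gpu = spend_per_gpu(PROJECTED_SPEND_USD, PROJECTED_GPUS)
    hourly = implied_hourly_rate(PROJECTED_SPEND_USD, PROJECTED_GPUS, ASSUMED_HOURS)
    print(f"Implied spend per GPU: ${per_gpu:,.0f}")
    print(f"Implied rate per GPU-hour (one-year assumption): ${hourly:,.2f}")
```

The per-GPU figure, roughly $46,700, follows directly from the two quoted numbers. The hourly rate depends entirely on the assumed period and would halve if the spend were spread over two years instead.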
Tuhin also discusses the competitive landscape of AI hardware, attributing Nvidia’s dominance to its mature supply chain, its developer ecosystem (CUDA), and its performance advantages. While he acknowledges emerging competitors and heterogeneous compute architectures, Base10 prioritizes speed and reliability by building on Nvidia’s ecosystem. On the software side, he describes inference as a sticky, mission-critical service akin to databases: customers prefer stable, high-performance platforms so that their AI-powered products are not disrupted.
Finally, Tuhin shares broader reflections on the AI ecosystem, including the importance of open-source models for national security and innovation, the evolving business models around AI compute, and the potential for modular data centers to standardize and industrialize compute infrastructure. He advises students to focus on areas they enjoy. He also notes how quickly AI technology keeps changing: new architectures and models continuously reshape the landscape, making adaptability and innovation essential for future success.