This OPEN-SOURCE Chip is Faster Than a GPU (And CHEAPER!) | Tenstorrent Chips Explained

Jim Keller’s open-source AI chip features 352 independent RISC-V cores with local SRAM and relies on compiler-managed data movement. It is presented as outperforming Nvidia’s GPUs in inference tasks while costing significantly less, thanks to cheaper GDDR6 memory and integrated Ethernet for scalable multi-chip systems. The chip faces adoption challenges around software maturity and enterprise reliability requirements, but its design and open-source model pose a strong competitive threat to Nvidia’s AI hardware dominance.

Jim Keller, a legendary chip architect known for his work on the iPhone, PlayStation, and AMD’s comeback, has developed an open-source AI chip that outperforms Nvidia’s best inference systems while costing five times less to operate. Unlike traditional GPUs, Keller’s chip abandons Nvidia’s architecture and software, opting for a completely new design that leverages the predictable nature of AI workloads. By removing hardware schedulers and traffic controllers, the chip relies on a sophisticated compiler to manage data movement, enabling highly efficient and independent operation of its 352 RISC-V cores.

The chip’s architecture is unique in that each core has its own local SRAM, so cores can work independently without waiting on one another. Instead of the fast but expensive HBM memory that Nvidia uses, Keller’s design uses cheaper GDDR6 memory, which typically has lower bandwidth. To compensate, the compiler schedules prefetches so that data arrives in local SRAM just before a core needs it, hiding memory latency and keeping the cores busy. Prefetching hides latency but cannot add bandwidth, so this approach works well for smaller models but becomes bandwidth-limited with very large models, where Nvidia’s GPUs still hold an advantage.
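The trade-off between hiding latency and running out of bandwidth can be illustrated with a small timing model. This is only a sketch under stated assumptions, not Tenstorrent's compiler or scheduler: the function names, the fixed per-tile load and compute times, and the two-buffer (double-buffering) scheme are all hypothetical and chosen for illustration.

```python
# Toy timing model for compiler-scheduled prefetching (double buffering).
# All names and numbers here are illustrative assumptions, not a real API.

def serial_time(n_tiles, load, compute):
    """Each tile is loaded, then computed, with no overlap."""
    return n_tiles * (load + compute)

def prefetched_time(n_tiles, load, compute):
    """While tile i is computed, tile i+1 is loaded into a second buffer.

    The first load and the last compute cannot be overlapped; every step
    in between costs whichever of load or compute is slower.
    """
    if n_tiles == 0:
        return 0
    return load + (n_tiles - 1) * max(load, compute) + compute

def core_utilization(n_tiles, load, compute):
    """Fraction of total time the core spends computing."""
    total = prefetched_time(n_tiles, load, compute)
    return n_tiles * compute / total if total else 0.0

# Compute-bound case: loads are short, so prefetching hides them.
print(serial_time(4, load=1, compute=3), prefetched_time(4, load=1, compute=3))
# Memory-bound case: loads dominate, so the core still waits on memory.
print(core_utilization(4, load=3, compute=1))
```

In the first case the loads disappear almost entirely behind compute. In the second, the slower memory sets the pace no matter how well the prefetches are scheduled, which mirrors the bandwidth limit described above for very large models.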

Scaling to large AI models is addressed by integrating 400 Gb/s Ethernet directly into each chip, turning every chip into both a processor and a router. This design allows multiple chips and servers to work together as a single unified system, avoiding the communication bottlenecks common in other architectures. In benchmarks, this system delivers performance comparable to Nvidia’s at a fraction of the cost, around $6 per million tokens versus Nvidia’s $30, making it a highly cost-effective solution for AI inference.
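The "five times less" figure follows directly from the two per-token prices quoted above. A minimal sketch of that arithmetic, where the monthly token volume is a made-up number for illustration and the helper name is hypothetical:

```python
# Cost comparison using the per-million-token prices quoted in the text.
# The monthly volume below is a hypothetical workload, not a measured figure.

TENSTORRENT_USD_PER_M_TOKENS = 6
NVIDIA_USD_PER_M_TOKENS = 30

def cost_for_tokens(tokens, usd_per_million):
    """Total cost in USD for a given number of tokens."""
    return tokens / 1_000_000 * usd_per_million

monthly_tokens = 2_000_000_000  # assumed workload: 2 billion tokens per month

tenstorrent_cost = cost_for_tokens(monthly_tokens, TENSTORRENT_USD_PER_M_TOKENS)
nvidia_cost = cost_for_tokens(monthly_tokens, NVIDIA_USD_PER_M_TOKENS)
cost_ratio = nvidia_cost / tenstorrent_cost

print(f"Tenstorrent: ${tenstorrent_cost:,.0f}  Nvidia: ${nvidia_cost:,.0f}")
print(f"Ratio: {cost_ratio:.1f}x")
```

The ratio does not depend on the assumed volume; only the absolute savings scale with it.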

Despite its technical advantages, widespread adoption faces challenges. The software, while free and open-source, is still maturing and supports only about 90% of Hugging Face models out of the box. Enterprise customers require near-perfect compatibility and reliability, especially for critical applications like medical imaging and financial fraud detection. Additionally, Jim Keller’s history of leaving projects before they fully mature raises questions about his long-term commitment to this chip’s ecosystem, although this time he is serving as CEO, which suggests a deeper involvement.

Finally, Tenstorrent’s chip represents a significant threat to Nvidia’s dominance in AI hardware, but it is not the only challenger. Nvidia has previously acquired competitors to maintain its monopoly, spending $20 billion to eliminate threats. The open-source nature of Tenstorrent’s approach and Keller’s leadership could accelerate innovation and competition in the AI chip market, potentially reshaping the landscape away from Nvidia’s control.