The video reviews Tachyum’s ambitious but troubled effort to develop an AI accelerator that combines CPU and AI capabilities, covering its architectural updates, significant funding, and bold claims about performance and scalability. Despite the company's progress and renewed efforts, the presenter remains skeptical about the feasibility of Tachyum’s specifications and calls for tangible silicon demonstrations and transparency before endorsing the technology.
The video discusses the long and tumultuous journey of Tachyum, a startup aiming to create a revolutionary AI accelerator that combines CPU and AI capabilities. Founded by a successful CEO who previously co-founded SandForce, Tachyum has been promising groundbreaking technology for nearly seven years but has faced numerous setbacks. Early presentations showcased a design with many small, underpowered cores and ambitious claims about memory and connectivity that drew skepticism. The company also became embroiled in a legal dispute with EDA company Cadence, alleging that conflicts of interest led Cadence to interfere with its chip development, which delayed progress further.
After settling the legal issues, Tachyum updated its architecture and switched to Synopsys tools, presenting a more conventional out-of-order design similar to mainstream processors. They also announced FPGA emulation and support for multiple instruction sets, including x86, ARM, and RISC-V, alongside their custom ISA. Despite these developments, the company remained largely silent, with limited public demonstrations and mostly marketing efforts. Recently, Tachyum secured a significant funding round of around $250 million, bringing their total funding to about $300 million, with commitments for substantial hardware sales, signaling renewed efforts to tape out silicon for partners.
The latest architectural update reveals a chiplet-based design targeting a 2nm process, likely from TSMC, with each chiplet containing 256 cores and the ability to scale up to 1024 cores per package running at 6 GHz. The core design features unusually large but low-associativity L1 caches and a complex vector and matrix unit setup supporting various data types, including proprietary ones. The cache hierarchy and mesh interconnect resemble IBM’s Telum architecture, with claims of extremely high bandwidth and PCIe 7.0 support. However, the presenter expresses skepticism about some of these claims, particularly regarding cache associativity, branch execution, memory bandwidth amplification, and the feasibility of the mesh handling all the data traffic.
Concerns are also raised about the practicality of Tachyum’s ambitious specifications, such as the enormous package size with over 20,000 pins, the feasibility of using a 2nm process by 2026 given industry constraints, and the extremely high 1600-watt power envelope for the chip. The presenter doubts the ability to achieve 6 GHz clock speeds across so many cores within reasonable power limits and questions the scalability of multi-socket configurations. The discussion highlights the challenges of competing with established players like AMD, Intel, and Nvidia, especially given the lack of publicly available silicon or detailed ISA documentation despite years of promises.
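The power concern can be made concrete with back-of-the-envelope arithmetic on the figures quoted above (256 cores per chiplet, up to 1024 cores per package, a 1600-watt envelope). The sketch below is only a sanity check, not anything Tachyum or the presenter published: the function names are hypothetical, and it assumes the entire envelope is available to the cores, which overstates the per-core budget because caches, the mesh, memory controllers, and I/O also draw power.

```python
# Figures quoted in the summary; everything derived from them is plain arithmetic.
CORES_PER_CHIPLET = 256
MAX_CORES_PER_PACKAGE = 1024
PACKAGE_POWER_WATTS = 1600


def chiplets_per_package(max_cores=MAX_CORES_PER_PACKAGE,
                         cores_per_chiplet=CORES_PER_CHIPLET):
    """Number of chiplets implied by the quoted core counts."""
    return max_cores // cores_per_chiplet


def watts_per_core(package_watts=PACKAGE_POWER_WATTS,
                   cores=MAX_CORES_PER_PACKAGE):
    """Upper bound on per-core power.

    Assumption (not from the source): the whole envelope goes to the cores.
    Real designs spend a share on uncore logic, so the true figure is lower.
    """
    return package_watts / cores


if __name__ == "__main__":
    print(f"Chiplets per package: {chiplets_per_package()}")
    print(f"Upper-bound power per core: {watts_per_core():.4f} W")
```

Under that generous assumption, the quoted numbers imply four chiplets per package and at most about 1.56 W per core, which is the budget each core would have to hit the claimed 6 GHz.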
In conclusion, while Tachyum has made some architectural progress and secured substantial funding, the presenter remains cautious and somewhat skeptical about the company’s claims and timeline. The video emphasizes the need for tangible silicon demonstrations and transparency before the technology can be endorsed. Viewers are invited to share their opinions on whether further coverage of Tachyum is warranted, with the presenter acknowledging the company’s potential but also the many hurdles it faces in delivering on its ambitious vision.