Running Local AI on AMD

artesia · 26 May 2026 16:00

The video highlights the growing viability of running local AI on AMD hardware, showcasing a Ryzen Threadripper and Radeon AI Pro workstation that handles large language models, generative tasks, and training workloads using AMD’s ROCm platform and compatible software tools. It emphasizes the benefits of local AI for cost, privacy, and control, and argues that AMD’s ecosystem has matured enough to support AI development and real-time applications across a range of domains.

artesia · 26 May 2026 16:21

The video emphasizes the growing importance of local AI in the future of artificial intelligence, highlighting how open-weight models have significantly closed the performance gap with frontier models, now trailing by only a few months. The rising costs associated with token usage in cloud-based AI services, especially for agentic and reasoning tasks, make local AI solutions increasingly attractive. Additionally, privacy concerns and the desire for greater control over AI usage are driving interest in running AI workloads on personal hardware rather than relying solely on expensive cloud services.

The presenter showcases an AMD-based workstation equipped with a Ryzen Threadripper 9980X CPU and an AMD Radeon AI Pro R9700 GPU with 32GB of VRAM. This setup runs local large language models (LLMs) such as Qwen 3.6 and Gemma at token generation rates fast enough for real-time applications like coding agents. Tools like LM Studio and Ollama are used to run these models, and the ample VRAM means fewer compromises on quantization, leaving room for flexible context window sizes and on-the-fly document processing.
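To make the link between VRAM and quantization concrete, here is a rough back-of-the-envelope sketch. It is not from the video: the function names, the 27B parameter count, and the 4 GiB reserve for context (KV cache) and runtime overhead are illustrative assumptions, and real memory use depends on the runtime and quantization format.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(
    params_billion: float,
    bits_per_weight: float,
    vram_gib: float = 32.0,    # card size from the post, treated as GiB for simplicity
    reserve_gib: float = 4.0,  # assumed headroom for KV cache and runtime overhead
) -> bool:
    """Return True if the weights plus the assumed reserve fit in VRAM."""
    return weight_memory_gib(params_billion, bits_per_weight) + reserve_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical 27B-parameter model at several precisions.
    for bits in (16, 8, 4):
        size = weight_memory_gib(27, bits)
        print(f"{bits:>2}-bit: {size:5.1f} GiB weights, fits: {fits_in_vram(27, bits)}")
```

Under these assumptions, a hypothetical 27B model would not fit at 16-bit but would at 8-bit or 4-bit, which is the kind of headroom the 32GB card provides.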

A key enabler for this performance is AMD’s Radeon Open Compute platform (ROCm), which provides software support for deep learning frameworks including PyTorch. Unlike in the past, ROCm now works with these frameworks largely out of the box, allowing users to run and even fine-tune models locally without major issues. The presenter highlights that ROCm supports training as well as inference, with official ROCm-optimized PyTorch builds and comprehensive documentation, making AMD GPUs a viable option for AI development beyond just inference.
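As a quick sanity check, a script along these lines can report whether the installed PyTorch is a ROCm build and whether it sees a GPU. This is a minimal sketch, not something shown in the video; the helper name `describe_pytorch_backend` is made up for illustration. It relies on the fact that ROCm builds of PyTorch set `torch.version.hip` and expose AMD GPUs through the usual `torch.cuda` API.

```python
def describe_pytorch_backend() -> dict:
    """Report which PyTorch backend is installed and whether a GPU is visible."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all.
        return {"installed": False, "backend": None,
                "gpu_available": False, "device_name": None}

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    if getattr(torch.version, "hip", None):
        backend = "rocm"
    elif getattr(torch.version, "cuda", None):
        backend = "cuda"
    else:
        backend = "cpu"

    # On ROCm builds, AMD GPUs are reported through torch.cuda as well.
    gpu_available = torch.cuda.is_available()
    device_name = torch.cuda.get_device_name(0) if gpu_available else None

    return {"installed": True, "backend": backend,
            "gpu_available": gpu_available, "device_name": device_name}


if __name__ == "__main__":
    print(describe_pytorch_backend())
```

If the backend comes back as `cpu` on an AMD machine, the likely cause is a default PyTorch wheel rather than a ROCm one.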

Beyond language models, the system also handles generative tasks such as image and video creation using tools like ComfyUI, which supports ROCm for GPU acceleration. The presenter demonstrates generating various images and videos, noting that the system runs popular models efficiently. This versatility extends to audio and 3D model generation, showcasing the workstation as a general-purpose AI content creation platform.

Finally, the video underscores the advantages of running Linux on the AMD system to fully leverage ROCm’s capabilities, including training models and running AI applications with frameworks like Transformers and vLLM. The presenter shares examples of training a simple ResNet model and running the Gemma 4 model locally with a Gradio interface, illustrating the practical usability of the setup. Overall, the video concludes that AMD’s hardware and software ecosystem has matured significantly, making local AI a realistic option for a wide range of workloads, and encourages viewers to explore this path alongside cloud-based solutions.
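For a sense of what a simple ResNet-style training run looks like in PyTorch, here is a minimal sketch. It is not the presenter's code: the tiny network, the random stand-in data, and the function name `train_tiny_resnet` are all assumptions for illustration. The same script should run unchanged on a ROCm build, because PyTorch addresses AMD GPUs through the `cuda` device name.

```python
def train_tiny_resnet(steps: int = 5, seed: int = 0) -> list[float]:
    """Train a tiny residual CNN on random data and return the loss per step."""
    import torch
    from torch import nn

    torch.manual_seed(seed)
    # ROCm builds expose AMD GPUs as "cuda"; fall back to CPU otherwise.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    class ResidualBlock(nn.Module):
        def __init__(self, channels: int) -> None:
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)

        def forward(self, x):
            out = torch.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return torch.relu(out + x)  # skip connection

    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1),
        ResidualBlock(16),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(16, 10),
    ).to(device)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    # Random tensors standing in for a real image dataset (assumption).
    images = torch.randn(32, 3, 32, 32, device=device)
    labels = torch.randint(0, 10, (32,), device=device)

    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


if __name__ == "__main__":
    print(train_tiny_resnet())
```

A real run would swap the random tensors for a dataset and data loader and use a full-size ResNet; the loop structure stays the same.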