Local AI on a Laptop in 2026 (AMD Ryzen AI PRO 128GB)

The video showcases how a laptop with an AMD Ryzen AI Pro chip and 128GB of RAM can efficiently run advanced open-source AI models (language, coding, and vision models alike) entirely offline, using the Ollama interface for easy model management. The presenter highlights the benefits of local AI, such as privacy and independence from the cloud, and demonstrates practical tasks like coding and image analysis, encouraging viewers to explore these capabilities themselves.

The video demonstrates the capabilities of running advanced AI models locally on a laptop equipped with the AMD Ryzen AI Pro chip and 128GB of RAM. The presenter explores how this hardware can handle various open-source AI models, including large language models (LLMs) and vision models, without relying on cloud services. The setup is intended to showcase what users can achieve offline, such as during a flight, by leveraging the power of modern laptop hardware for AI tasks.

The presenter uses the Ollama interface to easily download and run different models, such as GPT-OSS 20B, Qwen3-Coder 30B, and Qwen3-VL 8B (a vision model). Performance tests show that the GPT-OSS 20B model generates around 40 tokens per second, which is more than sufficient for reading and coding tasks. Perhaps surprisingly, the larger Qwen3-Coder 30B model runs even faster, at about 51 tokens per second; the coder model uses a mixture-of-experts design that activates only a fraction of its parameters per token, which helps explain how a larger model can run faster on the same AMD hardware.
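Ollama's REST API reports token counts and timings in its responses, so tokens-per-second figures like those quoted above can be computed from the `eval_count` and `eval_duration` fields (the latter is in nanoseconds). A minimal sketch, using a hard-coded example response rather than a live server (the values shown are illustrative, not taken from the video):

```python
# Compute generation speed from an Ollama /api/generate response.
# eval_count = number of generated tokens; eval_duration = time in nanoseconds.

def tokens_per_second(response: dict) -> float:
    """Return generated tokens per second from Ollama's timing fields."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)

# Illustrative response fragment: 400 tokens generated in 10 seconds.
resp = {"model": "gpt-oss:20b", "eval_count": 400, "eval_duration": 10_000_000_000}
print(f"{tokens_per_second(resp):.1f} tokens/s")  # 400 tokens / 10 s -> 40.0 tokens/s
```

The same fields appear in `/api/chat` responses, so one helper covers both endpoints.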

For vision tasks, the Qwen3-VL 8B model is tested by analyzing a screenshot of Hacker News headlines. The model accurately extracts the top three headlines from the image, all processed locally and offline. The presenter notes that despite having only 8 billion parameters, this vision model performs well at OCR and basic image understanding, making it suitable for lightweight local applications.
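With Ollama, images are passed to a vision model as base64-encoded strings alongside the text prompt. A sketch of building such a request for the `/api/chat` endpoint is below; it only constructs the payload (sending it requires a running Ollama server), and the model tag `qwen3-vl:8b` is an assumption about the local model name:

```python
import base64

def build_vision_request(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build an Ollama /api/chat payload with one base64-encoded image."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": prompt,
            # Ollama expects images as a list of base64 strings on the message.
            "images": [base64.b64encode(image_bytes).decode("ascii")],
        }],
        "stream": False,
    }

# In practice, image_bytes would be the raw bytes of the screenshot file.
image_bytes = b"\x89PNG placeholder"  # illustrative stand-in for real PNG data
payload = build_vision_request(
    "qwen3-vl:8b", "List the top three headlines in this image.", image_bytes
)
# POST the payload as JSON to http://localhost:11434/api/chat
```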

The video also covers agentic workflows using the open-source OpenCode tool, which mimics the functionality of cloud-based coding assistants but runs entirely on the local machine. The presenter demonstrates creating and editing HTML and Python files with OpenCode, noting that while it works well for smaller tasks, context-heavy operations can slow down local models. For quicker results, the presenter prefers prompting the Qwen3-Coder model directly through the Ollama interface, for example to generate a Python snake game.
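Since context-heavy operations are the main bottleneck for local models, one common lever is capping the context window per request via Ollama's `num_ctx` option. A sketch of a direct code-generation request like the snake-game prompt; the model tag and the 8192-token default are assumptions, and only the payload is built here:

```python
def build_codegen_request(model: str, prompt: str, num_ctx: int = 8192) -> dict:
    """Build an Ollama /api/generate payload with an explicit context window.

    Long contexts slow down prompt processing on local hardware, so capping
    num_ctx trades conversational memory for speed.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }

payload = build_codegen_request(
    "qwen3-coder:30b", "Write a snake game in Python using curses."
)
# POST the payload as JSON to http://localhost:11434/api/generate
```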

In conclusion, the presenter emphasizes the benefits of running AI models locally: improved privacy, data security, and independence from cloud providers. The combination of conversational, coding, and vision models covers most offline AI needs, making the AMD Ryzen AI Pro laptop a powerful tool for developers and professionals. The video ends by mentioning AMD’s free loaner program, encouraging viewers to try out these capabilities for themselves and experience the advantages of local AI workflows.