Apple’s New M5 Max Changes the Local AI Story

The video reviews Apple’s new M5 Max MacBook Pro, highlighting its significant performance improvements for software development and local AI workloads, thanks to a new GPU architecture, higher memory bandwidth, and faster storage. The M5 Max outperforms its predecessor and even rivals the desktop-class M3 Ultra in several real-world tasks, making it a compelling choice for developers and AI practitioners.

The presenter compares the M5 Max MacBook Pro with the previous M4 Max MacBook Pro and the M3 Ultra Mac Studio, focusing on software development and local AI workloads. Apple claims the M5 Max has a new GPU architecture with a neural accelerator in every GPU core, delivering over four times the peak GPU compute for AI of the previous generation, along with up to 614 GB/s of memory bandwidth. The reviewer sets out to test how these claims translate into real-world tasks that matter to developers and AI practitioners.

The M5 Max shows notable CPU gains. Single-core performance, which drives system responsiveness and JavaScript-heavy applications, reaches a new high with a Speedometer 3.1 score of 60.5, ahead of both the M4 Max and the M3 Ultra. Multi-core performance, which affects code compilation and other parallel workloads, also improves significantly. In a Mandelbrot benchmark written in Python, the M5 Max finishes in about 11.7 seconds, compared with 14.6 seconds on the M4 Max, a reduction of roughly 20%. The M3 Ultra remains faster thanks to its 32 cores, but the M5 Max narrows the gap, which is notable for a laptop competing with a desktop.
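The summary does not include the presenter’s benchmark script, so the following is only a minimal sketch of how a multi-core Mandelbrot test of this kind could be written in Python. It assumes the image is split by row across worker processes; the function names (escape_count, render_row, render), the grid size, and the iteration limit are all illustrative choices, not the values used in the video.

```python
import os
import time
from concurrent.futures import ProcessPoolExecutor


def escape_count(c, max_iter):
    """Return the iteration at which z = z*z + c leaves the radius-2 disk."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if z.real * z.real + z.imag * z.imag > 4.0:
            return i
    return max_iter  # the point never escaped within max_iter steps


def render_row(task):
    """Compute one image row; takes a tuple so it can be sent to a worker."""
    y, width, height, max_iter = task
    imag = -1.0 + 2.0 * y / (height - 1)
    return [
        escape_count(complex(-2.0 + 3.0 * x / (width - 1), imag), max_iter)
        for x in range(width)
    ]


def render(width, height, max_iter, workers):
    """Render the whole grid, serially or across `workers` processes."""
    tasks = [(y, width, height, max_iter) for y in range(height)]
    if workers <= 1:
        return [render_row(t) for t in tasks]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_row, tasks, chunksize=8))


if __name__ == "__main__":
    # Compare one process against one process per logical core.
    for workers in (1, os.cpu_count() or 1):
        start = time.perf_counter()
        render(300, 200, 200, workers)
        elapsed = time.perf_counter() - start
        print(f"{workers:>2} worker(s): {elapsed:.2f} s")
```

Because each row is independent, the work scales with the number of cores until process start-up and result transfer begin to dominate, which is why a benchmark like this rewards a higher core count.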

A major architectural change in the M5 Max is its core layout. Instead of the earlier mix of efficiency and performance cores, it uses six “super” cores and twelve “performance” cores, with no efficiency cores at all. This change, together with the higher core count, boosts multi-core performance. The M5 Max also has a much faster SSD: sequential reads reach almost 14,000 MB/s, nearly double the speed of the M4 Max and M3 Ultra. Faster storage benefits loading large AI models, caching, and code compilation, as well as general system responsiveness.
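To make the storage figure more concrete, here is a rough sketch of how sequential read throughput can be measured in Python. It is not the tool used in the video; the function name sequential_read, the 8 MiB chunk size, and the 64 MiB scratch file are assumptions made for illustration.

```python
import os
import tempfile
import time


def sequential_read(path, chunk_bytes=8 * 1024 * 1024):
    """Read a file front to back and return (bytes_read, MB_per_second)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer layer; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            total += len(chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / 1e6 / elapsed


if __name__ == "__main__":
    # Create a 64 MiB scratch file, time a full read, then clean up.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        block = bytes(1024 * 1024)
        for _ in range(64):
            tmp.write(block)
        scratch = tmp.name
    try:
        size, rate = sequential_read(scratch)
        print(f"read {size / 1e6:.0f} MB at {rate:.0f} MB/s")
    finally:
        os.remove(scratch)
```

A freshly written file is usually served from the operating system’s page cache, so the number printed by this demo mostly reflects memory speed rather than the SSD. For a figure comparable to the one quoted above, point the function at an existing file that is larger than RAM or has not been read recently, such as a large model file.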

For local AI workloads, the video examines storage speed, prompt processing (PP), and token generation (TG) with large language models (LLMs). The M5 Max’s memory bandwidth is higher than the M4 Max’s and even slightly exceeds the M3 Ultra’s in sustained memory throughput, as measured by the STREAM Triad benchmark. In practical LLM tests using LM Studio and llama.cpp, the M5 Max delivers a significant leap in prompt processing speed: over four times faster than the M4 Max, and faster than the M3 Ultra as well. Token generation speed also improves, though the gains are more modest and depend on the model.
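The difference between the two metrics is easier to see with some arithmetic. The sketch below splits the time for one response into a prompt-processing part and a token-generation part, and adds a commonly used rule of thumb: for a dense model, generation speed is roughly capped by memory bandwidth divided by the size of the weights read per token. The token counts, rates, and the 40 GB model size are hypothetical values chosen for illustration, not measurements from the video; only the 614 GB/s figure comes from Apple’s claim.

```python
def response_seconds(prompt_tokens, output_tokens, pp_tok_s, tg_tok_s):
    """Return (prefill, decode, total) seconds for one request.

    pp_tok_s: prompt processing rate in tokens per second
    tg_tok_s: token generation rate in tokens per second
    """
    prefill = prompt_tokens / pp_tok_s
    decode = output_tokens / tg_tok_s
    return prefill, decode, prefill + decode


def tg_ceiling_tok_s(bandwidth_gb_s, weights_gb):
    """Rough upper bound on generation speed for a dense model.

    Assumes every generated token requires reading all weights once,
    so the rate cannot exceed bandwidth / model size.
    """
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Hypothetical workload: a long prompt with a short answer.
    prompt, output = 8000, 500
    for label, pp in (("baseline PP", 400.0), ("4x faster PP", 1600.0)):
        prefill, decode, total = response_seconds(prompt, output, pp, 50.0)
        print(f"{label}: prefill {prefill:.1f} s, "
              f"decode {decode:.1f} s, total {total:.1f} s")

    # 614 GB/s is Apple's stated peak; 40 GB is a hypothetical model size.
    print(f"TG ceiling: {tg_ceiling_tok_s(614.0, 40.0):.1f} tok/s")
```

Under these assumptions, a fourfold prompt-processing speedup removes most of the wait before the first token on long prompts, while generation speed stays tied to memory bandwidth, which is consistent with the more modest token generation gains reported.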

Overall, the M5 Max MacBook Pro delivers substantial improvements for developers and AI users, especially in tasks that leverage its enhanced GPU and memory bandwidth. The laptop’s performance in both traditional software development and local AI inference brings it surprisingly close to the desktop-class M3 Ultra in some areas. The reviewer is impressed by the real-world impact of Apple’s architectural changes and looks forward to further testing, particularly with the upcoming M5 Ultra. The video concludes by inviting viewers to suggest additional tests and stay tuned for more in-depth comparisons.