The video reviews one year of experience with a local AI server rig built around four NVIDIA RTX 3090 GPUs, highlighting the cost-effective use of consumer-grade components such as the Gigabyte B650 motherboard and Ryzen 5 9600X CPU for inference workloads. It covers practical considerations including cooling, power management, and motherboard choice, concluding that this setup offers excellent value and performance for AI inference while advising viewers to tailor their builds to their specific needs and budgets.
The video provides a comprehensive one-year review of a local AI server rig built with four NVIDIA RTX 3090 GPUs, totaling 96 GB of VRAM. Rather than a server platform such as AMD EPYC, the rig is built on consumer AM5 hardware, which the reviewer found well suited to a multi-GPU inference setup. The frame is an old-school GPU mining frame, which remains an excellent choice for housing four to six GPUs, depending on their size. The reviewer notes some limitations caused by modifications made to accommodate a radiator, but still highly recommends the mining frame for its functionality and ease of use.
A significant focus is placed on the motherboard choice, a Gigabyte B650 Eagle, an affordable board with four full-size PCIe slots. It supports four GPUs and is well suited to inference workloads, where PCIe bandwidth is less critical: because a model's weights are loaded into VRAM once and stay resident, PCIe generation and lane width have minimal impact on inference performance, mattering mainly during the initial model load. The Ryzen 5 9600X CPU is recommended for its high single-core speed, which significantly affects token processing speed during inference.
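The claim that inference is VRAM-bound rather than PCIe-bound comes down to simple capacity arithmetic. A minimal sketch of that arithmetic, using illustrative model sizes and an assumed ~20% overhead figure (neither is from the video):

```python
# Rough sketch: does a quantized model fit in a multi-GPU VRAM pool?
# Model sizes and the 20% overhead factor are illustrative assumptions.

def model_vram_gb(params_billions: float, bits_per_weight: float,
                  overhead_frac: float = 0.2) -> float:
    """Approximate VRAM needed: weight storage plus a fractional
    overhead for KV cache, activations, and framework buffers."""
    weight_gb = params_billions * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
    return weight_gb * (1 + overhead_frac)

POOL_GB = 4 * 24  # four RTX 3090s at 24 GB each = 96 GB

for params, bits in [(70, 4), (123, 4), (70, 16)]:
    need = model_vram_gb(params, bits)
    verdict = "fits" if need <= POOL_GB else "does not fit"
    print(f"{params}B @ {bits}-bit: ~{need:.0f} GB -> {verdict} in {POOL_GB} GB pool")
```

Once the weights are resident, tokens are generated entirely out of VRAM, which is why a consumer board's narrower PCIe links rarely show up in inference benchmarks.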
The video also discusses the concept of having a lead GPU paired with follower GPUs, especially when mixing different GPU models. For tasks like image and video generation, having a powerful lead GPU such as a 4090 or 5090 is beneficial, while follower GPUs can handle less demanding workloads. However, for heavy image or video generation across multiple GPUs, a server or workstation motherboard with higher PCIe bandwidth and better CPU options like the AMD Threadripper Pro is advised to avoid bottlenecks.
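One common way to realize a lead/follower split like this is process-level GPU assignment via the `CUDA_VISIBLE_DEVICES` environment variable. A minimal sketch, assuming the lead GPU sits at index 0 and the followers at 1-3 (hypothetical indices, not stated in the video):

```python
# Sketch: assign the lead GPU to heavy image/video generation and the
# follower GPUs to LLM inference, one process per workload.
# Device indices are hypothetical assumptions for illustration.
import os
import subprocess

# Expose only the lead GPU (assumed index 0, e.g. a 4090 in a mixed rig)
# to the image/video-generation process:
lead_env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")

# Expose the follower GPUs (assumed indices 1-3) to the LLM server,
# which can shard the model across them:
follower_env = dict(os.environ, CUDA_VISIBLE_DEVICES="1,2,3")

# Each workload would then be launched in its own process, e.g.:
# subprocess.Popen(["python", "diffusion_server.py"], env=lead_env)
# subprocess.Popen(["python", "llm_server.py"], env=follower_env)
print(lead_env["CUDA_VISIBLE_DEVICES"], follower_env["CUDA_VISIBLE_DEVICES"])
```

Because each process only ever sees its assigned devices, a slow follower card cannot drag down the lead GPU's generation workload.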
Cooling and power considerations are also addressed. The reviewer suggests upgrading to larger, higher-speed fans for better airflow at acceptable noise levels, noting that the current silent fans, while very quiet, are somewhat dated. Water cooling is deemed unnecessary; cheaper air coolers like the Thermalright Peerless Assassin 120 are sufficient. Power consumption is managed by setting power limits on the GPUs, which significantly reduces idle wattage without affecting most inference workloads. For intensive image or video generation, however, higher power limits are needed to maximize performance.
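The payoff from power limiting (typically applied with `nvidia-smi -pl <watts>`) can be estimated with back-of-the-envelope arithmetic. A sketch using entirely hypothetical wattages, duty cycle, and electricity rate, since the video does not state exact figures:

```python
# Sketch: estimate annual electricity savings from capping GPU power limits.
# Every figure below is a hypothetical assumption for illustration;
# caps themselves are typically applied with `nvidia-smi -pl <watts>`.

NUM_GPUS = 4
IDLE_SAVED_W = 20        # assumed idle-draw reduction per GPU after capping
LOAD_SAVED_W = 100       # assumed reduction under load (e.g. 350 W -> 250 W cap)
LOAD_HOURS_PER_DAY = 4   # assumed hours per day under inference load
RATE_USD_PER_KWH = 0.15  # assumed electricity rate

idle_hours = 24 - LOAD_HOURS_PER_DAY
daily_kwh = NUM_GPUS * (IDLE_SAVED_W * idle_hours
                        + LOAD_SAVED_W * LOAD_HOURS_PER_DAY) / 1000
annual_usd = daily_kwh * 365 * RATE_USD_PER_KWH
print(f"~{daily_kwh:.1f} kWh/day saved, ~${annual_usd:.0f}/year "
      f"at ${RATE_USD_PER_KWH}/kWh")
```

Under these assumed numbers the cap saves a few kilowatt-hours per day; the trade-off, as the video notes, is that heavy generation workloads want the limit raised back up.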
In conclusion, the reviewer finds that this quad-3090 rig offers excellent value, especially for inference tasks, and could save around $1,250 compared with more expensive configurations. Consumer-grade components, the B650 motherboard, Ryzen 5 CPU, and standard DDR5 RAM, simplify the build and reduce costs. The rig is quieter than a typical desktop PC and performs well within its intended use cases. The video encourages viewers to weigh their specific use cases, power requirements, and budget when building a similar AI server rig, and invites questions and engagement in the comments.