Snowflake Arctic 480Bx128 MoE: Mixture of Experts models aren't dead YET!?

Snowflake has introduced Arctic, a model with 480 billion parameters and 128 experts built on a mixture of experts (MoE) architecture. The model challenges negative assumptions about MoE models with its promising performance, efficiency, and cost-effectiveness, making it a competitive option for enterprise-grade tasks in the AI landscape.

Snowflake, a data engineering and data-ops company, has introduced a model called Arctic. It packs 480 billion parameters across 128 experts, built on a mixture of experts architecture, and aims to challenge negative assumptions about MoE models, particularly in comparison to models like Llama 3. The performance of Snowflake Arctic is promising, especially in terms of efficiency and cost-effectiveness. The model is not yet on the LMSYS leaderboard, but its unusual architecture has sparked interest in the AI community.

Snowflake’s move to open-source Arctic reflects a growing trend of companies releasing AI models so that others can build on their infrastructure. The model’s dense-MoE hybrid transformer architecture combines a 10-billion-parameter dense transformer with a residual 128×3.66-billion-parameter MoE multi-layer perceptron (MLP), for roughly 480 billion total parameters. The top-2 gating mechanism used in Arctic activates only a small subset of those parameters for each token, reducing computational requirements without sacrificing performance. Snowflake Arctic’s focus on enterprise-grade tasks like SQL generation and coding sets it apart in the AI space.
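To make the routing idea concrete, here is a minimal, illustrative sketch of a dense-MoE hybrid layer with top-2 gating. The dimensions are toy values chosen for readability, not Arctic’s actual configuration, and the single-matrix “experts” stand in for full MLP blocks.

```python
import numpy as np

# Toy illustration of top-2 gating in a dense-MoE hybrid layer.
# Dimensions are deliberately tiny; Arctic's real configuration
# (10B dense transformer + 128 x ~3.66B expert MLPs) is taken from
# Snowflake's published description, not reproduced here.

rng = np.random.default_rng(0)

d_model = 16          # hidden size (toy)
n_experts = 8         # Arctic uses 128
top_k = 2             # Arctic uses top-2 gating

# Router, expert, and dense-path weights (each "expert" is a single linear layer here).
router_w = rng.normal(size=(d_model, n_experts))
expert_w = rng.normal(size=(n_experts, d_model, d_model))
dense_w = rng.normal(size=(d_model, d_model))   # residual dense path

def moe_hybrid_layer(x: np.ndarray) -> np.ndarray:
    """x: (tokens, d_model) -> (tokens, d_model)."""
    # 1. Dense residual path: every token passes through the dense weights.
    dense_out = x @ dense_w

    # 2. Router scores -> softmax probabilities per token.
    logits = x @ router_w
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)

    # 3. Keep only the top-k experts per token and renormalize their gates.
    top_idx = np.argsort(probs, axis=-1)[:, -top_k:]
    moe_out = np.zeros_like(x)
    for t in range(x.shape[0]):
        gate = probs[t, top_idx[t]]
        gate = gate / gate.sum()
        for g, e in zip(gate, top_idx[t]):
            # Only top_k of n_experts expert matrices are touched per token,
            # which is why active parameters are far fewer than total parameters.
            moe_out[t] += g * (x[t] @ expert_w[e])

    # 4. Combine the dense and MoE paths.
    return dense_out + moe_out

tokens = rng.normal(size=(4, d_model))
print(moe_hybrid_layer(tokens).shape)  # (4, 16)
```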

The integration of Snowflake Arctic with NVIDIA TensorRT-LLM enhances inference efficiency, making it a cost-effective solution for enterprise tasks. The architecture also allows for efficient training by overlapping communication and computation, minimizing overhead. On enterprise-oriented benchmarks, Arctic is reported to be on par with or better than models like Llama 3 8B and Llama 3 70B. With only about 17 billion active parameters per token, its inference efficiency shows promising results, positioning it as a competitive option for AI deployments.
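The efficiency argument comes down to simple arithmetic. Assuming the published figures of a 10B dense transformer, 128 experts of roughly 3.66B parameters each, and top-2 routing, only a small fraction of the total parameters is touched per token:

```python
# Back-of-the-envelope parameter arithmetic, assuming the figures Snowflake
# has published for Arctic (10B dense transformer, 128 experts of ~3.66B
# parameters each, top-2 gating). The per-expert size is approximate.

dense_params = 10e9
expert_params = 3.66e9
n_experts = 128
top_k = 2

total_params = dense_params + n_experts * expert_params   # ~478B, i.e. ~480B
active_params = dense_params + top_k * expert_params      # ~17.3B per token

print(f"total:  {total_params / 1e9:.0f}B")
print(f"active: {active_params / 1e9:.1f}B")
print(f"active fraction: {active_params / total_params:.1%}")  # ~3.6%
```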

Despite the efficiency gains of Snowflake Arctic, models like Mixtral 8x22B still outperform it on specific tasks, raising questions about the model’s overall effectiveness. Arctic’s dense-MoE hybrid transformer offers a distinctive way of combining the benefits of monolithic dense models and mixture of experts models. Its focus on enterprise intelligence tasks and its inference-efficiency optimizations make it a compelling option for businesses. Availability on platforms like Hugging Face and the NVIDIA API Catalog shows that it is ready for deployment.
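For readers who want to try the model, the following is a hedged sketch of loading the instruct checkpoint with Hugging Face Transformers. The repository id `Snowflake/snowflake-arctic-instruct` and the `trust_remote_code` requirement are assumptions based on the Hugging Face release, and a 480B-parameter model realistically needs a multi-GPU node and quantization rather than this plain bfloat16 load.

```python
# Hedged sketch: loading the Arctic instruct checkpoint with Hugging Face
# Transformers. The repo id and trust_remote_code flag are assumptions based
# on the Hugging Face release; in practice a model of this size requires a
# multi-GPU node and quantization (e.g. FP8/INT4) rather than plain bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Snowflake/snowflake-arctic-instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",          # shard across available GPUs
    trust_remote_code=True,     # Arctic ships custom modeling code
)

prompt = "Write a SQL query that returns the top 5 customers by total revenue."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```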

Overall, Snowflake Arctic represents a significant advance for mixture of experts models, challenging assumptions and offering a cost-effective solution for enterprise-grade tasks. Its architecture, efficient training methods, and inference performance make it a competitive option in the AI landscape. Snowflake’s open-source approach and focus on enabling customers to create high-quality custom models underscore its commitment to driving innovation in the AI space. Arctic’s integration with existing data pipelines and its emphasis on meeting enterprise needs at low cost position it as a valuable tool for businesses looking to leverage AI capabilities.