Meta’s Muse Spark, likely the rebranded Avocado model, marks a cautious but significant step in the company’s AI revival after Llama 4’s underperformance. It is aimed at internal use, with multi-agent capabilities integrated into Meta’s social platforms. While Muse Spark posts competitive benchmarks and offers useful features such as calendar integration and image generation, its proprietary nature, limited transparency, and mixed performance highlight both the potential and the current limits of Meta’s AI strategy.
The video discusses Meta’s newly released large language model (LLM), Muse Spark, which is likely the model previously referred to as Avocado. The release comes nearly a year after the disappointing launch of Llama 4, which failed to meet expectations and was only partially accessible to the public. Following Llama 4’s underwhelming performance, Meta invested heavily, around $14 billion, to revamp its AI efforts by acquiring a major stake in Scale AI and hiring top talent under the leadership of Alexandr Wang. Muse Spark is the first product of this renewed effort, but its release has been somewhat confusing and underwhelming relative to the hype and investment.
Muse Spark is a proprietary model designed primarily for Meta’s internal use across platforms like Facebook and WhatsApp, rather than an open-source model for the broader AI community. Unlike previous Llama models, there is no API or open access, and details about the model’s size, training data, and architecture remain undisclosed. The model supports multi-agent systems, a capability enhanced by Meta’s acquisition of the startup Manus, which specializes in multi-agent frameworks. This suggests a strategic focus on integrating AI agents into Meta’s social platforms to provide personalized AI experiences for users.
Benchmark results for Muse Spark are mixed. While it is competitive and ranks within the top five models in some third-party evaluations, it outperforms other models on only a few benchmarks and falls behind on others, including some created by Scale AI itself. The model is noted for its token efficiency and strong vision capabilities but does not excel at agentic tasks. Despite these limitations, it is considered solid and suitable for Meta’s intended use cases, especially if it can be deployed at scale efficiently and cost-effectively.
The video also highlights Muse Spark’s features, such as calendar and email integration, image generation, and access to tools like Python, OpenCV, and PDF readers. Some functionality, however, appears unreliable at the moment, image generation among it. The model offers two modes, instant and thinking, with a rumored third mode called contemplating. Unlike competitors, Muse Spark does not provide detailed thinking summaries, which may limit transparency into its reasoning process. Nonetheless, it can spawn agents and perform searches, indicating a functional multi-agent system in development.
In conclusion, Muse Spark represents a significant but cautious step forward for Meta’s AI ambitions. While it is not the open, groundbreaking model many hoped for, it aligns with Meta’s strategy of embedding AI deeply into its ecosystem. The video suggests that if Meta can improve training efficiency and continue iterating, Muse Spark or its successors could become more competitive. The community remains curious whether Meta will eventually release open versions or scaled variants of the model, especially as other AI labs continue to advance rapidly. Overall, Muse Spark comes across as an interesting but imperfect release that reflects both the promise and the challenges of Meta’s AI resurgence.