Mistral has made a strong comeback with the release of four new Mistral 3 models, including the powerful 675 billion parameter Mistral Large 3 MoE model and three smaller Ministral 3 dense models, offering a versatile range for various AI applications. These models emphasize openness, flexibility, and competitive performance within the open-source AI community, particularly in Europe, and are poised to foster innovation with upcoming reasoning versions and broad developer accessibility.
Mistral has made a significant return to the AI model scene after a relatively quiet period, releasing four new models under the Mistral 3 series. This marks a notable comeback, especially considering that their last major release, as opposed to incremental updates, came about five months ago, a considerable gap in the fast-moving AI landscape. The main highlight is Mistral Large 3, a 675 billion parameter mixture-of-experts (MoE) model, which stands out because Mistral has historically released open versions of its smaller models but not its largest ones. The model activates 41 billion parameters per token, a much higher active parameter count than recent open releases like OpenAI's GPT-OSS models or those from the Qwen series, indicating a potentially powerful performance edge.
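To make the active-parameter idea concrete, the sketch below implements toy top-k mixture-of-experts routing in PyTorch. The dimensions, expert count, and top-k value are illustrative toy values, not Mistral's published architecture; the point is only that all expert weights live in memory while each token exercises just a subset of them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy mixture-of-experts layer: every expert's weights exist in memory,
    but each token is routed to only top_k experts, so only a fraction of
    the total parameters participate in any single forward pass."""

    def __init__(self, d_model=64, d_ff=256, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        gate_logits = self.router(x)
        # Pick the top_k experts per token and renormalize their gate weights.
        weights, chosen = torch.topk(F.softmax(gate_logits, dim=-1), self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(4, 64)
layer = ToyMoELayer()
print(layer(tokens).shape)  # torch.Size([4, 64]); only 2 of 8 experts ran per token
```

At Mistral Large 3's scale, this same routing principle is what lets a 675 billion parameter model spend only about 41 billion parameters' worth of compute on each token.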
Alongside the large model, Mistral has introduced three smaller dense models called Ministral 3, which appear to replace its previous small models. These come in base, instruction-tuned, and reasoning versions, providing a comprehensive suite for different use cases. The release of base models is particularly important because it allows researchers and developers to fine-tune and experiment with the models themselves (as sketched below), fostering innovation and customization. This approach contrasts with many other companies, which often release only the fine-tuned or instruction versions, limiting flexibility for users.
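As a rough illustration of the base-versus-instruct distinction, here is a minimal sketch using Hugging Face transformers. The repository IDs are hypothetical placeholders, so check Mistral's Hugging Face organization for the actual checkpoint names.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository IDs: substitute the real ones from Mistral's hub page.
BASE_ID = "mistralai/Ministral-3-8B-Base"
INSTRUCT_ID = "mistralai/Ministral-3-8B-Instruct"

# The base checkpoint has no chat formatting baked in, which makes it the
# natural starting point for custom fine-tuning on task-specific data.
base_model = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype="auto", device_map="auto")
base_tok = AutoTokenizer.from_pretrained(BASE_ID)

# The instruction-tuned checkpoint is ready for chat-style prompting as-is.
chat_model = AutoModelForCausalLM.from_pretrained(INSTRUCT_ID, torch_dtype="auto", device_map="auto")
chat_tok = AutoTokenizer.from_pretrained(INSTRUCT_ID)

messages = [{"role": "user", "content": "Summarize the Mistral 3 release in one sentence."}]
inputs = chat_tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(chat_model.device)
output = chat_model.generate(inputs, max_new_tokens=64)
print(chat_tok.decode(output[0], skip_special_tokens=True))
```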
Benchmark comparisons show that Mistral Large 3 competes closely with models like DeepSeek V3.1 and Kimi K2, though it ranks lower on broader LMArena scores, reflecting the intense competition from well-funded giants like OpenAI, Google, and Anthropic. Among open models with permissive licenses like Apache 2.0, however, Mistral Large 3 ranks highly, outperforming several Qwen 3 models and only narrowly trailing the largest MoE model from Qwen. The smaller Ministral 3 models also perform impressively: the 3 billion parameter model matches older 12 billion parameter models from other teams, highlighting their efficiency and relevance in the smaller model space.
Mistral’s renewed focus on releasing a range of model sizes is notable because many other companies have concentrated on a single large flagship model, often neglecting the smaller ones that are crucial for edge devices and less resource-intensive applications. The availability of 3B, 8B, and 14B parameter models fills an important niche and continues Mistral’s tradition of providing accessible, open models for a variety of applications. The upcoming reasoning version of Mistral Large 3 is highly anticipated and could further strengthen Mistral’s standing by offering advanced capabilities comparable to the latest models from the leading AI labs.
In summary, Mistral’s new releases reaffirm their relevance in the open-source AI community, especially in Europe, where there is real interest in independent, locally developed models. While they may not match the scale or resources of the biggest AI companies, their models offer solid performance, openness, and flexibility. The release of base, instruction-tuned, and reasoning versions across multiple sizes, along with GGUF quantized versions, provides a versatile toolkit for developers and researchers. This positions Mistral as a key player in the open model ecosystem, and it will be interesting to see how their reasoning model compares once released. The community is encouraged to experiment with these models and share their experiences to help shape the future of open AI development.
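For anyone who wants to try the GGUF builds locally, here is a minimal sketch using llama-cpp-python. The quantized filename below is illustrative, so substitute whichever Ministral 3 GGUF file and quantization level you actually download.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Illustrative GGUF filename: pick the quantization level that fits your hardware.
llm = Llama(model_path="Ministral-3-8B-Instruct-Q4_K_M.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is a mixture-of-experts model?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```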