In NVIDIA’s GTC 2025 keynote, CEO Jensen Huang highlighted advances in reasoning models, particularly the new Llama Nemotron series, which is aimed at agentic applications that depend on large volumes of inferred tokens. While the 49B model shows promise, the 8B model has drawn mixed reviews for its reasoning capabilities, and NVIDIA has also released a substantial post-training dataset to help developers train their own reasoning models.
At the GTC 2025 conference, NVIDIA CEO Jensen Huang delivered a keynote focused primarily on the company’s data center advancements, with particular emphasis on reasoning models. Unlike previous keynotes aimed at developers, this presentation seemed directed more towards investors. Huang stressed the growing importance of reasoning models, explaining how they dramatically increase the number of tokens generated at inference time, which can be put to use in various agentic applications. The keynote included a short video illustrating what tokens are and why they matter in the context of large language models (LLMs).
NVIDIA announced new versions of its Nemotron models, specifically the Llama Nemotron series first introduced in January 2025. Huang did not go into these models in depth during the keynote, but slides indicated that NVIDIA is now focusing on reasoning versions of the Llama models. The new releases include Llama 3.3 Nemotron Super 49B v1, a distilled version of the larger Llama 3.3 70B model, and Llama 3.1 Nemotron Nano, an 8B version. Building on Meta AI’s Llama models rather than developing its own base models has raised questions, given NVIDIA’s extensive resources and research capabilities.
Both the 49B and 8B models have undergone several post-training stages to enhance their reasoning capabilities, drawing inspiration from DeepSeek’s reinforcement learning methods. NVIDIA has also released a substantial post-training dataset of around 20 million samples generated by multiple models, including DeepSeek-R1. The dataset is intended to help users train or fine-tune their own reasoning models, making it a valuable resource for developers working in this area.
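To make the shape of such a dataset concrete, here is a minimal sketch of working with records like those described above: each sample pairs a prompt with a model-generated response and tracks which model produced it. The field names and example rows are assumptions for illustration, not the dataset’s actual schema.

```python
# Hypothetical sample records mimicking a post-training dataset where each
# row carries the prompt, the generated response, and the generator model.
# Field names ("input", "output", "generator") are assumed, not official.
samples = [
    {"input": "Prove that sqrt(2) is irrational.",
     "output": "<think>Assume sqrt(2) = p/q in lowest terms...</think> ...",
     "generator": "DeepSeek-R1"},
    {"input": "Write a haiku about GPUs.",
     "output": "Silicon rivers / tensors flowing through the night / frames bloom into light",
     "generator": "Llama-3.3-70B-Instruct"},
]

def filter_by_generator(rows, model_name):
    """Keep only the samples produced by the given generator model."""
    return [r for r in rows if r["generator"] == model_name]

# For fine-tuning a reasoning model, one might keep only the
# reasoning-heavy subset generated by DeepSeek-R1.
r1_subset = filter_by_generator(samples, "DeepSeek-R1")
print(len(r1_subset))  # 1
```

In practice the real dataset would be streamed from its hosting platform rather than held in memory, but the same per-record filtering idea applies.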
On the practical side, NVIDIA has made the models available for testing, so users can experiment with the reasoning capabilities themselves. The models can generate outputs with or without reasoning, and users toggle this behavior through the prompt. Initial experiences with the 8B model have been mixed, however, with some users reporting inconsistent reasoning output: the model produces extensive reasoning for some queries but not for others, which is frustrating for anyone seeking reliable performance.
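The prompt-based toggle can be sketched as a small helper that builds the chat message list. The "detailed thinking on"/"detailed thinking off" system-prompt strings below follow NVIDIA’s documented convention for the Llama Nemotron models, but treat the exact wording as an assumption to verify against the model card for the version you use.

```python
def build_messages(user_prompt: str, reasoning: bool) -> list[dict]:
    """Build a chat message list, toggling reasoning via the system prompt.

    The toggle strings are assumed from NVIDIA's Llama Nemotron model
    cards; confirm them for the specific model version before relying
    on this in production.
    """
    system = "detailed thinking on" if reasoning else "detailed thinking off"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

# With reasoning enabled, the model is expected to emit its chain of
# thought before the final answer; with it off, just the answer.
msgs = build_messages("How many r's are in 'strawberry'?", reasoning=True)
print(msgs[0]["content"])  # detailed thinking on
```

The resulting message list can be passed to any OpenAI-compatible chat endpoint serving the model; the toggle lives entirely in the system message, so no API parameters change between the two modes.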
Overall, the release of NVIDIA’s reasoning models and the accompanying dataset is a meaningful step forward for the field. While the 49B model shows promise, the 8B model has not met expectations for some users. The dataset is a noteworthy contribution in its own right, since it lets developers improve their own models. The debate over model sizes and local versus cloud deployment continues, with many users curious about the optimal size for running models effectively. The video concludes by inviting viewers to share their thoughts on model sizes and performance.