OpenAI has released two new open-weight language models, GPT-OSS 120B and 20B, its first open-weight language model release since GPT-2. Both models ship under an Apache 2.0 license, allowing broad usage without restrictive conditions, a welcome change from OpenAI's previous proprietary-only releases. However, the presenter critiques the "OSS" branding, arguing that these are open-weight rather than fully open-source models: the weights are downloadable, but the training code, base models, and datasets that would make them fully reproducible are not released.
The 120B and 20B models target different use cases: the larger model is aimed at cloud deployment on substantial GPU resources, while the smaller 20B model is optimized for local use on personal computers via platforms like Ollama and LM Studio. Both models support advanced agentic workflows, including instruction following, tool use, web search, Python code execution, and selectable reasoning effort (low, medium, high), which is set via the system prompt to trade latency against performance. This flexibility aims to unlock new possibilities for developers and researchers building local and cloud-based AI agents.
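As a rough illustration of what that looks like in practice, the sketch below calls a locally served 20B model through Ollama's OpenAI-compatible endpoint and requests a reasoning level in the system prompt. The model tag and the exact "Reasoning: high" wording are assumptions to verify against your own install, not details taken from the video.

```python
# Minimal sketch: querying a locally served gpt-oss 20B model via Ollama's
# OpenAI-compatible endpoint and selecting a reasoning level in the system
# prompt. Model tag and prompt wording are assumptions, not confirmed details.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # default Ollama endpoint
    api_key="ollama",                      # placeholder; Ollama ignores the key
)

response = client.chat.completions.create(
    model="gpt-oss:20b",  # assumed Ollama tag for the 20B release
    messages=[
        # Reasoning effort (low / medium / high) is requested in the system
        # prompt; higher effort means longer chain-of-thought and more latency.
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "Plan the steps to scrape and summarise a web page."},
    ],
)

print(response.choices[0].message.content)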
Technically, both models use a mixture-of-experts (MoE) architecture, with the 120B model activating roughly 5 billion parameters per token and the 20B model roughly 3.6 billion, reflecting the trend toward sparse, efficient parameter usage also seen in other large models such as Qwen. They use rotary positional embeddings and support context lengths of up to 128K tokens, though the initial training context was likely capped at around 32K tokens. The models are primarily English-focused, with no significant multilingual capabilities at this stage. Post-training follows OpenAI's established recipe of supervised fine-tuning and reinforcement learning, but the detailed training specifics remain undisclosed.
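To make the "active parameters" figure concrete, here is a deliberately tiny, illustrative top-k MoE layer (not the released architecture; dimensions and expert counts are invented): a router scores the experts for each token and only the top-k experts actually run, so most of the layer's parameters sit idle on any given forward pass.

```python
# Illustrative top-k mixture-of-experts layer. All sizes are made up;
# the point is that only k experts' weights are used per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)        # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        ])

    def forward(self, x):                                  # x: (tokens, d_model)
        scores = self.router(x)                            # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)     # choose k experts per token
        weights = F.softmax(weights, dim=-1)               # normalise over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                   # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(5, 64)).shape)  # torch.Size([5, 64])
```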
Benchmark results show these models perform well, especially with tool use, and the 20B model sometimes matches or beats OpenAI's older o3-mini, though the presenter cautions about potential overfitting to benchmarks. The models excel at function calling and reasoning tasks, with longer chain-of-thought reasoning improving accuracy. Early user experiences via platforms like OpenRouter and Ollama show detailed, table-rich responses with a distinctly OpenAI personality, though the knowledge cutoff is around mid-2024, limiting up-to-date information. Running the models efficiently on local hardware requires specific setups, such as Triton kernels for the MXFP4 quantization the weights ship in.
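For readers who want to try the function-calling workflow mentioned above, here is a hedged sketch using OpenRouter's OpenAI-compatible API; the model slug and the get_weather tool are illustrative assumptions rather than details from the video.

```python
# Hedged sketch of function calling against gpt-oss via OpenRouter's
# OpenAI-compatible API. The model slug and the tool schema are assumptions.
import json
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_OPENROUTER_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",                       # hypothetical tool for the demo
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="openai/gpt-oss-120b",                     # assumed OpenRouter slug
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:                                   # the model chose to call the tool
    call = msg.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(msg.content)                               # it answered directly instead
```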
Overall, the release is seen as a positive step for open AI models, one that puts pressure on other labs to release more open weights. While not state-of-the-art compared to proprietary models, these releases provide valuable tools for agentic AI development both locally and in the cloud. The presenter plans further testing and exploration, especially around agent frameworks and code generation, and looks forward to seeing how these open models compare once GPT-5 launches. The community is encouraged to share their experiences and use cases to better map the models' strengths and limitations.