OpenAI has released gpt-oss, a groundbreaking family of open-weight language models available in 120B and 20B parameter versions, offering state-of-the-art performance, efficient deployment on consumer hardware, and strong capabilities in reasoning, coding, and health-related tasks. Released under the Apache 2.0 license with robust safety measures and extensive benchmarking, gpt-oss gives developers customizable, transparent AI suitable for local inference and rapid iteration.
OpenAI has released a groundbreaking open language model family called gpt-oss, available in two sizes: a 120 billion parameter version and a 20 billion parameter version. Often described as open-source, the models are more precisely open-weight: the trained weights are publicly available under the permissive Apache 2.0 license. gpt-oss offers significant advantages such as lower cost than closed-source frontier models, the ability to fine-tune for specific use cases, and efficient deployment on consumer hardware. The 120B model runs on a single 80GB GPU, while the 20B model can operate on edge devices with just 16GB of memory, making local inference and rapid iteration possible without expensive infrastructure.
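To make the hardware claims concrete, here is a minimal sketch of running the smaller model locally with Hugging Face transformers. It assumes the checkpoint is published under the ID openai/gpt-oss-20b and that your installed transformers version supports it; treat the model ID and generation settings as illustrative rather than official.

```python
# Minimal local-inference sketch (assumes roughly 16GB of GPU memory or a capable CPU).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # assumed Hugging Face model ID
    torch_dtype="auto",          # let transformers pick an appropriate precision
    device_map="auto",           # place weights on available devices automatically
)

messages = [
    {"role": "user", "content": "Explain mixture-of-experts in two sentences."},
]

result = generator(messages, max_new_tokens=256)
# The pipeline returns the full chat transcript; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```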
The models were trained using reinforcement learning together with techniques informed by OpenAI's most advanced internal models, including o3 and other frontier systems. They excel at reasoning, tool use, and chain-of-thought processing, perform well on health-related tasks, and support adjustable reasoning effort (low, medium, or high) to trade off speed against answer quality. Architecturally, gpt-oss is a transformer with a mixture-of-experts design, activating only a fraction of its parameters for each token to maximize efficiency. It supports context lengths of up to 128k tokens and uses attention optimizations that reduce memory use and speed up inference. The training data focused heavily on STEM, coding, and general knowledge, tokenized with an updated version of OpenAI's tokenizer.
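The mixture-of-experts idea is easy to see in miniature. The toy PyTorch layer below is not gpt-oss's actual implementation (the dimensions, expert count, and routing details are invented for illustration); it only shows how a router can send each token to a small top-k subset of experts, so most expert parameters stay inactive for any given token.

```python
# Toy mixture-of-experts routing sketch, for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1) # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)           # normalize over the selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):                    # run only the chosen experts per token
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(5, 64)
print(ToyMoE()(tokens).shape)  # torch.Size([5, 64])
```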
Benchmark results show gpt-oss performing on par with, or better than, comparable closed models. The 120B version scores close to o4-mini on competition coding, health benchmarks, and PhD-level science questions, while the 20B version delivers results comparable to o3-mini despite its much smaller size. The models outperform many human competitors on coding problems and show strong capabilities in health-related evaluations. Their performance across a broad set of benchmarks highlights the potential for open-weight models to match or exceed proprietary systems, signaling a shift in the AI landscape.
On the safety front, OpenAI filtered harmful data during training and deliberately avoided direct supervision of the chain-of-thought, which keeps it useful for monitoring; the trade-off is that raw chain-of-thought may contain hallucinations or content unsuitable for end users, so OpenAI recommends that developers summarize and filter it before presenting it. OpenAI also ran adversarial fine-tuning experiments to test whether the models could be maliciously tuned in sensitive domains like biology and cybersecurity; even with extensive fine-tuning, the models did not reach dangerous capability levels. To further probe for weaknesses, OpenAI is hosting a $500,000 red-teaming challenge to surface potential vulnerabilities.
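In practice, that recommendation amounts to keeping the raw reasoning on the server and showing users only the final answer, plus at most a vetted summary. The sketch below assumes your serving stack already returns the reasoning and the final answer as separate fields; the field names and the trivial one-line "summary" are placeholders for whatever your stack and summarization step actually provide.

```python
# Hedged sketch of filtering chain-of-thought before it reaches end users.
# Field names ("reasoning", "final") are hypothetical; adapt them to your serving stack.
def prepare_for_user(response: dict) -> dict:
    reasoning = response.get("reasoning", "")  # raw chain-of-thought: keep server-side
    answer = response.get("final", "")         # user-facing answer

    # Keep only lightweight metadata about the reasoning for audit/monitoring.
    audit = {"reasoning_chars": len(reasoning)}

    # Placeholder "summary": a real system would run a summarization/filtering step here,
    # never expose the raw chain-of-thought itself.
    summary = reasoning.splitlines()[0][:200] if reasoning else ""

    return {"answer": answer, "reasoning_summary": summary, "audit": audit}

print(prepare_for_user({"reasoning": "Step 1: ...\nStep 2: ...", "final": "42"}))
```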
Overall, gpt-oss represents a major milestone for open-weight AI, combining state-of-the-art performance with accessibility and transparency. Its release should empower enterprises and developers who need secure, private, and customizable AI deployable on-premises or on personal hardware. The models' efficiency, permissive licensing, and strong benchmark results make them attractive for a wide range of applications, and OpenAI's detailed disclosure of training methods and architecture gives the AI community valuable insight. Interested users can try gpt-oss today through Together AI, a partner offering fast and affordable access to the new models.
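For example, a hosted call through Together AI can reuse the standard OpenAI Python client pointed at Together's OpenAI-compatible endpoint. The model identifier below is an assumption; check Together's model catalog for the exact name.

```python
# Minimal sketch of calling gpt-oss through Together AI's OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.together.xyz/v1",  # Together's OpenAI-compatible endpoint
    api_key="YOUR_TOGETHER_API_KEY",
)

completion = client.chat.completions.create(
    model="openai/gpt-oss-120b",  # assumed catalog name; verify before use
    messages=[{"role": "user", "content": "Summarize the gpt-oss release in one sentence."}],
)
print(completion.choices[0].message.content)
```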