GPT-OSS: NEW OpenAI Update is INSANE (FREE)! 🤯

OpenAI’s new open-weight GPT-OSS models, with 120B and 20B parameters under the permissive Apache 2.0 license, offer reasoning performance approaching OpenAI’s o3 while letting users run them locally or on private servers for privacy and cost control. The models use a mixture-of-experts architecture for efficiency, support a 128k-token context, and provide an affordable, accessible alternative to proprietary APIs, marking a significant step toward democratizing advanced AI technology.

OpenAI has released a significant open-weight update called GPT-OSS, its first open language-model release since GPT-2, featuring two models: GPT-OSS 120B and 20B, with roughly 120 billion and 20 billion parameters respectively. Both ship under the permissive Apache 2.0 license, allowing users to run them on personal servers or laptops, integrate them into applications, and even commercialize them freely. The models are designed for reasoning with chain-of-thought capabilities, performing nearly as well as OpenAI’s o3 on benchmarks like GPQA (Google-Proof Q&A), scoring around 80% accuracy compared to 83.3% for o3.

Running these models locally is possible but hardware-intensive. The smaller 20B model needs at least 16GB of memory and a solid GPU, roughly a $1,200 to $2,000 laptop. The larger 120B model demands a far more powerful setup, in the $7,000 to $10,000 range, making it better suited to server environments than personal laptops. This aligns with the models’ intended use for enterprises seeking predictable costs and data privacy by hosting them on their own infrastructure or cloud services like AWS and Azure, avoiding reliance on third-party APIs that store user data.
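The memory figures above follow from simple arithmetic: weight storage is parameter count times bits per parameter. GPT-OSS ships with its expert weights quantized to roughly 4 bits (MXFP4), which is what lets the 20B model fit in 16GB. A back-of-the-envelope sketch (it ignores KV cache, activations, and runtime overhead, so real requirements are somewhat higher):

```python
def approx_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Rough memory needed just to hold the weights, in decimal gigabytes.
    Ignores KV cache, activations, and framework overhead."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# GPT-OSS quantizes most weights to ~4 bits (MXFP4)
print(f"20B  @ 4-bit: ~{approx_memory_gb(20, 4):.0f} GB")   # ~10 GB, fits in 16 GB
print(f"120B @ 4-bit: ~{approx_memory_gb(120, 4):.0f} GB")  # ~60 GB, server territory
```

The same formula shows why unquantized 16-bit weights would be out of reach on a laptop: the 20B model alone would need around 40GB.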

The GPT-OSS models use a mixture-of-experts architecture, where each token is routed to a small subset of the model’s parameters (the experts) rather than through all 120 billion at once. This keeps inference manageable and efficient, with only a few billion parameters active per token (about 5.1B for the 120B model and 3.6B for the 20B model). Although the context length is capped at 128,000 tokens, less than some newer models offer, performance on coding and reasoning tasks is impressive, often matching or exceeding o3-mini and o4-mini benchmarks, especially in competitive coding and math.
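The routing idea can be shown in a toy sketch: a gating function scores every expert, keeps only the top-k, and renormalizes their weights so compute scales with k rather than with the total expert count. This is illustrative only; real MoE layers learn the gate, repeat this at every layer, and the expert counts below are not GPT-OSS’s actual configuration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_token(gate_scores, k=2):
    """Pick the top-k experts for one token and renormalize their gate weights.
    Only these k experts run a forward pass, so per-token compute scales
    with k, not with the total number of experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}

# Illustrative: 8 experts, top-2 routing for a single token.
gate_scores = [0.2, -1.0, 0.7, 0.1, 1.5, -0.3, 0.4, 0.0]
active = route_token(gate_scores, k=2)
print(active)  # exactly 2 experts carry nonzero weight
```

The dictionary maps expert index to mixing weight; the token’s output is the weighted sum of just those experts’ outputs, which is why a 120B-parameter model can run with only ~5B parameters active per token.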

Cost-wise, GPT-OSS offers a far cheaper alternative to OpenAI’s hosted APIs. For example, the 120B model on OpenRouter costs about $0.09 per million input tokens and $0.45 per million output tokens, which is cheaper than o4-mini and drastically less expensive than o3. Users can access the models for free (with rate limits) on platforms like OpenRouter or Hugging Face, or run them locally using tools like llama.cpp or Ollama, though local runs may strain typical consumer hardware. The video also demonstrated practical usage by coding a simple game and a SaaS landing page, showing the model’s capabilities and some limitations in content generation.
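Per-million-token pricing is easy to reason about with a small helper. The function below is a generic sketch, with the quoted OpenRouter rates for the 120B model plugged in as an example:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Dollar cost of one request given per-million-token prices."""
    return (input_tokens / 1e6) * in_price_per_m + (output_tokens / 1e6) * out_price_per_m

# Rates quoted above for gpt-oss-120b on OpenRouter ($ per 1M tokens)
cost = request_cost(input_tokens=50_000, output_tokens=10_000,
                    in_price_per_m=0.09, out_price_per_m=0.45)
print(f"${cost:.4f}")  # $0.0090 for a 50k-in / 10k-out request
```

At these rates, even a heavy workload of a billion input tokens per month comes to roughly $90, which is the kind of arithmetic that makes self-hosting or cheap hosted access attractive for enterprises.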

Overall, this open-weight release marks a major step toward democratizing access to powerful AI models, enabling businesses and developers to deploy advanced language models privately and cost-effectively. While the models show strong benchmark results and practical utility, real-world testing will determine whether they live up to expectations beyond benchmark scores. The presenter encourages viewers to explore the models, share feedback, and stay tuned for upcoming developments like GPT-5, highlighting the evolving landscape of AI accessibility and innovation.