New models in the API

OpenAI has launched the GPT-4.1 family of models: GPT-4.1, GPT-4.1 Mini, and GPT-4.1 Nano. The new models offer significant improvements in coding, instruction following, and long-context handling, and can accept up to one million tokens of context. They are also more cost-effective, with GPT-4.1 priced 26% lower than GPT-4o, and are designed to improve the developer experience and streamline application development.

In a recent announcement, OpenAI introduced the GPT-4.1 family of models, which includes GPT-4.1, GPT-4.1 Mini, and the new GPT-4.1 Nano. These models are designed specifically for developers and show significant improvements over the GPT-4o models, excelling in coding, complex instruction following, and long-context handling. Notably, all three models accept up to one million tokens of context, a substantial increase from the previous limit of 128K tokens. The 4.1 naming was intentional, reflecting enhanced capabilities across the board.
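To make the model lineup concrete, here is a minimal sketch of how a request to the three models might be constructed. The model IDs follow the announcement's naming, and the payload shape follows the Chat Completions API; the sketch only builds the JSON body and does not send it, since an actual call would need an API key and network access.

```python
import json

# The three model IDs as exposed in the API, per the announcement.
MODELS = ["gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano"]

def build_chat_request(model: str, prompt: str) -> str:
    """Build the JSON body for a POST to /v1/chat/completions.

    Sending it also requires an "Authorization: Bearer <API key>"
    header; this sketch only constructs the payload.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)

payload = build_chat_request("gpt-4.1-mini", "Summarize this diff in one sentence.")
print(payload)
```

Because all three models share the same endpoint and request shape, swapping between them for cost or latency reasons is a one-string change.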

The team highlighted advances in coding performance, showing how GPT-4.1 has improved at writing functional code. On the SWE-bench evaluation, GPT-4.1 achieved a 55% accuracy rate on coding tasks, a significant increase from GPT-4o's 33%. The model has also improved at coding across multiple programming languages, demonstrating its versatility. As a practical example, the team built a flashcard app, illustrating how GPT-4.1 can follow a complex prompt to produce a functional and aesthetically pleasing application.

Instruction following has also seen notable gains, with the new models adhering closely to user instructions. The team developed an internal evaluation to assess how well the models follow complex instructions, and the results showed a marked improvement over GPT-4o. The models can now handle intricate requests, such as formatting responses in specific ways, without excessive prompting from the user. This should streamline development for applications that depend on precise instructions.
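One way to benefit from stricter instruction following is to validate replies programmatically. The sketch below uses an invented output convention (a single line of the form `ANSWER: <one word>`, not part of any OpenAI evaluation) and a local check that a reply conforms, the kind of guard a developer might pair with a strict formatting prompt.

```python
import re

# Hypothetical strict output format we might instruct the model to use:
# exactly one line of the form "ANSWER: <one word>".
FORMAT = re.compile(r"^ANSWER: \w+$")

def follows_format(reply: str) -> bool:
    """Return True if the reply matches the requested format exactly."""
    return bool(FORMAT.match(reply.strip()))

print(follows_format("ANSWER: Paris"))               # conforming reply
print(follows_format("Sure! The answer is Paris."))  # chatty, non-conforming
```

With models that follow formatting instructions reliably, checks like this fail rarely enough that a simple retry-on-mismatch loop becomes a practical fallback.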

The long-context capabilities of GPT-4.1 Mini and Nano were also emphasized, with the models demonstrating effective use of the expanded context of up to one million tokens. The team showcased evaluations that tested the models' ability to retrieve information from large documents, revealing that they could accurately locate specific content regardless of its position within the text. This capability is particularly beneficial for applications that process extensive datasets or documents.
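The retrieval evaluations described above follow a needle-in-a-haystack pattern, which can be mimicked locally. In this sketch the document, the needle string, and the roughly-four-characters-per-token estimate are all illustrative assumptions; a real evaluation would send the document to the model and check whether it reports the needle.

```python
# Local mock-up of a needle-in-a-haystack retrieval check.
filler = "The quick brown fox jumps over the lazy dog. " * 20000
needle = "The secret passphrase is 'tangerine'."

# Bury the needle in the middle of the document.
mid = len(filler) // 2
document = filler[:mid] + needle + filler[mid:]

# ~4 characters per token is a common rule of thumb; the new models
# accept up to 1,000,000 tokens of context.
approx_tokens = len(document) // 4
assert approx_tokens < 1_000_000  # this document fits in the window

# A model with reliable long-context retrieval should surface the needle
# wherever it sits; here we simply confirm it is present in the input.
print(needle in document, approx_tokens)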

Finally, the pricing structure for the new models was discussed, with GPT-4.1 offered at a 26% lower cost than GPT-4o. GPT-4.1 Nano is positioned as the most affordable option at just 12 cents per million tokens (blended), with no additional charge for long-context usage. The team also announced plans to deprecate GPT-4.5 Preview in the API over the next three months to free up capacity for the new models. The session concluded with a demonstration of the models' capabilities and an invitation for developers to start building with the new models, emphasizing the potential for innovation and improved user experiences.
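The Nano figure quoted above is enough for back-of-the-envelope cost math. This sketch assumes a flat blended rate of $0.12 per million tokens, as stated in the announcement; real bills depend on the actual input/output token split.

```python
# Back-of-the-envelope cost estimate for GPT-4.1 Nano, assuming the
# quoted blended rate of $0.12 per million tokens and no long-context
# surcharge.
NANO_PRICE_PER_M = 0.12  # USD per 1M tokens

def cost_usd(tokens: int, price_per_million: float = NANO_PRICE_PER_M) -> float:
    """Estimated cost in USD for processing the given number of tokens."""
    return tokens / 1_000_000 * price_per_million

# Filling the full 1M-token context once costs 12 cents on Nano...
print(f"${cost_usd(1_000_000):.2f}")  # $0.12
# ...and a 250K-token document costs 3 cents.
print(f"${cost_usd(250_000):.2f}")    # $0.03
```

At these rates, even workloads that routinely fill the million-token window remain inexpensive on Nano, which is the point of offering it as the budget tier.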