On Day 9 of the “12 Days of OpenAI” event, Olivier Godement introduced the o1 model, which expands developer capabilities with new features such as function calling, structured outputs, and vision inputs, demonstrated through a live demo that caught errors in a tax form. The session also highlighted WebRTC support in the Realtime API for voice experiences, preference fine-tuning for optimizing models based on user feedback, and new SDKs for Go and Java along with a simplified API key sign-up process.
On Day 9 of the “12 Days of OpenAI” event, Olivier Godement, Head of Platform Product at OpenAI, opened a session focused on developers and startups building on the OpenAI API. He highlighted the API's scale: 2 million developers across more than 200 countries. As a gesture of appreciation, several new models and features were announced, headlined by o1 moving out of preview, where it had been available since September. The production release aims to round out developers' toolkits by adding core features the preview lacked.
Michelle Pokrass and Brian John from the post-training research team detailed the new features of o1, including function calling, structured outputs, and developer messages. Developer messages give developers a dedicated channel for instructions to the model, improving how reliably it follows application-specific guidelines. They also introduced a new reasoning effort parameter, which lets developers control how much time the model spends thinking based on the complexity of the task. Finally, the team announced support for vision inputs, which should benefit applications in fields like manufacturing and science.
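For concreteness, here is a minimal sketch of how a developer message and the reasoning effort setting appear in a Chat Completions call via the Node SDK; the instruction text, question, and effort level are illustrative, not taken from the event:

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const completion = await client.chat.completions.create({
  model: "o1",
  // "low" | "medium" | "high": how much time the model may spend reasoning.
  reasoning_effort: "medium",
  messages: [
    // Developer messages carry app-level instructions, distinct from user input.
    { role: "developer", content: "You are a terse accounting assistant." },
    { role: "user", content: "Does 1,284 + 3,117 equal 4,391?" },
  ],
});

console.log(completion.choices[0].message.content);
```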
A live demo showcased the o1 model's capabilities, particularly its ability to detect errors in a tax form using vision inputs. The model identified arithmetic mistakes and incorrect values on the form, a practical real-world application, though the presenters emphasized that it should assist, not replace, professional judgment. The demo also exercised function calling, which lets the model query backend APIs seamlessly and give users accurate answers without exposing the underlying calls.
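As a rough sketch of how vision inputs and function calling might combine in a request like the demo's (the image URL and the flag_error function are invented placeholders, not part of the event):

```typescript
import OpenAI from "openai";

const client = new OpenAI();

const completion = await client.chat.completions.create({
  model: "o1",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Check this tax form for arithmetic errors." },
        // Placeholder URL; a base64 data URL works here as well.
        { type: "image_url", image_url: { url: "https://example.com/tax-form.png" } },
      ],
    },
  ],
  // Hypothetical function the model can call when it spots a mistake.
  tools: [
    {
      type: "function",
      function: {
        name: "flag_error",
        description: "Record a suspected error found on the scanned form",
        parameters: {
          type: "object",
          properties: {
            field: { type: "string", description: "Form field containing the error" },
            expected: { type: "string", description: "Value the model computed" },
            found: { type: "string", description: "Value printed on the form" },
          },
          required: ["field", "expected", "found"],
        },
      },
    },
  ],
});

// When the model finds a problem it replies with a tool call rather than text.
console.log(completion.choices[0].message.tool_calls ?? completion.choices[0].message.content);
```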
The session also covered the Realtime API, which enables developers to build real-time voice experiences. Sean and Andrew introduced WebRTC support, which simplifies integration by handling the technical challenges of audio streaming on the developer's behalf. They demonstrated setting up a peer connection for audio communication with far less code than previous methods required, and expressed excitement about what developers could build on this new functionality.
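A minimal browser-side sketch of that WebRTC flow: the client performs a standard SDP offer/answer handshake with the Realtime endpoint. It assumes an ephemeral key minted by your own server (so the real API key never reaches the browser), and the model name is a placeholder:

```typescript
// EPHEMERAL_KEY is assumed to be fetched from your own backend, which mints it
// via the OpenAI REST API; never ship your real API key to the client.
declare const EPHEMERAL_KEY: string;

async function connectRealtime(): Promise<RTCPeerConnection> {
  const pc = new RTCPeerConnection();

  // Play whatever audio track the model sends back.
  const audioEl = document.createElement("audio");
  audioEl.autoplay = true;
  pc.ontrack = (event) => { audioEl.srcObject = event.streams[0]; };

  // Send the user's microphone to the model.
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
  pc.addTrack(mic.getTracks()[0]);

  // Standard WebRTC offer/answer handshake, with the API acting as the peer.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const response = await fetch(
    "https://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview", // placeholder model
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${EPHEMERAL_KEY}`,
        "Content-Type": "application/sdp",
      },
      body: offer.sdp,
    },
  );

  await pc.setRemoteDescription({ type: "answer", sdp: await response.text() });
  return pc;
}
```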
Lastly, Andrew introduced preference fine-tuning, a new fine-tuning method that optimizes models from pairs of preferred and non-preferred responses rather than exact input-output pairs (see the sketch after this paragraph). It targets tasks where user feedback is the main quality signal, such as customer support and content moderation. The session concluded with announcements of new SDKs for Go and Java, a simplified API key sign-up process, and the release of recorded talks from previous developer events. An AMA session was also scheduled for attendees to ask questions and engage with the OpenAI team.
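In the API, preference fine-tuning is exposed as direct preference optimization (DPO). A hedged sketch of creating such a job with the Node SDK, assuming a JSONL file of preferred and non-preferred responses has already been uploaded; the file ID, model name, and beta value are placeholders:

```typescript
import OpenAI from "openai";

const client = new OpenAI();

// Each line of the (already uploaded) JSONL training file pairs one input
// with a preferred and a non-preferred response, roughly:
// {"input": {"messages": [{"role": "user", "content": "..."}]},
//  "preferred_output": [{"role": "assistant", "content": "..."}],
//  "non_preferred_output": [{"role": "assistant", "content": "..."}]}
const job = await client.fineTuning.jobs.create({
  model: "gpt-4o-2024-08-06", // placeholder; use a model that supports preference tuning
  training_file: "file-abc123", // placeholder file ID
  method: {
    type: "dpo",
    dpo: {
      hyperparameters: { beta: 0.1 }, // how strongly preferences shape the update
    },
  },
});

console.log(job.id, job.status);
```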