Exploring the new OpenAI Batch API in the web UI and in Python code

The video discusses the new OpenAI Batch API, highlighting its ability to efficiently handle multiple requests simultaneously for tasks like evaluating datasets and generating embeddings. Users can interact with the Batch API through a web UI or Python code, enabling them to upload batch files, monitor job statuses, and retrieve results for text generation tasks with cost efficiency and faster completion times.

The video discusses the new OpenAI Batch API, which is designed for running multiple requests simultaneously, making it ideal for tasks such as evaluating or classifying large datasets and generating embeddings for large amounts of content. Users can submit up to 50,000 requests per batch for chat completions or embeddings, at a 50% lower cost compared to regular calls and with higher rate limits. The completion window for processing is 24 hours, although batches often finish sooner. This API is particularly useful for jobs that do not require immediate responses, offering cost efficiency, higher rate limits, and dedicated capacity separate from synchronous traffic.

The process of creating and running batches can be done directly from the web UI by preparing a batch file in a specific format, including custom IDs and endpoint URLs. Users select the model they want to use, such as GPT-3.5 Turbo, specify messages, set parameters like max tokens, and assemble a JSON Lines (JSONL) input file, one request per line. Through the web UI, users upload the input file, select an endpoint (completions or embeddings), and choose the completion window (e.g., 24 hours). The batch is then created, validated, and processed, with completion typically occurring within a few hours to 24 hours.
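The input-file preparation described above can be sketched as follows. This is a minimal example using only the standard library; the prompts and the `batch_input.jsonl` filename are illustrative, while the `custom_id`/`method`/`url`/`body` line shape matches the Batch API's documented JSONL request format.

```python
import json

# Hypothetical prompts to process in one batch; names here are illustrative.
prompts = [
    "Summarize the benefits of batch processing.",
    "Classify this review as positive or negative: 'Great product!'",
]

# Each line of the input file is one request: a custom_id to match results
# back to inputs, the HTTP method, the target endpoint URL, and the request
# body you would normally send to that endpoint directly.
with open("batch_input.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"request-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-3.5-turbo",
                "messages": [{"role": "user", "content": prompt}],
                "max_tokens": 100,
            },
        }
        f.write(json.dumps(request) + "\n")
```

The `custom_id` field matters because results in the output file are not guaranteed to come back in input order; it is the key used to pair each response with its original request.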

Alternatively, users can interact with the Batch API directly from Python code by defining a class called `OpenAIBatchProcessor` with a method for processing batches. This method takes input parameters such as the input file path, endpoint, and completion window; uploads the file; creates a batch; monitors the batch status; and retrieves the results once the batch is completed. The Python script repeatedly checks the status of the batch job until it is completed, failed, or cancelled, sleeping between checks and printing the status each time.

The benefits of becoming a patron are briefly mentioned, offering access to code files, courses, and one-on-one connections. The video also provides a sneak peek into a new master class called “Thousand X Developer,” focused on coding efficiently using AI tools. Towards the end, the video demonstrates a few minutes of coding the batch-processing logic in Python, emphasizing the use of the OpenAI Python SDK for interacting with the Batch API. The script creates a batch job that can run for up to 24 hours, repeatedly checking and printing the status until completion.

In conclusion, the video showcases the capabilities of the OpenAI Batch API for efficiently handling a large number of requests simultaneously at a reduced cost and with faster completion times. Users can leverage the web UI or Python code to interact with the API, upload batch files, monitor batch statuses, and retrieve results. The demonstration highlights the practical applications of the Batch API for processing jobs that do not require immediate responses, offering a cost-effective solution with higher rate limits and quicker turnaround times for various text generation tasks.