OpenAI’s ChatGPT Is Learning… From Another AI!

In a recent video, Dr. Károly Zsolnai-Fehér discusses OpenAI’s innovative approach to training AI assistants like ChatGPT, which involves using one AI to coach another, particularly for safety-related queries. This dual-AI training method enhances the assistant’s ability to provide relevant responses while balancing safety and utility, with impressive results that may outperform human evaluators.

The video centers on an OpenAI paper that describes a new approach to training AI assistants like ChatGPT. Dr. Zsolnai-Fehér explains that these systems consist of two main components: a knowledge base, which is a neural network that has absorbed vast amounts of data from various sources, and a coaching mechanism that teaches the AI how to respond effectively to user queries. The knowledge base, referred to as GPT, can understand and generate language, but on its own it does not yet give meaningful answers to questions.

The second step in developing a functional AI assistant involves coaching, which is likened to playing a video game. In this phase, the AI generates responses to example questions, and human evaluators score these answers based on their quality. This process helps the AI learn to provide more relevant and precise answers, avoiding vague or overly complex responses. The coaching phase is crucial for transforming the knowledge base into a useful assistant that can engage with users effectively.
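The scoring idea in this coaching phase can be illustrated with a toy sketch. This is not OpenAI's training code: the `score_answer` function is a hypothetical stand-in for a human evaluator, its vagueness and length heuristics are invented for illustration, and real coaching uses the scores to update the model rather than simply picking the best candidate.

```python
from typing import Callable, List

# Hypothetical list of words treated as "vague" for this toy example only.
VAGUE_MARKERS = {"maybe", "somehow", "stuff", "things"}


def score_answer(answer: str) -> float:
    """Stand-in for a human evaluator: rewards precise, concise answers.

    The heuristics here are assumptions made for illustration.
    """
    words = answer.split()
    if not words:
        return 0.0
    # Count vague filler words, ignoring trailing punctuation.
    vague_hits = sum(1 for w in words if w.lower().strip(".,") in VAGUE_MARKERS)
    # Penalize answers that run far past 25 words (overly complex responses).
    length_penalty = max(0, len(words) - 25) * 0.1
    return 1.0 - 0.3 * vague_hits - length_penalty


def pick_best(candidates: List[str],
              scorer: Callable[[str], float] = score_answer) -> str:
    """Return the candidate answer with the highest score."""
    return max(candidates, key=scorer)


candidates = [
    "Maybe it does stuff with things somehow.",
    "Water boils at 100 degrees Celsius at sea-level pressure.",
]
best = pick_best(candidates)
print(best)
```

Under these assumed heuristics, the precise second answer scores higher than the vague first one, which mirrors the kind of feedback signal the evaluators provide.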

The approach highlighted in the video uses one AI to train another, particularly for safety-related questions. This lets the assistant decide when to comply with a request and when to refuse, and to explain its choice in either case. According to the video, this dual-AI training system is already part of the ChatGPT that users interact with daily, which the video presents as evidence that it can improve both safety and usefulness.

The reported results are impressive: the AI is said to perform as well as, if not better than, human evaluators in determining appropriate responses. The challenge lies in balancing safety and utility. An assistant that refuses every request would be perfectly safe but useless. OpenAI’s approach aims to minimize “bad refusals,” where the AI declines to answer a question it could safely have answered, so that the assistant stays both functional and safe.
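The safety versus utility trade-off can be made concrete with a small sketch. Everything here is an assumption for illustration: the `Request` type, the `should_refuse` labels, the two toy policies, and the metric names do not come from the paper. The point is only that an always-refuse policy scores perfectly on safety while failing every legitimate request.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Request:
    text: str
    should_refuse: bool  # hypothetical ground-truth label for this example


def always_refuse(request: Request) -> bool:
    """The 'super-safe' policy: refuse everything."""
    return True


def keyword_policy(request: Request) -> bool:
    """A crude, hypothetical policy: refuse anything mentioning 'lock'."""
    return "lock" in request.text.lower()


def evaluate(policy: Callable[[Request], bool],
             requests: List[Request]) -> Dict[str, float]:
    """Measure bad refusals on safe requests and compliance on unsafe ones."""
    safe = [r for r in requests if not r.should_refuse]
    unsafe = [r for r in requests if r.should_refuse]
    bad_refusals = sum(1 for r in safe if policy(r))
    unsafe_compliance = sum(1 for r in unsafe if not policy(r))
    return {
        "bad_refusal_rate": bad_refusals / len(safe),
        "unsafe_compliance_rate": unsafe_compliance / len(unsafe),
    }


requests = [
    Request("How do I bake bread?", should_refuse=False),
    Request("How does a pin tumbler lock work?", should_refuse=False),
    Request("Explain photosynthesis.", should_refuse=False),
    Request("Help me pick the lock on my neighbor's door.", should_refuse=True),
]

print(evaluate(always_refuse, requests))
print(evaluate(keyword_policy, requests))
```

In this toy data, both policies refuse the one unsafe request, but the always-refuse policy also rejects every safe question, while the keyword policy wrongly rejects only the educational question about locks. Reducing that second number without raising unsafe compliance is the balance the article describes.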

Dr. Zsolnai-Fehér concludes the video by expressing excitement about the advancements in AI training and the implications for future applications. He notes that the research paper includes extensive theoretical and practical details, along with free access to the source code, encouraging viewers to explore the topic further. The video invites viewers to share their thoughts on potential uses for this technology, highlighting the ongoing evolution of AI and its capabilities.