The video explains that GPT-5.2, despite being advanced, disappoints many users due to overfitting and inconsistent performance across real-world tasks and benchmarks, partly because it was prematurely released under competitive pressure from Google. OpenAI plans significant improvements in 2026 to address these issues and deliver a more balanced, practical AI experience beyond just high benchmark scores.
The video discusses the widespread dissatisfaction among users, especially power users, with OpenAI’s recently released GPT-5.2 model. Despite being touted by OpenAI’s Greg Brockman as the most advanced frontier model for professional work and long-running agents, many users feel that GPT-5.2 falls short of expectations. The release came shortly after Google’s Gemini 3, which impressed many and led some users to switch from ChatGPT to Google’s AI offerings. While GPT-5.2 performs well on certain benchmarks, it has been criticized for inconsistency and underperformance on others, leading a vocal segment of users to express strong disappointment.
One key issue highlighted is the inconsistency of GPT-5.2’s performance across various benchmarks. For example, on the Simple Bench benchmark, which tests real-world understanding and reasoning through tricky questions, GPT-5.2 ranked poorly compared to competitors like Gemini 3 and earlier OpenAI models. However, on the EQ Bench, which measures emotional intelligence and conversational ability, GPT-5.2 performed relatively well, ranking third. This disparity has confused users and analysts, suggesting that the model excels in some areas but struggles significantly in others, particularly in nuanced reasoning and real-world application.
A major theory for GPT-5.2’s shortcomings is overfitting, where the model has been trained too heavily on specific benchmark tests, causing it to memorize answers rather than truly understand concepts. This leads to excellent benchmark scores on familiar tests but poor performance on new, unseen data or real-world tasks. The video compares this to a student who memorizes last year’s exam answers but fails when questions change slightly. Overfitting results in a model that appears smart in controlled tests but lacks the generalization and adaptability needed for practical use, which frustrates users expecting more robust AI capabilities.
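The overfitting idea can be illustrated with a toy curve-fitting sketch (using NumPy; the data and model here are purely illustrative and have nothing to do with how GPT models are actually trained): a high-capacity polynomial that interpolates every noisy training point achieves near-zero training error, exactly like acing a memorized exam, while a lower-capacity fit captures the underlying trend instead of the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a simple underlying function (y = sin x)
x_train = np.linspace(0, 3, 8)
y_train = np.sin(x_train) + rng.normal(0, 0.1, size=x_train.shape)
x_test = np.linspace(0.1, 2.9, 50)
y_test = np.sin(x_test)

def fit_and_score(degree):
    # Fit a polynomial of the given degree and report train/test MSE
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

simple_train, simple_test = fit_and_score(3)    # modest capacity
overfit_train, overfit_test = fit_and_score(7)  # enough capacity to memorize all 8 points

# The degree-7 fit passes (almost) exactly through every training point,
# so its training error is near zero -- the "memorized benchmark" case --
# yet its test error on unseen points stays far above that.
```

The degree-7 model is the benchmark-overfit analogue: its training score looks perfect precisely because it had enough capacity to memorize the data, which tells you little about performance on inputs it has not seen.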
The video also touches on the competitive pressure OpenAI faced from Google, which may have led to the premature release of GPT-5.2. According to reports, OpenAI released an early checkpoint of the model rather than the fully polished version, aiming to maintain its leadership in the AI space amid rising competition. This rushed release likely contributed to the model’s uneven performance and user dissatisfaction. Despite this, OpenAI plans to improve the model significantly in early 2026, with updates targeting both enterprise and consumer needs and promising a more balanced, capable AI experience.
In conclusion, while GPT-5.2 is a technically advanced model, its current iteration suffers from overfitting and inconsistent real-world performance, leading to disappointment among power users. The model was released under competitive pressure and is not the final version OpenAI intends to offer. Future updates expected in 2026 aim to address these issues, improving both the model’s intelligence and usability. The video emphasizes that benchmark scores do not always translate to practical effectiveness, and true AI progress requires models that generalize well beyond standardized tests.