Every AI Model Thinks the Same (Here's Why)

The video explains that major AI language models, even from different companies, tend to give very similar answers to open-ended questions due to overlapping training data and alignment processes. This growing uniformity raises concerns that AI may stifle creativity by funneling users toward the same limited set of ideas.

Digging into the details, the creator points out that when you ask models like ChatGPT, Claude, or Gemini to write a metaphor about time, they all gravitate toward the same imagery, most commonly comparing time to a river. This isn't a coincidence: AI researchers have increasingly suspected that language models are converging on similar outputs despite their apparent differences.

To investigate this, researchers from Carnegie Mellon, the Allen Institute for AI, and Stanford conducted a large-scale study using a dataset called Infinity Chat, which contains 26,000 real, open-ended questions that users asked AI chatbots. Because these questions have no single correct answer, they are well suited to testing creativity. The researchers had 70 different AI models answer the same questions 50 times each, with the temperature (randomness) setting turned up to encourage creative and varied responses.
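
As a rough illustration of that setup, the sampling loop amounts to asking every model the same prompt many times at a high temperature. The study's actual harness isn't described in the video, so the `query_model` helper below is a hypothetical stand-in for whichever provider API each model uses, and `temperature=1.0` is an assumed value (the video only says the setting was "turned up"):

```python
from collections import defaultdict

def query_model(model_name: str, prompt: str, temperature: float) -> str:
    """Hypothetical stand-in: wire this to each provider's chat API."""
    raise NotImplementedError

def sample_answers(models, prompts, n_samples=50, temperature=1.0):
    """For each (model, prompt) pair, draw n_samples independent
    completions at high temperature to encourage varied answers."""
    answers = defaultdict(list)  # (model, prompt) -> list of responses
    for model in models:
        for prompt in prompts:
            for _ in range(n_samples):
                answers[(model, prompt)].append(
                    query_model(model, prompt, temperature)
                )
    return answers
```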

The results were surprising: even with maximum randomness, a single model's repeated answers to the same question were more than 80% similar to one another in nearly 80% of cases. Comparing answers across different models, similarity still ranged from 71% to 82%. In other words, individual models repeat themselves (intra-model repetition), and different models also generate nearly identical responses (inter-model repetition), even when they are built by different organizations and trained on different data.
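
To make the intra- versus inter-model distinction concrete, here is a minimal sketch of how such similarity scores can be computed, assuming the responses have already been collected per model. It embeds each answer with an off-the-shelf sentence encoder (the "all-MiniLM-L6-v2" model is an arbitrary choice) and averages pairwise cosine similarity; the paper's exact metric may differ, so treat this as one plausible instantiation, not the study's method:

```python
import numpy as np
from itertools import combinations
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

def mean_pairwise_cosine(embeddings: np.ndarray) -> float:
    """Average cosine similarity over all pairs of rows (needs >= 2 rows)."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = len(sims)
    # Exclude the diagonal (self-similarity is always 1.0).
    return float((sims.sum() - n) / (n * (n - 1)))

def intra_model_similarity(answers: dict[str, list[str]]) -> dict[str, float]:
    """How much does each model repeat itself across its own samples?"""
    return {
        name: mean_pairwise_cosine(encoder.encode(samples))
        for name, samples in answers.items()
    }

def inter_model_similarity(answers: dict[str, list[str]]) -> float:
    """Average similarity between answers drawn from *different* models,
    comparing each model's mean (centroid) embedding against the others."""
    centroids = [
        encoder.encode(samples).mean(axis=0)
        for samples in answers.values()
    ]
    sims = [
        float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        for a, b in combinations(centroids, 2)
    ]
    return float(np.mean(sims))
```

High intra-model values indicate self-repetition; high inter-model values indicate the cross-model "hive mind" effect the study reports.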

The video argues that this homogeneity is not a simple technical issue that can be fixed with prompt engineering or tweaking settings. Instead, it is a side effect of how these models are trained, possibly due to overlapping training data, alignment processes that push models toward average responses, contamination from synthetic data, or even cross-training between models from different regions. The concern is that if AI-generated outputs are used as training data for future models, this sameness will only intensify, creating a feedback loop that further narrows the diversity of ideas.

This lack of genuine creativity matters because a significant portion of AI usage involves seeking inspiration, brainstorming, and generating new ideas. If all AI systems funnel users toward the same limited set of responses, they risk stifling human creativity rather than enhancing it. The researchers conclude that while today’s language models are excellent at sounding correct, they may be fundamentally incapable of true creativity. Whether this is a major problem or just a limitation to be aware of remains to be seen, but it’s clear that the “hive mind” effect in AI is real and growing.