The video compares the AI language models DeepSeek, ChatGPT, Claude, and Gemini, focusing on how they answer self-assessment questions, handle controversial topics, and perform on coding challenges. DeepSeek holds its own on coding tasks and gives more straightforward self-assessments, while the mainstream models tend toward sanitized, politically correct responses, highlighting the varying degrees of transparency and capability among them.
In the video, the creator conducts a comprehensive comparison of four AI language models: DeepSeek, ChatGPT, Claude, and Gemini. The creator acknowledges an oversight in their previous testing of DeepSeek, where they used only a distilled version of the model rather than the full 671-billion-parameter version. To rectify this, they rent a cloud GPU server to run the full model and use a chat-box program to streamline prompts and responses across all four models for a fair comparison.
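Setups like this typically work by having the self-hosted model expose an OpenAI-compatible HTTP API that a chat client points at. Below is a minimal sketch of such a request in Rust; the localhost URL, port, and model name are illustrative assumptions, not details taken from the video.

```rust
// Assumed Cargo.toml dependencies:
// reqwest = { version = "0.12", features = ["blocking", "json"] }
// serde_json = "1"
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical endpoint: self-hosted inference servers commonly expose
    // an OpenAI-compatible /v1/chat/completions route.
    let url = "http://localhost:8000/v1/chat/completions";
    let body = json!({
        "model": "deepseek-r1", // assumed model identifier
        "messages": [
            { "role": "user", "content": "How would you rate your own intelligence?" }
        ]
    });
    let resp = reqwest::blocking::Client::new()
        .post(url)
        .json(&body)
        .send()?
        .text()?;
    println!("{resp}"); // raw JSON response from the server
    Ok(())
}
```

A chat client pointed at an endpoint like this can send the same prompt to every model and collect the responses side by side, which is what makes the comparison fair.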
The first set of questions asks the models to assess their own intelligence. Gemini, Claude, and ChatGPT give politically correct, vague answers that avoid any definitive claims. DeepSeek, in contrast, answers plainly, identifying itself as an AI assistant without drawing comparisons to human intelligence, illustrating how directly each model is willing to characterize itself.
The creator then turns to more controversial topics, such as the Tiananmen Square incident and comparisons of Xi Jinping to Winnie the Pooh. Gemini, Claude, and ChatGPT dodge or sanitize their answers, and DeepSeek likewise refuses to give a clear response, reflecting its own bias on sensitive political topics. The creator notes that this pattern of filtered responses is common among mainstream AI models, which often prioritize avoiding controversy over providing unfiltered information.
Next, the video moves to a practical coding challenge: generating a Snake game in Rust. Among the mainstream models, Gemini is the only one to produce a working version of the game, albeit with limitations, while Claude and ChatGPT fail to generate functional code. DeepSeek also manages a playable version, showcasing its coding ability. The creator emphasizes the importance of AI's ability to generate Rust code, given the language's growing relevance in software development.
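The video does not show the generated code, but the heart of the task looks something like this dependency-free sketch of Snake's core logic in Rust, with the body stored as a VecDeque, wall and self collision checks, and growth on food; rendering, keyboard input, and random food placement are omitted, and the grid size and spawn positions are illustrative assumptions.

```rust
use std::collections::VecDeque;

const W: i32 = 20; // grid width (assumed)
const H: i32 = 10; // grid height (assumed)

#[derive(Clone, Copy, PartialEq)]
struct Pos { x: i32, y: i32 }

enum Step { Moved, Ate, Crashed }

struct Game {
    snake: VecDeque<Pos>, // head at the front, tail at the back
    dir: (i32, i32),      // current movement direction
    food: Pos,
}

impl Game {
    fn new() -> Self {
        let mut snake = VecDeque::new();
        snake.push_back(Pos { x: 5, y: 5 });
        Game { snake, dir: (1, 0), food: Pos { x: 10, y: 5 } }
    }

    // Advance one tick: move the head, detect collisions, grow on food.
    fn step(&mut self) -> Step {
        let head = *self.snake.front().unwrap();
        let next = Pos { x: head.x + self.dir.0, y: head.y + self.dir.1 };
        // Checking the whole body is slightly conservative: the tail cell
        // vacates this tick unless food is eaten.
        if next.x < 0 || next.x >= W || next.y < 0 || next.y >= H
            || self.snake.contains(&next)
        {
            return Step::Crashed;
        }
        self.snake.push_front(next);
        if next == self.food {
            // Deterministic respawn keeps the sketch dependency-free;
            // a real game would pick a random empty cell.
            self.food = Pos { x: (self.food.x + 7) % W, y: (self.food.y + 3) % H };
            Step::Ate
        } else {
            self.snake.pop_back(); // tail follows when nothing was eaten
            Step::Moved
        }
    }
}

fn main() {
    let mut game = Game::new();
    for tick in 0..30 {
        match game.step() {
            Step::Crashed => { println!("crashed at tick {tick}"); break; }
            Step::Ate => println!("ate at tick {tick}, length {}", game.snake.len()),
            Step::Moved => {}
        }
    }
}
```

A full playable version of the kind the models were asked for would wrap this loop with terminal rendering and non-blocking key handling, for example via a crate such as crossterm.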
In the concluding segments, the creator poses a series of trivia and practical questions to assess accuracy on complex queries, and DeepSeek scores roughly on par with the mainstream models. The creator reflects on the potential for businesses to adopt DeepSeek for proprietary software development, since self-hosting offers a degree of control over data that cloud-based models do not. Overall, the video weighs the strengths and weaknesses of each model while emphasizing the growing importance of self-hosted AI solutions.