I tried Vibe Physics. This is what I learned

artesia · 27 August 2025 13:16

The video examines how well several AI large language models handle the Navier-Stokes blowup problem in physics. It finds that while models like GPT-5 can understand and discuss the topic to some extent, none can yet produce genuinely new or reliable theoretical insights. The creator concludes that AI currently serves best as a tool for literature review and brainstorming rather than original research, and also briefly promotes a privacy protection service called Incogni.

artesia · 28 August 2025 00:53

In the video, the creator explores whether AI large language models (LLMs) can help generate new physics theories, focusing on a challenging problem in fluid dynamics: whether solutions of the Navier-Stokes equation can exhibit blowups, or singularities. The Navier-Stokes equation describes the flow of liquids and gases, and the millennium problem asks whether solutions can develop singularities from regular initial conditions and finite forces. The creator shares a personal, long-held idea: link solutions of Einstein’s equations in general relativity to the Navier-Stokes equation, then use Penrose’s singularity theorem to prove blowups. The video tests how well various AI models can engage with this complex topic.
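
For reference, the incompressible Navier-Stokes equations are usually written as below. This is the standard textbook form, not something quoted from the video. The symbols (velocity field u, pressure p, kinematic viscosity ν, external force f, initial velocity u_0, density normalized to 1) are the usual conventions and are assumptions here.

```latex
\begin{aligned}
\partial_t \mathbf{u} + (\mathbf{u}\cdot\nabla)\mathbf{u} &= -\nabla p + \nu\,\Delta\mathbf{u} + \mathbf{f}, \\
\nabla\cdot\mathbf{u} &= 0, \qquad \mathbf{u}(\mathbf{x},0) = \mathbf{u}_0(\mathbf{x}).
\end{aligned}
```

In this notation, a blowup means that a solution starting from smooth initial data u_0 stops being smooth at some finite time T*, for example because the maximum of |u| grows without bound as t approaches T*.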

Four AI models are tested: GPT-5, Claude Opus 4.1, Grok 4, and Gemini Pro Ultra Deep Think. GPT-5 shows the best understanding, roughly grasping the problem and suggesting reasonable steps, though it makes some mistakes about the problem’s constraints. Claude Opus 4.1 is the fastest but produces verbose, low-quality responses with fundamental misunderstandings. Grok 4 initially confuses the problem but then offers a simple pseudo-code outline, which is interesting but not practically useful. Gemini models struggle with self-confidence and conceptual errors, often misunderstanding key physics concepts and ultimately concluding the problem is infeasible.

The creator highlights common issues across all models: they frequently confuse similar but distinct physical concepts (e.g., energy vs. free energy), switch notation or topics mid-response, and fail to develop genuinely new ideas. Instead, they tend to generate plausible-sounding but ultimately unproven or incorrect arguments. Although the models can admit when they have fabricated information, their outputs are often unreliable for novel theoretical work. The creator concludes that current AI models are better suited to literature review, background research, and brainstorming than to original physics research.

Overall, the verdict is that AI is not yet capable of replacing a good physics student or researcher in developing new theories. The models excel at summarizing and explaining existing knowledge but fall short in the abstract mathematical reasoning and creative problem-solving that cutting-edge physics requires. Thus, physicists’ jobs remain safe from AI disruption for now. The creator encourages using AI tools cautiously and primarily as aids for gathering and critiquing existing information.

Finally, the video briefly shifts to a sponsor message about Incogni, a service that helps protect personal data privacy by automating requests to remove personal information from data brokers. The creator shares personal experience with privacy issues and recommends Incogni as an easy way to safeguard one’s data online, offering a discount code for viewers. The video ends with thanks and a promise of more content to come.