The video compares OpenAI’s o3 and Google’s Gemini 2.5 Pro on their ability to identify locations from challenging GeoGuessr images, with Gemini generally performing slightly better in accuracy and speed. Both models prove adept at reading environmental clues and make surprisingly precise guesses in difficult scenarios, showcasing the potential of AI in geographic and visual recognition tasks.
The video features a comparison between OpenAI’s o3 and Google’s Gemini 2.5 Pro in their ability to identify locations from images in the game GeoGuessr. The creator is inspired by a blog post in which Simon Willison had a model pinpoint a location from a simple window photo. To test the AI models, he selects a particularly difficult map called “Pain and Suffering,” known for its challenging terrain and minimal clues, and sends screenshots of its Street View images to both models for their guesses.
Throughout the experiment, the creator emphasizes the difficulty of the map and the importance of not moving or zooming in the images to keep the challenge tough. He inputs the images into both AI models with prompts asking them to do their best to guess the location. The models analyze the images, attempting to identify environmental clues such as vegetation, terrain, and signs, to make their best guesses. The creator then compares the models’ responses and guesses, noting how close or far they are from the actual locations.
The tests reveal that both models can often pick out key features like vegetation, terrain, and signs, leading to surprisingly accurate guesses even in the most difficult scenarios. For example, one guess correctly identified a Ugandan mountain park, and another pinpointed a coastal desert in Peru. The creator highlights how Gemini tends to be faster and sometimes more precise, while o3 can struggle with ambiguous images. The competition is close, with each model winning some rounds, but Gemini generally performs slightly better in accuracy.
In some cases, the creator experiments with moving the camera or facing it in a different direction to provide more clues, which improves the models’ guesses. He also discusses how the models use environmental details, such as plant types, road signs, and landscape features, to make educated guesses. The process involves a lot of trial and error: the creator manually locates each guessed location on a map to see how close the AI responses are, and they often land within a few hundred kilometers.
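The manual distance check described above can also be done numerically. The following is a minimal sketch, not the creator's method: it assumes each guess and the true location are available as latitude/longitude pairs and uses the haversine formula for great-circle distance. The function names, the model labels, and the coordinates in the usage note are hypothetical placeholders, not values from the video.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; the sphere model is an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def closest_guess(actual, guesses):
    """Return (winner, distances) where distances maps each guess name to km from actual.

    actual  -- (lat, lon) of the true location
    guesses -- dict of name -> (lat, lon); the names are arbitrary labels
    """
    distances = {name: haversine_km(*actual, *coords) for name, coords in guesses.items()}
    winner = min(distances, key=distances.get)
    return winner, distances
```

For instance, `closest_guess((0.0, 30.0), {"model_a": (1.0, 30.0), "model_b": (0.0, 33.0)})` picks `"model_a"`, since one degree of latitude (about 111 km) is a smaller miss than three degrees of longitude at the equator.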
In conclusion, the creator finds that both AI models are impressively capable of solving complex GeoGuessr challenges, with Gemini 2.5 Pro slightly edging out o3 in accuracy. He explains that these models analyze images by cropping and zooming into details, which helps them identify specific clues. The video ends by encouraging viewers to try their own image-based location guesses, highlighting the fun and potential of AI in geographic and visual recognition tasks.