This week’s AI news features notable open-source models, including Tencent’s HunyuanOCR for complex text recognition, Microsoft’s Fara-7B for autonomous computer control, and DeepSeekMath-V2, which reaches gold-medal-level scores on math competitions, alongside advances in robotics and image generation. In addition, Anthropic’s Claude Opus 4.5 excels at coding tasks, Alibaba’s Z-Image offers efficient image generation and editing, and OpenAI introduces a shopping research feature in ChatGPT. Together, these releases show how quickly AI capabilities and accessibility are improving.
This week’s AI news was packed with developments across several domains. Tencent released HunyuanOCR, a compact model that accurately parses complex text from images, including tables, invoices, charts, chemical formulas, and even difficult handwriting. Despite having only 1 billion parameters, it reportedly outperforms much larger proprietary models such as Gemini 2.5 Pro and GPT-4 on OCR tasks. The model and its code are open-source, so anyone with a CUDA GPU and sufficient VRAM can run it locally. Another notable release is GeoVista, an AI agent that autonomously determines where a photo was taken by analyzing visual clues and visible text and by running web searches. This 7-billion-parameter model is open-source and performs at the top of the open alternatives.
Microsoft introduced Fara-7B, a compact 7-billion-parameter open-source AI agent that operates a computer autonomously by controlling the mouse and keyboard, much as a human user would. It can handle tasks such as shopping, booking travel, and summarizing online content, and it is small enough to run on consumer-grade hardware. Because it can run locally, it keeps user data on the device and costs far less to operate than other computer-use agents. In robotics, the RynnVLA-002 model combines vision, language, and action to control robots performing household tasks, and it adapts well even under challenging conditions. Additionally, the affordable, open-source AlohaMini robot can be 3D printed and assembled at home for around $600, and it supports both teleoperated and autonomous household chores.
In image generation, FLUX.2 was released with support for creating highly realistic images at up to 4 megapixels and for editing existing photos. However, its open-source version, FLUX.2 [dev], is large and less impressive than competitors such as Alibaba’s Qwen-Image. By contrast, Alibaba’s Tongyi Lab launched Z-Image, a compact 6-billion-parameter open-source image generator and editor that produces high-quality, realistic images quickly on consumer GPUs. Z-Image supports advanced editing features and looks set to become a leading open-source tool in this space. Another versatile image model, iMontage, accepts multiple input images and produces multiple outputs, with control over composition, style, and perspective, which enables complex editing and consistent multi-image generation.
DeepSeek made headlines again with DeepSeekMath-V2, an open-source model that achieves gold-medal-level performance on some of the world’s toughest math competitions, including the International Mathematical Olympiad and the China Mathematical Olympiad (CMO). The model uses a training approach that rewards correct step-by-step reasoning rather than only the final answer, which lets it produce rigorous proofs for complex problems. It outperforms many proprietary models and is available under the permissive Apache 2.0 license. Meanwhile, Anthropic released Claude Opus 4.5, a model optimized for coding and agentic tasks. It excels on software engineering benchmarks and in robustness, but it is more expensive and less versatile than competitors such as Google’s Gemini 3 Pro, which this roundup still considers the top overall model.
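To make the DeepSeekMath-V2 training idea mentioned above more concrete, here is a minimal Python sketch contrasting a reward based only on the final answer with a reward based on the validity of each reasoning step. This is an illustration under stated assumptions, not DeepSeek’s actual training code: the `Step` type, both reward functions, and the hard-coded `is_valid` verdicts are hypothetical stand-ins for what a learned verifier would supply.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    text: str
    is_valid: bool  # hypothetical verdict; a real system would get this from a verifier


def outcome_reward(final_answer: str, reference: str) -> float:
    """Reward that looks only at the final answer and ignores the reasoning."""
    return 1.0 if final_answer.strip() == reference.strip() else 0.0


def process_reward(steps: List[Step]) -> float:
    """Reward equal to the fraction of reasoning steps judged valid."""
    if not steps:
        return 0.0
    return sum(1 for s in steps if s.is_valid) / len(steps)


# A toy "proof" that reaches the right conclusion through one flawed step.
proof = [
    Step("Assume n is even, so n = 2k.", True),
    Step("Then n^2 = 4k^2, which is odd.", False),  # incorrect claim
    Step("Hence n^2 is even.", True),
]

answer_score = outcome_reward("n^2 is even", "n^2 is even")
reasoning_score = process_reward(proof)
```

In this toy case the answer-only reward gives full credit, while the step-level reward penalizes the flawed middle step. That gap is the motivation for grading the reasoning itself when training a model to write proofs.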
Finally, OpenAI quietly launched a shopping research feature in ChatGPT that is available even to free users. This autonomous agent helps users find suitable products by researching the web, asking clarifying questions, and delivering a personalized buyer’s guide with detailed comparisons and trade-offs. The feature is powered by a specialized GPT-5 mini model fine-tuned for shopping tasks, and it is designed to protect user privacy and to draw its results from trusted sources. Overall, this week’s news highlights advances in OCR, autonomous agents, robotics, image generation, mathematical reasoning, and practical AI applications, with many open-source releases that users can run and experiment with locally.