AI edits videos, full body transfer, insane 3D models, new TTS, Suno v5, new image models - AI NEWS

This week’s AI news highlights new tools such as Alibaba’s Wan Animate for full-body video transfer, Tencent’s Hunyuan3D 3.0 for highly detailed 3D modeling, and text-to-speech models such as VoxCPM and FireRedTTS, alongside improvements in image generation and editing from Tencent’s SRPO and ByteDance’s UMO. Google also integrated Gemini into Chrome for in-browser assistance, while Alibaba’s Tongyi DeepResearch and the Wuji robotic hand show how quickly AI capabilities are advancing across creative and practical fields.

This week in AI was packed with releases and updates across multiple domains, from video editing and 3D modeling to text-to-speech and image generation. Alibaba introduced Wan Animate, an open-source video generator that transfers full-body movements, facial expressions, and hand gestures from a source video onto another character while preserving backgrounds and lighting. It supports a range of animation styles and non-human creatures, and offers higher accuracy and quality than existing solutions. Users can run Wan Animate locally, though it requires significant VRAM; compressed versions are in development to improve accessibility.

Another impressive video editing tool is Lucy Edit by Decart, which lets users edit videos with natural language prompts. This free, open-source model handles micro-edits such as changing a character’s appearance or clothing, and it runs offline through a ComfyUI workflow. In 3D modeling, Tencent’s Hunyuan3D 3.0 stands out as the best AI 3D model generator currently available, producing ultra-high-definition models with realistic textures and poses from one or more images. It predicts the unseen parts of a character, which makes its output highly accurate, and it is easy to use. New users receive free credits.

On the image generation front, Tencent released SRPO, an enhanced version of the FLUX model that significantly improves realism and aesthetic quality, producing images that look almost indistinguishable from real photos. Another notable image tool is UMO by ByteDance, which excels at style and reference transfer, letting users generate photos that combine multiple characters or objects in various settings. Both SRPO and UMO come with open-source code and Hugging Face Spaces for online trials, with workflows available for local use. Additionally, Reve updated its image editor to allow precise micro-edits of individual objects within an image, with a positional control feature that sets it apart from other top image editors such as Nano Banana and Seedream.

In text-to-speech and voice cloning, several state-of-the-art models were released. VoxCPM can clone a voice, including its accent and emotion, from just a few seconds of audio, and can generate speech in other languages with an appropriate accent. FireRedTTS supports multiple speakers and multilingual output, and is available through a free Hugging Face Space or a local installation. Both models produce highly natural, expressive speech. Meanwhile, Suno teased version 5 of its AI music generator, promising more dynamic and human-like vocals soon.

Finally, Google integrated Gemini directly into Google Chrome, letting users interact with the AI assistant inside the browser for tasks such as summarizing videos, managing emails, and searching their browsing history. This agentic browser feature can autonomously carry out complex tasks such as purchasing items referenced in emails, and Google plans to expand its functionality in the coming months. Other notable developments include Alibaba’s Tongyi DeepResearch, an efficient open-source deep research agent that rivals proprietary models, and the Wuji robotic hand, which mimics human dexterity with high precision. Overall, this week’s releases show rapid progress in making AI tools more powerful, accessible, and versatile across creative and practical domains.