Google's voice assistant, Kling 1.5, New top open-source model, AI understands whales

This week in AI news, significant advancements include the release of Kling 1.5, a video generation tool with enhanced realism and a new motion brush feature, and Google’s Gemini voice assistant, which allows for natural conversation on Android devices. Additionally, Alibaba’s Qwen 2.5 model outperforms larger models, Google developed an AI to recognize whale sounds, and AI agents in a Minecraft simulation have shown complex social behaviors, highlighting the rapid evolution of AI technology.

The headline release this week is Kling 1.5, a powerful video generation tool that now supports 1080p video quality in professional mode. The new version boasts improved prompt following and coherence, making it a top contender in the AI video generation space. Users have reported impressive realism in the generated videos, which depict human emotions and complex scenes in remarkable detail. Kling 1.5 also introduces a feature called motion brush, which lets users control the movement of objects within the generated videos, further enhancing creative possibilities.

In addition to Kling 1.5, Google has rolled out its Gemini voice assistant, now available for free to some Android users. This real-time voice assistant allows for natural conversation and offers ten different voice options. Users can interact with Gemini Live through voice or text, even when their phones are locked. This rollout positions Google ahead of OpenAI, which has yet to release its advanced voice feature. The assistant’s capabilities have sparked interest, especially its potential for hands-free use and background operation.

Another noteworthy development is the introduction of Qwen 2.5, an open-source AI model from Alibaba that has been recognized as a leading open-source model. Despite having only 72 billion parameters, Qwen 2.5 outperforms larger models like Llama 3.1 and even competes with GPT-4 in specific benchmarks. Its efficiency and performance make it a viable option for developers, as it can run on high-end consumer hardware without needing internet access. The model supports a wide range of languages and offers a cost-effective alternative to other AI models, making it an attractive choice for various applications.

The video also highlights a new AI model developed by Google that can recognize whale sounds, marking a significant step in understanding whale communication. This AI can classify vocalizations from eight different whale species and has even identified a mysterious sound known as the “Biotwang” as being produced by the elusive Bryde’s whale. This technology could pave the way for further research into animal communication, potentially extending to other species in the future.

Lastly, the video discusses the emergence of AI agents in a Minecraft simulation that have developed their own relationships, even getting married. This experiment showcases the potential for AI to exhibit complex social behaviors and emotional connections. Additionally, YouTube has introduced new AI features for creators, including text-to-video generation and automatic dubbing, aimed at enhancing content creation and accessibility. Overall, the advancements in AI this week demonstrate the rapid evolution of technology and its applications across various fields.