This week's AI highlights include Light-A-Video, which changes the lighting and background of existing videos, and Magic 1-For-1, a fast video generator that creates one-minute clips in under a minute. MedRAX analyzes chest X-rays, Perplexity’s Deep Research compiles web sources into structured reports that answer user queries, and other releases include Alibaba's InspireMusic for music generation and On-device Sora for video generation on smartphones.
This week in AI has been particularly eventful, with a range of new tools and models. One standout is Light-A-Video, which lets users change the lighting and background of a video without re-recording it. The tool breaks the video into individual frames and uses a diffusion model to generate new lighting from a user prompt, which makes it useful for product commercials and creative projects. The code is available on GitHub, so users can run it locally for free.
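The frame-by-frame approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tool's actual code: the `relight_frame` function is a hypothetical stand-in that only scales brightness where the real tool would call a diffusion model, and the names `Frame`, `relight_video`, and `gain` are invented for this example.

```python
from typing import Callable, List

# One grayscale frame: rows of pixel intensities in [0, 1].
Frame = List[List[float]]


def relight_frame(frame: Frame, gain: float) -> Frame:
    """Hypothetical stand-in for the diffusion model.

    Scales every pixel by `gain` and clamps the result to [0, 1].
    A real relighting model would be conditioned on a text prompt instead.
    """
    return [[min(1.0, max(0.0, px * gain)) for px in row] for row in frame]


def relight_video(frames: List[Frame], relight: Callable[[Frame], Frame]) -> List[Frame]:
    """Apply a per-frame relighting function, keeping the original frame order."""
    return [relight(frame) for frame in frames]


# A toy three-frame "video"; each frame is 2x2 pixels.
video = [
    [[0.2, 0.4], [0.6, 0.8]],
    [[0.1, 0.3], [0.5, 0.7]],
    [[0.0, 0.5], [0.5, 1.0]],
]

# Brighten every frame by 50% without modifying the source frames.
brighter = relight_video(video, lambda f: relight_frame(f, gain=1.5))
```

Processing frames independently like this is the simplest possible design. It says nothing about how a real system keeps the lighting consistent from one frame to the next, which this sketch does not attempt.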
Another notable release is Magic 1-For-1, a fast video generator that can create a one-minute video in under a minute. The model works in two stages: it first generates an image from a text prompt, then generates a video from that image. Combined with a technique called step distillation, this significantly speeds up generation. The results are detailed and consistent across various styles, including realistic human movement and intricate scenes. The code for Magic 1-For-1 is also available for users to experiment with.
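The two-stage flow described above (text to image, then image to video) can be sketched as a simple composition of two functions. This is only an illustrative sketch: `text_to_image`, `image_to_video`, and `generate_video` are hypothetical names, and both stages are toy stubs rather than real models. Step distillation is not modeled here at all.

```python
from typing import Callable, List

Image = List[List[float]]  # a tiny grayscale image
Video = List[Image]        # a video is an ordered list of frames


def text_to_image(prompt: str) -> Image:
    """Stub for stage one: derive a flat 2x2 image from the prompt length."""
    value = (len(prompt) % 10) / 10
    return [[value, value], [value, value]]


def image_to_video(image: Image, num_frames: int) -> Video:
    """Stub for stage two: repeat a copy of the image as every frame."""
    return [[row[:] for row in image] for _ in range(num_frames)]


def generate_video(
    prompt: str,
    num_frames: int,
    t2i: Callable[[str], Image] = text_to_image,
    i2v: Callable[[Image, int], Video] = image_to_video,
) -> Video:
    """Run the two stages in sequence: prompt -> first image -> video."""
    first_image = t2i(prompt)
    return i2v(first_image, num_frames)


clip = generate_video("a cat surfing", num_frames=4)
```

Passing the two stages in as callables keeps them swappable, so either stub could be replaced by a real model without changing `generate_video`.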
In the medical field, MedRAX has emerged as a capable AI assistant for analyzing chest X-ray images. Users upload X-ray scans and then question the AI about them in a chatbot format, including technical questions about what the images show. MedRAX reportedly outperforms other AI models at X-ray analysis, and its code and dataset are open-source, which makes it accessible for further research and development.
Perplexity has introduced a feature called Deep Research, an AI agent that gathers relevant information from the web and presents it as a structured report answering the user's query. This makes it useful for academic and research work. Deep Research has been benchmarked against other AI models on accuracy and factuality, which matters for users who need reliable answers.
Lastly, several new generation tools have been unveiled. Goku is a video generator that can create complex scenes with realistic movement and camera control. InspireMusic by Alibaba generates music from text or audio prompts, with plans to add lyrics and singing. Together with On-device Sora, which runs video generation locally on smartphones, these releases show how quickly AI tools are becoming more accessible and versatile.