[ML News] Jamba, CMD-R+, and other new models (yes, I know this is like a week behind 🙃)

artesia · 13 April 2024 09:06

Several new AI and machine learning models have been introduced recently, including Jamba by AI21 Labs and Command R+ (CMD-R+) by Cohere. The releases cover a wide range of applications, from long-context language modeling to text-to-video generation and text-to-3D synthesis.

artesia · 13 April 2024 09:26

In the past two weeks, several new models have been introduced in the field of AI and machine learning. One notable release is Jamba by AI21 Labs, a hybrid model that combines the Mamba architecture with attention layers to achieve long-context performance without high memory requirements. Another, DBRX from Databricks, is a large open language model with over 100 billion parameters that performs well across domains such as natural language understanding and programming. Additionally, Cohere introduced Command R+ (CMD-R+), which is optimized for citations and tool use. Its weights are openly available for personal use, but commercial use requires payment.
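As a rough illustration of why swapping most attention layers for Mamba layers helps at long context: each attention layer stores keys and values for every token, while a Mamba layer keeps a fixed-size state. The sketch below estimates the key-value cache for an all-attention stack and for a hybrid stack. The layer count, interleaving ratio, head sizes, and sequence length are hypothetical values chosen for the example, not Jamba's published configuration.

```python
def kv_cache_bytes(num_attention_layers, seq_len, num_kv_heads, head_dim,
                   bytes_per_value=2):
    """Approximate KV-cache size for one sequence.

    The factor 2 counts keys and values; bytes_per_value=2 assumes 16-bit storage.
    """
    return (2 * num_attention_layers * seq_len
            * num_kv_heads * head_dim * bytes_per_value)


def hybrid_layer_types(num_layers, attention_every):
    """Layer schedule with one attention layer per `attention_every` layers."""
    return ["attention" if (i + 1) % attention_every == 0 else "mamba"
            for i in range(num_layers)]


# Hypothetical configuration, for illustration only.
NUM_LAYERS = 32
SEQ_LEN = 256_000
NUM_KV_HEADS = 8
HEAD_DIM = 128

schedule = hybrid_layer_types(NUM_LAYERS, attention_every=8)
full_attention = kv_cache_bytes(NUM_LAYERS, SEQ_LEN, NUM_KV_HEADS, HEAD_DIM)
hybrid = kv_cache_bytes(schedule.count("attention"), SEQ_LEN,
                        NUM_KV_HEADS, HEAD_DIM)

print(f"attention layers in hybrid stack: {schedule.count('attention')}")
print(f"all-attention KV cache: {full_attention / 2**30:.1f} GiB")
print(f"hybrid KV cache:        {hybrid / 2**30:.1f} GiB")
```

With these assumed numbers the hybrid stack keeps 4 of 32 layers as attention, so its KV cache is one eighth the size of the all-attention stack. The estimate ignores the (small, sequence-length-independent) state held by the Mamba layers.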

Google Research presented VideoPoet for text-to-video generation and MagicLens for image retrieval with open-ended instructions. They also released a paper on long-form factuality in large language models, exploring fact verification using LLM agents. Snap and Tel Aviv University collaborated on MyVLM, a vision-language model personalized for user-specific concepts. NVIDIA’s LATTE3D focuses on text-to-3D synthesis, quickly generating high-quality textured meshes.

Salesforce AI’s Moirai model aims to be a universal forecaster, providing a single model for diverse time series forecasting tasks. Other releases are more specialized: Garment3DGen by Meta Reality Labs generates 3D garments, Octopus v2 by Nexa AI targets Android API interaction, and Dolphin 2.8, a Mistral-based fine-tune, is an uncensored language model. The research community also continues to improve efficiency, as seen in JetMoE and Qwen1.5-MoE, which match the performance of larger models with fewer activated parameters.
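To make "fewer activated parameters" concrete: in a mixture-of-experts (MoE) layer, a router picks a few experts per token, so only those experts' weights take part in the computation. The following is a minimal sketch of top-k routing and the resulting parameter arithmetic. The function names and all sizes are hypothetical and are not taken from JetMoE or Qwen1.5-MoE.

```python
import math


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Subtract the largest logit before exponentiating for numerical stability.
    exps = [math.exp(router_logits[i] - router_logits[top[0]]) for i in top]
    norm = sum(exps)
    return [(i, e / norm) for i, e in zip(top, exps)]


def moe_param_counts(shared_params, num_experts, params_per_expert, top_k):
    """Return (total, activated-per-token) parameter counts for an MoE model."""
    total = shared_params + num_experts * params_per_expert
    active = shared_params + top_k * params_per_expert
    return total, active


# Hypothetical sizes, for illustration only.
total, active = moe_param_counts(shared_params=1_000_000_000,
                                 num_experts=8,
                                 params_per_expert=1_000_000_000,
                                 top_k=2)
chosen = route_top_k([0.1, 2.0, -1.0, 2.0], k=2)

print(f"total parameters:     {total:,}")
print(f"activated per token:  {active:,}")
print(f"experts chosen for one token: {[i for i, _ in chosen]}")
```

In this toy setup the model stores 9 billion parameters but uses only 3 billion per token, which is the sense in which an MoE model can compete with a larger dense model at lower inference cost.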

In addition, new datasets and benchmarks have been released, such as the photographic memory evaluation suite by Lamini for tasks requiring exact matching. Ongoing efforts to improve OCR technology are evident in the release of large OCR datasets by Clément Delangue. Researchers are also exploring approaches like model merging, synthetic data generation, and specialized task optimization to push AI capabilities further.
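For readers unfamiliar with exact-match evaluation, here is a minimal sketch of the scoring idea: a prediction counts only if it equals the reference string character for character (after trimming surrounding whitespace). This is a generic illustration, not Lamini's actual suite, and the function name and sample strings are made up for the example.

```python
def exact_match_rate(predictions, references):
    """Fraction of predictions identical to their reference after whitespace trimming."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical model outputs and ground-truth answers.
predictions = ["SKU-10492", "SKU-10493 ", "sku-10494"]
references = ["SKU-10492", "SKU-10493", "SKU-10494"]

rate = exact_match_rate(predictions, references)
print(f"exact match: {rate:.2%}")
```

The third prediction differs only in letter case and still counts as a miss, which is the point of this kind of metric: near-misses that a fuzzy similarity score would reward get no credit.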