NEW Open-Source LLM Tops The Rankings...But Is It Actually Good?

merefield · 9 April 2024 15:01

A new foundation model called Command R and its more advanced version, Command R Plus, have been introduced, specializing in RAG (retrieval-augmented generation) and tool use for large-scale production workloads. Command R performs well in real-world use cases like document assistance and workplace support and excels at long-context retrieval, but it struggles with complex reasoning and mathematical computations.

merefield · 9 April 2024 15:01

Cohere recently introduced a new foundation model called Command R, along with an even more advanced version called Command R Plus. The model is specialized for RAG (retrieval-augmented generation) and tool use, which makes it a good fit for agents. Command R is aimed at large-scale production workloads and is positioned as a way to move companies beyond proof of concept and into production. Its advertised strengths are strong accuracy in RAG and tool use, low latency, high throughput, support for 10 key languages, and competitive pricing.

In benchmark comparisons with other models such as Mixtral, Command R comes out ahead, particularly in real-world use cases like document assistance, customer support bots, and workplace assistance. It also leads in tool-use scenarios. The model performs exceptionally well in the “needle in the haystack” test for long-context retrieval, showing almost perfect accuracy. Command R’s pricing is competitive, at $1 per million input tokens and $2 per million output tokens.
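To make the pricing concrete, here is a minimal sketch of the arithmetic, assuming the rates quoted above ($1 per million input tokens, $2 per million output tokens). The function name and the workload figures are hypothetical and chosen only for illustration; check the provider's current price list before relying on any of these numbers.

```python
# Rates as quoted in this post (USD per one million tokens).
# These are assumptions taken from the summary, not verified prices.
INPUT_USD_PER_MILLION = 1.00
OUTPUT_USD_PER_MILLION = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for a given number of input and output tokens."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical RAG workload: 10,000 requests, each with
    # 4,000 prompt tokens (question plus retrieved documents)
    # and 500 generated tokens.
    requests = 10_000
    total = estimate_cost_usd(requests * 4_000, requests * 500)
    print(f"Estimated cost: ${total:.2f}")
```

Because RAG prompts carry the retrieved documents, input tokens usually dominate the bill, so the input rate tends to matter more than the output rate for this kind of workload.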

The review then moves on to testing Command R Plus on Cohere Coral, a platform offering chat, web search, and document functionality. The tests include writing Python scripts, solving logic and reasoning problems, and generating JSON structures. Command R Plus does well on some tasks, such as giving detailed responses to web search queries and producing accurate JSON structures, but it struggles with logic and reasoning problems and with mathematical calculations.

The evaluation shows both successes and failures for Command R Plus. The model is strong at web search and document retrieval, but it falls short on complex reasoning and accurate mathematical computation. Despite these limitations, Command R Plus is judged suitable for enterprise use, since web search and document retrieval cover the majority of use cases encountered in an enterprise environment.