This AI is a little bit *TOO* good

Matt Shumer, CEO of HyperWrite AI, faced backlash after claiming that his new open-source model, Reflection 70B, outperformed competitors like GPT-4. Users who tested the model found its performance lacking and raised concerns about his undisclosed investment in Glaive AI. As skepticism grew, Shumer’s attempts to clarify the situation only fueled further doubt, leading to accusations of deception and a significant erosion of trust within the AI community.

The announcement of Reflection 70B, a new open-source model from Matt Shumer, the CEO and founder of HyperWrite AI, has sparked significant debate within the AI community. Shumer claimed that the model outperformed major competitors like GPT-4 and Llama 3.1, attributing its success to a novel technique called reflection tuning. Skepticism arose once the community began testing the model and found that it did not perform as advertised, which called into question the validity of the benchmarks Shumer presented.

The initial excitement around Reflection 70B quickly turned to doubt as users reported poor performance when running the model on established benchmarks. Shumer’s position was further complicated by his failure to disclose his investment in Glaive AI, a company that played a role in the model’s development. Critics argued that this lack of transparency raised concerns about potential conflicts of interest and the integrity of the results being shared. Although Shumer previously had a good reputation in the AI community, his claims began to unravel as more people scrutinized them.

As the controversy deepened, Shumer stated that the model uploaded to Hugging Face was not the correct version and suggested that a mix-up had occurred during the upload process. He offered a private API key so that users could test the model, claiming it would give a more accurate picture of its capabilities. This only fueled further skepticism: many users found that the private API performed significantly better than the public version, which led to accusations of deception.

The community’s investigation indicated that the Reflection 70B model appeared to be based on Llama 3, and some suggested that it was not the innovative model Shumer claimed it to be. There were also allegations that the private API was actually using Anthropic’s Claude model, raising further questions about the authenticity of Shumer’s claims. As the situation escalated, both Shumer and his collaborator from Glaive AI issued apologies, acknowledging the confusion and promising to investigate the discrepancies in performance and benchmarks.

Ultimately, the fallout has left many in the AI community feeling betrayed and skeptical of Shumer’s integrity. The controversy highlights the difficulty of accountability in the rapidly evolving AI landscape, where sensational claims can generate significant hype and attention. As the investigation continues, the trust that Shumer built over time appears to be eroding, leaving many to wonder about the true capabilities of Reflection 70B and the motivations behind its promotion.