Generalist, a Silicon Valley startup, is developing AI-driven robot “brains” trained on large-scale data collected with wearable “data hands.” The goal is robots that can improvise and perform complex, high-dexterity tasks such as folding laundry and packing items with human-like adaptability. Backed by major investors and inspired by the impact of models like GPT-3, Generalist aims to overcome the data scarcity that has held robotics back and bring about a “ChatGPT moment” for the field.
Generalist, a Silicon Valley startup, is pioneering a new approach to robotics by developing robot “brains” that let robots improvise and handle complex tasks with dexterity. In a recent demonstration, a pair of robotic arms was tasked with placing plush toys into plastic bags. When a toy got stuck halfway, the robot used its other arm to shake the bag and complete the task, a behavior it had not been explicitly programmed to perform. The company sees this kind of improvisation as a sign of a potential breakthrough in robotics, akin to the impact ChatGPT had on natural language processing.
Founded by Pete Florence, Andy Zeng, and Andy Barry, Generalist has raised $140 million at a $440 million valuation, backed by prominent investors like Spark Capital, Nvidia Ventures, and Bezos Expeditions. The company is launching a new model called Gen 1, designed to help off-the-shelf robots perform a wider range of high-dexterity tasks such as folding laundry and packing diverse items. Florence compares this moment in robotics to the release of GPT-3, emphasizing the importance of building large models trained on vast data to enable improvisational intelligence in robots.
Despite the excitement around humanoid robots performing impressive feats, most real-world robots still struggle with unpredictable, everyday tasks without human supervision. Generalist’s approach resembles that of competitors such as Physical Intelligence, which also pair off-the-shelf hardware with transformer-based AI models. A major challenge in robotics, however, remains the scarcity of training data: there is no comprehensive dataset for physical tasks equivalent to what the internet provides for language. Teleoperation, in which humans remotely control robots to generate data, is a common but resource-intensive solution.
Generalist’s answer to the data bottleneck is its “data hands” technology: wearable devices that turn human hands into robotic pincers and capture visual and sensory data during real-world manipulation tasks. This method allows data collection to scale across diverse environments such as homes and warehouses. The company has amassed over half a million hours of training data, which it says enables its models to generalize across tasks rather than simply memorize them, resulting in robots that can perform tasks like folding boxes nearly as fast as humans.
While the hardware remains basic, with simple pincer-style grippers lacking the finesse of human hands, the progress in AI-driven control is promising. Investors such as Fraser Kelton of Spark Capital, who has experience with OpenAI’s GPT models, note that early language models were underestimated until scaling led to significant improvements in generalization. Generalist’s work suggests that a similar scaling approach in robotics could usher in a “ChatGPT moment,” fundamentally changing how robots learn and operate in the real world.