Defunct companies are increasingly selling their internal digital work data, such as emails, Slack messages, and project records, to AI firms as model training material. Startups like SimpleClosure and Sunset broker these sales, helping founders monetize their archives while attempting to address privacy concerns. This emerging market is driven by AI’s need for real-world workplace data beyond publicly available sources, and it raises significant ethical and legal questions about employee privacy and data use.
The video discusses a new trend in the AI industry: defunct companies selling their accumulated digital work data, including Slack messages, emails, code, and project management tickets, as training material for AI models. Shana Johnson, former CEO of the transcription company cielo24, discovered while winding down operations that the company’s extensive digital footprint was a valuable asset. Selling this data through a startup called SimpleClosure generated significant revenue and helped her settle the company’s affairs more smoothly.
This practice is becoming increasingly common because, by late 2024, AI labs had exhausted publicly available internet data sources such as Reddit, Wikipedia, and digitized books. According to experts, these sources are insufficient for training AI models that need to perform real-world workplace tasks. The detailed records of daily work left behind by defunct companies fill that gap, providing rich, practical examples that help build more competent AI agents capable of handling complex tasks in professional environments.
SimpleClosure, which assists companies in shutting down, has capitalized on this demand by launching Asset Hub, a platform where companies can sell their internal data archives. To address privacy concerns, the company removes personally identifiable information before selling the data to AI firms. Over the past year, SimpleClosure has facilitated nearly 100 deals and recovered over $1 million for founders, with prices ranging from $10,000 to $100,000 depending on the company’s size and data richness.
Another player in this market, Sunset, operates similarly, buying defunct company data and pricing it based on factors like company age, size, and the interconnectedness of the data. Certain industries, such as healthcare and finance, command higher prices due to the complexity and value of their data. The market for this kind of workplace data is described as a “gold rush,” reflecting the intense competition among AI companies to acquire real-world datasets to improve their models.
However, this emerging business raises significant privacy concerns. Experts like Marc Rotenberg of the Center for AI and Digital Policy warn that even if employees have signed over intellectual property rights, they may not expect their internal communications to be sold and repurposed for AI training. The ethical and legal implications of using such data remain contentious, highlighting the need for careful regulation and transparency as this new frontier in AI development evolves.