Anthropic’s Project Vend, computer science education and AI prompts in papers

In this episode of Mixture of Experts, the panel discusses Anthropic’s Project Vend and what it shows about the need for human oversight in AI deployment, explores the potential and challenges of distributed model training, and addresses the evolving landscape of computer science education amid AI advancements. They also examine ethical concerns around AI use in academic peer review, emphasizing the importance of combining AI capabilities with human creativity, critical thinking, and responsible practices.

In this episode of Mixture of Experts, host Tim Hwang is joined by recurring guests Gabe Goodhart, Marina Danilevsky, and Kush Varshney to discuss several pressing topics in artificial intelligence. They begin by exploring Anthropic’s Project Vend, an experiment in which an AI agent named Claudius was tasked with managing a small office vending machine business. The experiment revealed that while the AI could perform some tasks, it made routine mistakes such as managing inventory poorly and hallucinating payment accounts, and it ultimately lost money. The panel agrees that while AI agents show promise, successful deployment requires significant scaffolding and guardrails, blending AI creativity with structured human oversight.

The conversation then shifts to a recent paper from China Mobile’s Zero Gravity Labs on distributed model training, specifically training a 107-billion-parameter model over a limited-bandwidth network. Gabe highlights the societal potential of distributed training, imagining a future where individuals could contribute spare computing power from personal devices to collaboratively train large models, democratizing AI development. Marina adds that this approach could also open new avenues for experimenting with data mixes and privacy, while Kush likens it to collaborative puzzle-solving, emphasizing the importance of efficient communication between distributed parts. The panel notes that despite its promise, distributed training research is under-supported compared to traditional centralized approaches.

Next, the group tackles the challenges of computer science education in the AI era, prompted by a New York Times article discussing a tightening tech job market and the rise of AI code generation tools. Marina stresses that computer science is much more than AI or coding; foundations such as data structures, algorithms, and critical thinking remain essential. Gabe concurs, emphasizing that while AI can automate routine coding tasks, human creativity and system design skills will remain vital. Kush raises concerns about the disconnect between educational goals and industry hiring practices, which often prioritize superficial proxies like GitHub activity, complicating the path for new graduates.

The episode concludes with a discussion of the ethical implications of AI in academic peer review. A recent discovery revealed that some research papers contained hidden prompts designed to manipulate AI-based reviewers into giving positive evaluations. Kush shares his experience as a conference chair, noting the challenges of maintaining review quality and the temptation for reviewers to misuse AI tools despite policies against it. Marina points out that while AI can assist reviewers by summarizing and contextualizing papers, the peer review system itself has structural issues that contribute to these problems. Gabe envisions a future where AI aids reviewers by accelerating background research without replacing human judgment, improving both efficiency and quality.

Overall, the episode provides a nuanced exploration of AI’s evolving role across business automation, model training, education, and academic integrity. The panelists emphasize the importance of combining AI capabilities with human oversight, creativity, and ethical considerations. They highlight both the exciting possibilities and the challenges ahead, advocating for thoughtful integration of AI technologies rather than blind reliance. The discussion underscores that while AI is transforming many domains, foundational skills, critical thinking, and responsible practices remain indispensable in navigating this rapidly changing landscape.