Stanford Seminar - Learning to control large teams of robots

The seminar presented two complementary approaches for controlling large teams of robots: a physics-informed, distributed learning framework using self-attention for scalable low-level control, and generative AI techniques combined with reactive collision avoidance for high-level planning of complex multi-robot formations. These methods demonstrate scalability, adaptability, and robustness across various tasks and robot types, paving the way for practical applications like autonomous navigation and coordinated drone shows.

The talk focused on two main approaches to learning control for large robot teams: low-level distributed control policies and high-level planning for multi-robot systems. The speaker began by highlighting various projects from their lab, including autonomous drone cinematography, distributed tracking and classification, active perception with neural networks, and optimal collision avoidance in dynamic environments. These projects set the stage for the core research questions: how to model complex multi-agent interactions, and how to control large robot teams efficiently and scalably.

The first major research direction discussed was the development of scalable, distributed multi-robot control policies using a physics-informed learning framework. The approach leverages port-Hamiltonian systems to model the physical dynamics and energy of multi-agent systems, ensuring explainability and stability. To handle varying numbers of neighbors and local sensing, the team employed self-attention mechanisms inspired by large language models, enabling each robot to process information from a dynamic set of neighbors. The learning framework focuses on closed-loop dynamics, allowing the system to learn from demonstrations without requiring explicit knowledge of individual robot actions.
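The key property self-attention provides here is a permutation-invariant, fixed-size summary of a neighbor set whose size varies from robot to robot. The following is a minimal sketch of that aggregation step in plain NumPy; the weight matrices, dimensions, and function names are illustrative assumptions, not the speaker's actual architecture, and the port-Hamiltonian dynamics layer that would consume this feature is omitted.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over attention scores.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_aggregate(own_state, neighbor_states, Wq, Wk, Wv):
    """Permutation-invariant aggregation of a variable-size neighbor set.

    own_state: (d,); neighbor_states: (n, d), where n varies per robot.
    Returns a fixed-size (d,) feature regardless of n.
    """
    q = Wq @ own_state                  # query from the robot's own state
    K = neighbor_states @ Wk.T          # one key per neighbor
    V = neighbor_states @ Wv.T          # one value per neighbor
    scores = K @ q / np.sqrt(len(q))    # scaled dot-product attention scores
    return softmax(scores) @ V          # weighted sum -> fixed-size feature

rng = np.random.default_rng(0)
d = 4
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
own = rng.standard_normal(d)
# The same parameters handle any neighbor count (3 here, 7 there),
# which is what lets a policy trained on small teams scale up.
feat3 = attention_aggregate(own, rng.standard_normal((3, d)), Wq, Wk, Wv)
feat7 = attention_aggregate(own, rng.standard_normal((7, d)), Wq, Wk, Wv)
```

Because the output size is independent of the neighbor count, the downstream control network needs no retraining when the team grows.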

The researchers demonstrated their approach through imitation learning and reinforcement learning experiments, showing that policies trained on small numbers of agents could generalize to much larger teams. They tested tasks including navigation, flocking, coverage, and competitive scenarios, achieving promising results despite some limitations, such as agents occasionally converging to a degenerate do-nothing behavior in coverage tasks. The framework also proved robot-agnostic, enabling zero-shot transfer from simulated holonomic vehicles to real non-holonomic robots, highlighting its versatility and robustness.
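One common way to run a policy trained on holonomic vehicles on a non-holonomic platform is to track the commanded velocity at a point slightly ahead of the wheel axle (off-center-point feedback linearization). The sketch below illustrates that idea; the talk did not specify the exact mapping used, so treat the function, its `offset` parameter, and the naming as assumptions.

```python
import numpy as np

def holonomic_to_unicycle(vx, vy, theta, offset=0.1):
    """Map a holonomic velocity command (vx, vy) to unicycle controls (v, omega).

    Tracks the command at a point `offset` metres ahead of the axle, a
    standard trick for driving a non-holonomic robot with a holonomic policy.
    This is an illustrative stand-in, not the speaker's actual method.
    """
    c, s = np.cos(theta), np.sin(theta)
    v = c * vx + s * vy                  # forward component of the command
    omega = (-s * vx + c * vy) / offset  # turn to absorb the lateral component
    return v, omega

# Robot facing +x commanded to drive straight ahead: pure forward motion.
v, w = holonomic_to_unicycle(1.0, 0.0, 0.0)
```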

The second research direction addressed high-level planning for large robot swarms, motivated by applications like drone light shows involving thousands of drones forming complex shapes. The team adapted generative AI techniques, specifically diffusion models and conditional flow matching, to generate smooth, collision-free trajectories for large teams. By treating robots as points in 3D space analogous to pixels in images, they trained models to produce desired formations. To ensure feasibility and safety, they combined these generative models with reactive collision avoidance algorithms like ORCA, resulting in smooth trajectories that respect robot dynamics and avoid collisions.
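Conditional flow matching trains a network to regress the velocity of a simple interpolating path from a noise sample toward a target sample; at deployment, integrating that learned field carries each robot from its start to a formation point. The sketch below shows the pipeline's shape only, under loud assumptions: the circle formation, the straight-line flow integrated in place of a trained network, and the pairwise-repulsion step (a crude stand-in for ORCA, which properly reasons about velocity obstacles) are all illustrative.

```python
import numpy as np

def formation_circle(n, r=1.0):
    """Target formation: n robots evenly spaced on a circle (stand-in shape)."""
    a = np.linspace(0, 2 * np.pi, n, endpoint=False)
    return np.stack([r * np.cos(a), r * np.sin(a)], axis=1)

def flow_match_trajectories(x0, x1, steps=20):
    """Euler-integrate the linear conditional flow x_t = (1 - t) x0 + t x1.

    In conditional flow matching, a network v_theta(x, t) is trained to
    regress the target velocity u = x1 - x0 along these interpolants;
    here we integrate the known per-pair velocity directly as a stand-in.
    """
    traj, x, dt = [x0], x0.copy(), 1.0 / steps
    for _ in range(steps):
        x = x + (x1 - x0) * dt      # v(x_t, t) = x1 - x0 on the linear path
        traj.append(x.copy())
    return np.stack(traj)           # shape (steps + 1, n, 2)

def separate(positions, d_min=0.2, gain=0.5):
    """Push apart any robots closer than d_min (simplified stand-in for ORCA)."""
    push = np.zeros_like(positions)
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            diff = positions[i] - positions[j]
            dist = np.linalg.norm(diff)
            if 0 < dist < d_min:
                corr = gain * (d_min - dist) * diff / dist
                push[i] += corr
                push[j] -= corr
    return positions + push

n = 8
rng = np.random.default_rng(1)
x0 = rng.standard_normal((n, 2))    # random start positions
x1 = formation_circle(n)            # generated target formation
traj = flow_match_trajectories(x0, x1)
safe_final = separate(traj[-1])     # reactive layer leaves safe spacings alone
```

The design point the talk emphasized survives even in this toy version: the generative model proposes the global shape, while a reactive layer enforces local safety.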

In conclusion, the seminar showcased two complementary approaches to controlling large robot teams: a physics-informed, distributed learning framework for low-level control and a generative AI-based method for high-level planning. Both approaches emphasize scalability, explainability, and adaptability to different robot types and team sizes. The speaker acknowledged current limitations and ongoing research directions, including handling heterogeneous robot teams and multi-task conditioning. The work opens promising avenues for practical applications in robotics, from autonomous navigation to coordinated drone performances, and invites further collaboration and exploration.