The YC Paper Club’s inaugural session brought together AI founders and researchers to discuss cutting-edge papers on topics including speculative decoding for faster language model inference, diffusion model predictive control in robotics, and world models for environment dynamics prediction. The event also covered theoretical insights into deep learning generalization and data-efficient scaling laws, fostering a collaborative community focused on advancing AI research and innovation.
The YC Paper Club held its inaugural session with an enthusiastic group of accomplished founders and researchers, with the aim of fostering a community around AI research and innovation. The event highlighted the opportunity to use the YC environment, especially in Palo Alto, to connect AI talent across the Bay Area. The club’s mission is to create a collaborative space where cutting-edge research papers are presented and discussed. The first session featured five selected papers, on topics ranging from inference algorithms to world models and scaling laws in deep learning.
The first presentation, by Tanishk from Stanford, focused on speculative speculative decoding (SSD), an inference technique designed to speed up token generation in large language models. He explained that in traditional speculative decoding the drafting and verification steps run sequentially, so each waits on the other and together they bottleneck inference speed; SSD parallelizes them. By predicting verification outcomes ahead of time and drafting multiple candidate token sequences in parallel, SSD generates tokens significantly faster. He argued that this makes inference speed not just a cost or convenience factor but a core capability that can enhance model intelligence.
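For readers new to the baseline, the sequential draft-then-verify loop of traditional speculative decoding can be sketched as follows. This is a minimal greedy illustration, not the SSD algorithm from the talk: `target_model` and `draft_model` are hypothetical toy functions standing in for a large and a small language model, and verification is written as a loop where a real system would use one batched forward pass.

```python
from typing import Callable, List

# A "model" here is any function mapping a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def target_model(prefix: List[int]) -> int:
    """Hypothetical stand-in for the large model: Fibonacci-like rule mod 10."""
    if len(prefix) < 2:
        return 1
    return (prefix[-1] + prefix[-2]) % 10


def draft_model(prefix: List[int]) -> int:
    """Hypothetical cheap model: agrees with the target except after a 7."""
    if len(prefix) < 2:
        return 1
    if prefix[-1] == 7:
        return 0  # deliberate disagreement, so some drafts get rejected
    return (prefix[-1] + prefix[-2]) % 10


def greedy_decode(target: Model, prompt: List[int], num_new: int) -> List[int]:
    """Plain autoregressive decoding: one target call per generated token."""
    tokens = list(prompt)
    for _ in range(num_new):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       num_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: draft k tokens, then verify them in order."""
    tokens = list(prompt)
    goal = len(prompt) + num_new
    while len(tokens) < goal:
        # Step 1 (drafting): the cheap model proposes k tokens one by one.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # Step 2 (verification): the target checks each drafted position.
        accepted: List[int] = []
        for i in range(k):
            expected = target(tokens + proposal[:i])
            if proposal[i] == expected:
                accepted.append(proposal[i])
            else:
                accepted.append(expected)  # replace the first wrong token
                break
        else:
            # Every draft token matched, so the target also yields a bonus token.
            accepted.append(target(tokens + proposal))
        tokens.extend(accepted)
    return tokens[:goal]
```

In this greedy variant the output is identical to decoding with the target alone; the speedup comes from accepting several tokens per verification pass. Step 2 cannot start until step 1 finishes, which is the sequential dependency the talk described SSD as removing.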
Next, Stannis from Google DeepMind discussed diffusion model predictive control (DMPC) for robotics, which leverages diffusion models to improve multi-step action proposals and dynamics modeling. This approach reduces compounding errors common in model predictive control and simplifies planning algorithms, enabling better adaptation to new reward functions and dynamics at runtime. The talk also positioned DMPC within a broader landscape of diffusion-based agents, highlighting its advantages in flexibility and performance on robotic control tasks.
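The planning loop described above can be sketched as sampling-based model predictive control. This is a minimal illustration under loud assumptions, not the DMPC implementation: a uniform sampler stands in for the diffusion action proposal, a 1-D integrator stands in for the learned multi-step dynamics model, and every function name is hypothetical.

```python
import random
from typing import Callable, List


def sample_action_sequences(rng: random.Random, num_samples: int,
                            horizon: int) -> List[List[float]]:
    """Stand-in for the action proposal: whole multi-step sequences at once."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(horizon)]
            for _ in range(num_samples)]


def rollout(state: float, actions: List[float]) -> List[float]:
    """Stand-in for the dynamics model: a 1-D integrator, x <- x + a."""
    trajectory = []
    x = state
    for a in actions:
        x = x + a
        trajectory.append(x)
    return trajectory


def plan(state: float, reward_fn: Callable[[float], float], rng: random.Random,
         num_samples: int = 256, horizon: int = 3) -> float:
    """Sample, score, rank: return the first action of the best sequence."""
    candidates = sample_action_sequences(rng, num_samples, horizon)
    best = max(candidates,
               key=lambda acts: sum(reward_fn(x) for x in rollout(state, acts)))
    return best[0]


def run_episode(reward_fn: Callable[[float], float], steps: int = 30,
                seed: int = 0) -> float:
    """Receding-horizon control: replan at every step, execute one action."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(steps):
        x = x + plan(x, reward_fn, rng)
    return x
```

Because the reward function enters only at the scoring step, it can be swapped at run time without retraining the proposal or the dynamics stand-ins, for example `run_episode(lambda x: -(x - 3.0) ** 2)` versus `run_episode(lambda x: -(x + 2.0) ** 2)`.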
Isaac Ward then presented on world models, emphasizing their role in learning and predicting the dynamics of environments to enable model-based control. He traced the concept back to foundational work from the 1990s and explained how modern world models encode high-dimensional observations into a latent space and predict future states there, which is cheaper than predicting raw observations. A key contribution discussed was the SIGG regularizer, which keeps the latent embeddings from collapsing to a degenerate representation during training. World models offer capabilities such as open-loop prediction, adaptability to novel dynamics, and uncertainty quantification, which are crucial for robust real-world AI applications.
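The idea of open-loop prediction in latent space can be made concrete with a toy sketch. Everything here is an assumption for illustration: the encoder and latent dynamics are hand-written rather than learned, the environment is a hypothetical 16-pixel track, and no regularizer is involved.

```python
from typing import List

WIDTH = 16  # number of pixels in the toy observation


def render(position: int) -> List[int]:
    """Environment observation: a one-hot 'image' marking the agent's pixel."""
    return [1 if i == position else 0 for i in range(WIDTH)]


def env_step(position: int, action: int) -> int:
    """True environment dynamics: move along the track, clipped at the walls."""
    return min(WIDTH - 1, max(0, position + action))


def encode(observation: List[int]) -> float:
    """Stand-in for a learned encoder: 16 pixels compressed to one latent value."""
    return float(sum(i * v for i, v in enumerate(observation)))


def latent_dynamics(z: float, action: int) -> float:
    """Stand-in for a learned latent transition model."""
    return min(float(WIDTH - 1), max(0.0, z + action))


def open_loop_predict(first_observation: List[int],
                      actions: List[int]) -> List[float]:
    """Encode once, then roll forward in latent space with no new observations."""
    z = encode(first_observation)
    latents = []
    for a in actions:
        z = latent_dynamics(z, a)
        latents.append(z)
    return latents
```

The rollout never touches pixels after the first frame: each prediction step works on a single latent number instead of a 16-element observation, which is the efficiency argument for latent-space models in miniature.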
The final talks addressed broader theoretical and practical challenges in deep learning. Ashe from QABs presented Andrew Gordon Wilson’s work on demystifying deep learning generalization through classical PAC-Bayes theory, explaining phenomena like overparameterization and benign overfitting with established statistical frameworks. Ku then discussed scaling laws under data constraints, showing that aggressive regularization, ensembling, and distillation can significantly improve data efficiency when compute is abundant but data is limited. These insights suggest new directions for optimizing pre-training strategies to achieve better generalization and efficiency in large-scale AI models. The session concluded with an invitation to continue building the YC Paper Club community and enjoy some boba tea.
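To make the generalization argument concrete, one standard textbook bound of this family is the countable-hypothesis (Occam) bound: for a loss in [0, 1] and a prior P fixed before seeing the data, with probability at least 1 - δ every hypothesis h satisfies risk(h) ≤ train_risk(h) + sqrt((ln(1/P(h)) + ln(1/δ)) / (2n)). The sketch below evaluates it under the assumption that the prior assigns mass 2^(-bits) to a hypothesis describable in `bits` bits. It is not necessarily the exact statement used in the talk, and the function name is hypothetical.

```python
import math


def occam_bound(train_error: float, description_bits: float, n: int,
                delta: float = 0.05) -> float:
    """Upper bound on test error for a hypothesis with prior mass 2**(-bits).

    Assumes a loss bounded in [0, 1], n i.i.d. training examples, and a prior
    chosen before seeing the data. Holds with probability at least 1 - delta.
    """
    log_inv_prior = description_bits * math.log(2.0)  # ln(1 / P(h))
    complexity = (log_inv_prior + math.log(1.0 / delta)) / (2.0 * n)
    return train_error + math.sqrt(complexity)
```

The parameter count never appears in the bound. A heavily overparameterized network that fits the training data and still compresses to a short description gets a tight guarantee, while a model of the same size that needs a long description does not, which is consistent with the point that classical frameworks can account for overparameterization.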