36 Going

Robotics & World Models Reading Club 06: Evolution of Video World Models for Robotics — San Francisco

Hosted by Junfan Zhu, Aurora Feng & Tongzhou Mu
Approval Required
Your registration is subject to host approval.
About Event

A high-signal reading group for AI researchers and builders pushing the frontiers of robotic world models, World Action Models (WAMs), and embodied intelligence. In our previous sessions, we brought together researchers and engineers from Boston Dynamics, Google, NVIDIA, Stanford, UC Berkeley, CMU, Dyna, ByteDance, Tesla, and leading Bay Area robotics startups.

​​​Hosted by Junfan Zhu & Aurora Feng.

​​​​Supported by Neural Motion, a universal cross-embodiment data representation layer for embodied AI.


​​Reading Club 06's Core Theme

Evolution of Video World Models for Robotics

Tongzhou Mu (Rhoda AI)

Following the recent surge in world modeling for robotics, this session analyzes the two major frameworks for grounding video models in robot control: using them as learned simulators and integrating them as direct decision-making policies. We will discuss the trade-offs of these approaches, focusing on the challenge of grounding digital predictions in physical reality. Central to this discussion is a case study of the Direct Video-Action (DVA) model, which enables reliable robot control by recasting control as a real-time video generation problem.


​​Pre-Readings

  • Causal Video Models Are Data-Efficient Robot Policy Learners (2026)

  • DreamGen: Unlocking Generalization in Robot Learning through Video World Models (2025)

  • V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning (2025)

  • Evaluating Gemini Robotics Policies in a Veo World Simulator (2025)

  • DreamZero: World Action Models are Zero-shot Policies (2026)

  • Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations (2024)

  • Learning Universal Policies via Text-Guided Video Generation (2023)


​​Location

​​​​San Francisco (Downtown)

​​​Date & Time

​​​​Saturday, May 2, 2026 | 2:00 PM – 5:00 PM

​​​​Join Discord Community

​​​​https://discord.gg/WH7DrTHRXK

​​​Follow Saturday Robotics on X

​​​https://x.com/saturdayrobotic


​​Agenda

2:00 PM – 2:30 PM Doors Open & Social

  • Food 😋, beverages 🧋, and UNLIMITED strawberries 🍓 (our official reading club fruit ☺️😄).

​​​2:30 PM – 3:00 PM Keynote by Tongzhou Mu (Rhoda AI)

https://x.com/tongzhou_mu

​​​​Online access via Zoom: TBD

​​YouTube Recording: TBD (We are looking for recording volunteers)

3:00 PM – 5:00 PM Q&A and open-floor roundtable (10–20 min per topic) on spotlight papers or any paper you’d like to highlight. Feel free to share why the paper matters and its technical details.

Past events

​​#⁠reading-club-03-0411

#⁠reading-club-02-0404

​​#⁠reading-club-01-0328

​​​Logistics

Spots are limited. Please arrive by 2:00 PM for check-in. The keynote will begin promptly at 2:30 PM.

  • ​​​We currently do not have volunteers available to assist with late check-ins. Given the high volume of inquiries and 100+ attendees (both online and onsite), we kindly ask that you arrive on time to ensure smooth entry.

Location
Please register to see the exact location of this event.