Robotics & World Models Reading Club 07: Learning to Dream: World Models, Imagination, Path to Foundation Models for Control — San Francisco

Hosted by Junfan Zhu, Aurora Feng & Ahmet Asarkaya
Register to See Address
San Francisco, California
Registration
Approval Required
Your registration is subject to host approval.
About Event

​A high-signal reading group for AI researchers & builders pushing the frontiers of robotic world models, WAMs, and embodied intelligence. In our previous sessions, we brought together researchers and engineers from Boston Dynamics, Google, NVIDIA, Stanford, UC Berkeley, CMU, Dyna, ByteDance, Tesla, and leading Bay Area robotics startups.

​​Hosted by Junfan Zhu & Aurora Feng.

​​​Supported by Neural Motion, a universal cross-embodiment data representation layer for embodied AI.

Venue provided by Savant, a community for technologists tackling civilizational-scale tech, helping founders navigate the -1 to 0 phase with hardware labs, office hours, and an elite peer network.

​Reading Club 07's Core Theme

From PlaNet to DreamerV4: Building Foundation Models for Control through Dreaming

Recent advances in reinforcement learning point to a shift from learning only through direct environment interaction toward learning through imagination. The Dreamer family embodies this idea by learning a latent world model of the environment and training policies inside that learned simulator.

This talk follows the evolution of the approach from PlaNet, which introduced planning in latent space, to DreamerV1–V3, which replaced search-based planning with actor-critic learning in imagination and scaled the method across diverse domains. It then dives into DreamerV4, which reframes world models as large-scale generative models trained from offline video and action data, combining diffusion transformers with behavior cloning and reinforcement learning.

Beyond the technical details, the talk aims to build an intuitive mental model of learning by dreaming and discuss its implications for robotics and embodied intelligence. We will also ask whether world models are becoming foundation models for control, and examine the remaining challenges of long-horizon consistency, memory, and grounding.


​Pre-Readings

  • 🌍 Foundations of World Models

Ha & Schmidhuber — World Models (2018)

https://arxiv.org/abs/1803.10122

Introduced the core idea of learning a compressed latent simulator for control

Early framing of “agents that dream inside learned environments”

👉 This is where the “dreaming” metaphor for RL/world models first became explicit

  • 🧭 PlaNet — Latent Planning Begins

Learning Latent Dynamics for Planning from Pixels (PlaNet, 2018)

https://arxiv.org/abs/1811.04551

First strong demonstration of planning directly in latent space (RSSM)

Uses CEM + MPC instead of learned policies

👉 Learn a world model, then search inside it
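
For anyone who wants a feel for "search inside the model" before the session, here is a minimal sketch of CEM planning inside an MPC loop. It is not PlaNet's implementation: `toy_model` is a hypothetical 1-D stand-in for a learned latent dynamics model, and the goal position, horizon, and population sizes are made-up illustration values.

```python
import random
import statistics


def toy_model(state, action):
    """Hypothetical stand-in for a learned model: returns (next_state, reward)."""
    next_state = state + action
    reward = -abs(next_state - 5.0)  # closer to the assumed goal at 5.0 is better
    return next_state, reward


def rollout_return(state, actions):
    """Sum of predicted rewards when playing `actions` inside the model."""
    total = 0.0
    for action in actions:
        state, reward = toy_model(state, action)
        total += reward
    return total


def cem_plan(state, horizon=5, candidates=200, elites=20, iters=5, rng=None):
    """Cross-entropy method: refit a Gaussian over action sequences to the best samples."""
    rng = rng or random.Random(0)
    means = [0.0] * horizon
    stds = [1.0] * horizon
    for _ in range(iters):
        # Sample candidate action sequences, clipping each action to [-1, 1].
        samples = [
            [max(-1.0, min(1.0, rng.gauss(m, s))) for m, s in zip(means, stds)]
            for _ in range(candidates)
        ]
        # Keep the sequences with the highest predicted return.
        samples.sort(key=lambda seq: rollout_return(state, seq), reverse=True)
        top = samples[:elites]
        # Refit the per-timestep mean and standard deviation to the elites.
        means = [statistics.fmean([seq[t] for seq in top]) for t in range(horizon)]
        stds = [statistics.pstdev([seq[t] for seq in top]) + 1e-3 for t in range(horizon)]
    return means


# MPC loop: replan at every step and execute only the first planned action.
state = 0.0
rng = random.Random(0)
for _ in range(8):
    plan = cem_plan(state, rng=rng)
    state, _ = toy_model(state, plan[0])
```

In this toy setup the model and the "real" environment are the same function; in PlaNet-style planning the rollouts would instead run through the learned latent model while actions are executed in the real environment.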

  • 🧠 DreamerV1 — From Planning to Learning

Dream to Control (2019)

https://arxiv.org/abs/1912.01603

Replaces search/planning with actor-critic trained entirely in imagination

Keeps PlaNet-style RSSM world model, but changes how behavior is learned

👉 Agents no longer plan—they learn inside dreams
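
As a warm-up for the actor-critic discussion, here is a small sketch of the λ-return targets that a critic can be trained on over an imagined rollout. It uses the recursive form and assumes a particular indexing convention (`rewards[t]` for the step leaving state t, `values[t]` for the critic's estimate at state t); the papers differ in such details, so treat the function name and defaults as illustrative.

```python
def lambda_returns(rewards, values, gamma=0.99, lam=0.95):
    """Compute lambda-returns over an imagined trajectory of length H.

    rewards: H predicted rewards, one per imagined step.
    values:  H + 1 critic estimates, one per imagined state (last one bootstraps).
    """
    horizon = len(rewards)
    if len(values) != horizon + 1:
        raise ValueError("values must have one more entry than rewards")
    returns = [0.0] * horizon
    next_return = values[-1]  # bootstrap from the critic at the final imagined state
    for t in reversed(range(horizon)):
        # Mix the one-step bootstrap with the longer return, weighted by lam.
        next_return = rewards[t] + gamma * (
            (1.0 - lam) * values[t + 1] + lam * next_return
        )
        returns[t] = next_return
    return returns
```

Setting `lam=0` reduces this to one-step bootstrapped targets, while `lam=1` gives the discounted sum of imagined rewards plus a bootstrap at the horizon.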

  • ⚙️ DreamerV2 — Stabilizing the Dream

Mastering Atari with Discrete World Models (2020)

https://arxiv.org/abs/2010.02193

Introduces discrete latent states (categorical RSSM)

Adds KL balancing + entropy regularization for stability

👉 Makes imagination-based RL work at scale (Atari)

  • 🚀 DreamerV3 — Scaling the Recipe

Mastering Diverse Domains through World Models (2023)

https://arxiv.org/abs/2301.04104

Unified training recipe across many domains with fixed hyperparameters

Symlog/symexp targets improve stability across reward scales

👉 World models become general-purpose learners across domains
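
To make the symlog/symexp idea concrete, here is a minimal scalar sketch: symlog(x) = sign(x) · ln(|x| + 1) and symexp is its inverse. The sample reward values are made up for illustration, and the real method applies these transforms to network targets rather than to a plain Python list.

```python
import math


def symlog(x: float) -> float:
    """Compress large magnitudes while keeping the sign: sign(x) * ln(|x| + 1)."""
    return math.copysign(math.log1p(abs(x)), x)


def symexp(x: float) -> float:
    """Inverse of symlog: sign(x) * (exp(|x|) - 1)."""
    return math.copysign(math.expm1(abs(x)), x)


# Made-up rewards spanning several orders of magnitude.
rewards = [-1000.0, -1.0, 0.0, 0.5, 1000.0]
squashed = [symlog(r) for r in rewards]    # targets a network would regress onto
recovered = [symexp(s) for s in squashed]  # decoded back to the original scale
```

Because the transform is roughly linear near zero and logarithmic for large values, small and large rewards end up on a comparable scale without per-domain tuning.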

  • 🧱 DreamerV4 — World Models as Foundation Models

Training Agents Inside of Scalable World Models (2025)

https://arxiv.org/abs/2509.24527

Replaces RSSM with diffusion/transformer-based world models

Combines offline pretraining, behavior cloning, and RL (PMPO)

👉 Shift from “RL world model” → “generative foundation model for control”

  • 🔧 Optional Context (Same Research Lineage)

DayDreamer (2022)

https://arxiv.org/abs/2206.14176

Real-world robot learning using Dreamer-style latent imagination

Demonstrates that Dreamer can learn online directly on physical robots, without a simulator

Director (2022)

https://arxiv.org/abs/2206.04114

Hierarchical latent planning for long-horizon decision making

Extends Dreamer-style imagination into structured subgoal reasoning

Dynalang (2023)

https://arxiv.org/abs/2308.01399

Adds language conditioning to world models

Bridges perception, language, and action in a unified latent space.


​Location

​​​San Francisco (Downtown)

​​Date & Time

​​​Saturday, May 9, 2026 | 2:00 PM – 5:00 PM

​​​Join Discord Community

​​​https://discord.gg/WH7DrTHRXK

​​Follow Saturday Robotics on X

​​https://x.com/saturdayrobotic


​​Agenda

2:00 PM – 2:30 PM Doors Open & Social

  • Food 😋, beverages 🧋, and UNLIMITED strawberries 🍓 (our official reading club fruit ☺️😄).

2:30 PM – 3:00 PM Keynote by Ahmet Şemi ASARKAYA (Agility Robotics)

​​​Online access via Zoom: TBD

​YouTube Recording: TBD (We are looking for recording volunteers)

3:00 PM – 5:00 PM Q&A and open-floor roundtable (10–20 min per topic) on spotlight papers or any paper you’d like to highlight. Feel free to share why the paper matters and walk through its technical details.


​Past events

#⁠reading-club-03-0411

#⁠reading-club-02-0404

​#⁠reading-club-01-0328

​​Logistics

Spots are limited. Please arrive by 2:00 PM for check-in. Keynote will begin promptly at 2:30 PM.

  • ​​We currently do not have volunteers available to assist with late check-ins. Given the high volume of inquiries and 100+ attendees (both online and onsite), we kindly ask that you arrive on time to ensure smooth entry.
