

Robotics & World Models Reading Club 07: Learning to Dream: World Models, Imagination, Path to Foundation Models for Control — San Francisco
A high-signal reading group for AI researchers & builders pushing the frontiers of robotic world models, WAMs, and embodied intelligence. In our previous sessions, we brought together researchers and engineers from Boston Dynamics, Google, NVIDIA, Stanford, UC Berkeley, CMU, Dyna, ByteDance, Tesla, and leading Bay Area robotics startups.
Hosted by Junfan Zhu & Aurora Feng.
Supported by Neural Motion, a universal cross-embodiment data representation layer for embodied AI.
Venue provided by Savant, a community for technologists tackling civilizational-scale tech, helping founders navigate the -1 to 0 phase with hardware labs, office hours, and an elite peer network.
Reading Club 07's Core Theme
From PlaNet to DreamerV4: Building Foundation Models for Control through Dreaming
Recent advances in reinforcement learning point to a shift from learning only through direct environment interaction toward learning through imagination. The Dreamer family embodies this idea by learning a latent world model of the environment and training policies inside that learned simulator.
This talk follows the evolution of the approach from PlaNet, which introduced planning in latent space, to DreamerV1–V3, which replaced search-based planning with actor-critic learning in imagination and scaled the method across diverse domains. It then dives into DreamerV4, which reframes world models as large-scale generative models trained from offline video and action data, combining diffusion transformers with behavior cloning and reinforcement learning.
Beyond the technical details, the talk aims to build an intuitive mental model of learning by dreaming and discuss its implications for robotics and embodied intelligence. We will also ask whether world models are becoming foundation models for control, and examine the remaining challenges of long-horizon consistency, memory, and grounding.
Pre-Readings
🌍 Foundations of World Models
Ha & Schmidhuber — World Models (2018)
https://arxiv.org/abs/1803.10122
Introduced the core idea of learning a compressed latent simulator for control
Early framing of “agents that dream inside learned environments”
👉 This is where the “dreaming” metaphor for RL/world models first became explicit
🧭 PlaNet — Latent Planning Begins
PlaNet (2018)
https://arxiv.org/abs/1811.04551
First strong demonstration of planning directly in latent space (RSSM)
Uses the cross-entropy method (CEM) inside a model-predictive control (MPC) loop instead of a learned policy
👉 Learn a world model, then search inside it
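For attendees new to latent planning, the "search inside the model" idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not PlaNet's implementation: `toy_dynamics` and `toy_reward` are hypothetical stand-ins for a learned latent model (a 1-D point that moves by the action and is rewarded for being near a goal), and all hyperparameters are arbitrary.

```python
import random
import statistics


def toy_dynamics(state, action):
    """Hypothetical stand-in for a learned model: a 1-D point moved by the action."""
    return state + action


def toy_reward(state, goal=1.0):
    """Hypothetical reward: closer to the goal is better."""
    return -abs(state - goal)


def rollout_return(state, actions):
    """Sum of rewards when the action sequence is imagined inside the model."""
    total = 0.0
    for action in actions:
        state = toy_dynamics(state, action)
        total += toy_reward(state)
    return total


def cem_plan(state, horizon=5, candidates=64, elites=8, iterations=5, rng=None):
    """Cross-entropy method: sample action sequences, refit a Gaussian to the best ones."""
    rng = rng or random.Random(0)
    means = [0.0] * horizon
    stds = [1.0] * horizon
    for _ in range(iterations):
        samples = [
            [rng.gauss(m, s) for m, s in zip(means, stds)]
            for _ in range(candidates)
        ]
        # Rank candidates by imagined return and keep the top few.
        samples.sort(key=lambda seq: rollout_return(state, seq), reverse=True)
        top = samples[:elites]
        means = [statistics.fmean(seq[t] for seq in top) for t in range(horizon)]
        # Small constant keeps the sampling distribution from collapsing to zero width.
        stds = [statistics.pstdev([seq[t] for seq in top]) + 1e-3 for t in range(horizon)]
    return means[0]  # MPC: only the first planned action is executed


def run_mpc(steps=10):
    """Replan at every step, as in model-predictive control."""
    state = 0.0
    rng = random.Random(0)
    for _ in range(steps):
        action = cem_plan(state, rng=rng)
        state = toy_dynamics(state, action)
    return state
```

Calling `run_mpc()` should drive the toy state from 0.0 toward the goal at 1.0. The point to notice is that no policy is learned: every action comes from a fresh search through imagined rollouts.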
🧠 DreamerV1 — From Planning to Learning
Dream to Control (2019)
https://arxiv.org/abs/1912.01603
Replaces search/planning with actor-critic trained entirely in imagination
Keeps PlaNet-style RSSM world model, but changes how behavior is learned
👉 Agents no longer plan—they learn inside dreams
⚙️ DreamerV2 — Stabilizing the Dream
Mastering Atari with Discrete World Models (2020)
https://arxiv.org/abs/2010.02193
Introduces discrete latent states (categorical RSSM)
Adds KL balancing + entropy regularization for stability
👉 Makes imagination-based RL work at scale (Atari)
🚀 DreamerV3 — Scaling the Recipe
Mastering Diverse Domains through World Models (2023)
https://arxiv.org/abs/2301.04104
Unified training recipe across many domains with fixed hyperparameters
Symlog/symexp targets improve stability across reward scales
👉 World models become general-purpose learners across domains
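The symlog/symexp pair mentioned above is small enough to write out. A minimal sketch of the two functions as defined in the DreamerV3 paper, symlog(x) = sign(x) ln(|x| + 1) and symexp(x) = sign(x) (exp(|x|) - 1); how they are wired into the reward and value heads is not shown here.

```python
import math


def symlog(x: float) -> float:
    """Compress large magnitudes symmetrically around zero."""
    return math.copysign(math.log1p(abs(x)), x)


def symexp(x: float) -> float:
    """Inverse of symlog: map a compressed value back to the original scale."""
    return math.copysign(math.expm1(abs(x)), x)
```

For example, `symlog(1000.0)` is about 6.91 and `symlog(-1000.0)` is about -6.91, so targets that differ by orders of magnitude land in a similar numeric range, while `symexp(symlog(x))` recovers `x` up to floating-point error.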
🧱 DreamerV4 — World Models as Foundation Models
Training Agents Inside of Scalable World Models (2025)
https://arxiv.org/abs/2509.24527
Replaces RSSM with diffusion/transformer-based world models
Combines offline pretraining, behavior cloning, and RL (PMPO)
👉 Shift from “RL world model” → “generative foundation model for control”
🔧 Optional Context (Same Research Lineage)
DayDreamer (2022)
https://arxiv.org/abs/2206.14176
Real-world robot learning using Dreamer-style latent imagination
Demonstrates that Dreamer-style learning works online on physical robots, without relying on a simulator
Director (2022)
https://arxiv.org/abs/2206.04114
Hierarchical latent planning for long-horizon decision making
Extends Dreamer-style imagination into structured subgoal reasoning
Dynalang (2023)
https://arxiv.org/abs/2308.01399
Adds language conditioning to world models
Bridges perception, language, and action in a unified latent space
Location
San Francisco (Downtown)
Date & Time
Saturday, May 9, 2026 | 2:00 PM – 5:00 PM
Join Discord Community
https://discord.gg/WH7DrTHRXK
Follow Saturday Robotics on X
https://x.com/saturdayrobotic
Agenda
2:00 PM – 2:30 PM Doors Open & Social
Food 😋, beverages 🧋, and UNLIMITED strawberries 🍓 (our official reading club fruit ☺️😄).
2:30 PM – 3:00 PM Keynote by Ahmet Şemi ASARKAYA (Agility Robotics)
Online access via Zoom: TBD
YouTube Recording: TBD (We are looking for recording volunteers)
3:00 PM – 5:00 PM Q&A, open-floor roundtable (10–20 min per topic) on spotlight papers or any paper you’d like to highlight. Feel free to share why the paper matters and its technical details.
Past events
#reading-club-03-0411
Session 03 Luma: https://luma.com/561xgirg
Reading Club 03 Review: https://x.com/junfanzhu98/status/2043243484568768519?s=20
Event summary & photos: https://x.com/junfanzhu98/status/2043245823933477004?s=20
More photos on LinkedIn: https://www.linkedin.com/posts/junfan-zhu_robotics-world-model-reading-club-03-robotic-ugcPost-7448991959093932032-Vo5d?utm_source=share&utm_medium=member_desktop&rcm=ACoAABxP-p0BpUNGDf347aKh_1uJAPzG4er0As8
#reading-club-02-0404
Session 02 Luma: https://luma.com/g3qrrti0
Reading Club 02 Review (liked by Yann LeCun on X): https://x.com/junfanzhu98/status/2040716119259164673?s=20
Event photos: https://x.com/junfanzhu98/status/2040717084972245341?s=20
More photos on LinkedIn: https://www.linkedin.com/posts/junfan-zhu_robotics-world-model-reading-club-02-hot-ugcPost-7446465600723509248-1qbj?utm_source=share&utm_medium=member_desktop&rcm=ACoAABxP-p0BpUNGDf347aKh_1uJAPzG4er0As8
#reading-club-01-0328
Session 01 Luma: https://luma.com/8s4w1wu6
Reading Club 01 Review: https://x.com/junfanzhu98/status/2038153945219305812
Event photos (liked by Yann LeCun on X): https://x.com/junfanzhu98/status/2038161288090779985
Logistics
Spots are limited. Please arrive by 2:00 PM for check-in. The keynote will begin promptly at 2:30 PM.
We currently do not have volunteers available to assist with late check-ins. Given the high volume of inquiries and 100+ attendees (both online and onsite), we kindly ask that you arrive on time to ensure smooth entry.