

Robotics & World Model Reading Club 02: JEPA Zoo, WAMs & Unified Latent Representations – San Francisco
A high-signal reading group for AI researchers & builders pushing the frontiers of robotic world models, WAMs, and physical AI / embodied intelligence. In our first session, we brought together researchers and engineers from Boston Dynamics, Google, NVIDIA, Stanford and several robotics startups across the Bay Area.
Hosted by Junfan Zhu & Aurora Feng
Supported by Neural Motion, a universal cross-embodiment data representation layer for embodied AI.
This Week's Core Theme & Pre-Readings
JEPA-Based World Models: Compact Latent Prediction for Scalable Planning & Robotics Autonomy
LeWorldModel (Mar 2026): End-to-end JEPA world model trained directly from raw pixels on a single GPU, demonstrating efficient latent-space planning and large speed improvements over prior world-model baselines.
https://le-wm.github.io
LeJEPA (Nov 2025): Energy-based JEPA formulation addressing feature collapse with a simple two-term objective, enabling stable end-to-end training without complex losses or EMA tricks.
V-JEPA 2.1 (Meta AI, Mar 2026): Dense predictive representation learning for image/video tokens with strong performance on dense vision tasks and robotics perception.
VL-JEPA (Dec 2025): Multimodal JEPA aligning visual dynamics with language, enabling latent planning and embodied reasoning.
ThinkJEPA (arXiv:2603.22281, Mar 2026): Extends JEPA with a semantic reasoning pathway for long-horizon planning and hierarchical decision making.
World Reasoning & Evaluation Benchmarks
World Reasoning Arena (WR-Arena) – PAN Team (Eric Xing), MBZUAI (arXiv:2603.25887, March 26, 2026): Benchmark for action simulation fidelity, long-horizon forecasting, and simulative reasoning/planning, highlighting large gaps to human-level hypothetical reasoning.
Recommended Readings (Optional)
World Action Models (WAMs): From Reactive Policies to Controllable Simulation
DreamZero (NVIDIA – arXiv:2602.15922, Feb 2026)
Fast-WAM (arXiv:2603.16666, March 2026)
Unified 4D Latent Representations & Physics Grounding
D4RT (Dynamic 4D Reconstruction and Tracking, Google DeepMind – arXiv:2512.08924, Dec 2025)
Precise Refinement, Human Interfaces & Online Adaptation
RL Tokens (RLT) – Precise Manipulation with Efficient Online RL (Physical Intelligence, March 2026)
Compliant Residual DAgger (CR-DAgger, Shuran Song Lab)
UMI-FT (Shuran Song Lab, arXiv:2601.09988, Jan 2026)
Cross-Embodiment Generalization & Data Scaling
EgoScale (NVIDIA, arXiv:2602.16710, Feb 2026)
EgoVerse (NVIDIA, 2026)
AirExo-2 (2026)
LAP (Language-Action Pre-training) / LAP-3B (arXiv:2602.10556, Feb 2026)
Frontier Systems
GR00T N2 (NVIDIA, 2026)
Cosmos-RL (NVIDIA, 2026)
RoboForge (arXiv:2603.17927, March 2026)
ColaVLA
VLM4VLA
Location
San Francisco (Downtown)
Date & Time
Saturday, April 4, 2026 | 2:00 PM – 5:00 PM
Join Discord Community
https://discord.gg/WH7DrTHRXK
Follow Saturday Robotics
https://x.com/saturdayrobotic
Agenda
2:00 PM – 2:30 PM Doors Open & Social
Food 😋, beverages 🧋, and UNLIMITED strawberries 🍓 (our official reading club fruit ☺️😄).
2:30 PM – 3:00 PM "JEPA Zoo" Keynote by Julian Saks (https://x.com/JulianSaks)
References: https://www.jepazoo.com/
https://www.jepazoo.com/references
Virtual Keynote via Google Meet: https://meet.google.com/uky-ourj-agm
3:00 PM – 5:00 PM Q&A, open-floor roundtable (10–20 min per topic) on spotlight papers or any paper you’d like to highlight. Feel free to share why the paper matters and its technical details.
Past events #reading-club-0328
Session 01 Luma: https://luma.com/8s4w1wu6
Reading Club 01 Review: https://x.com/junfanzhu98/status/2038153945219305812
Event photos (liked by Yann LeCun on X): https://x.com/junfanzhu98/status/2038161288090779985