

MolmoAct2: Building Open Science Robotics Foundation Models for Real-world Deployment
MolmoAct2 is a fully open Vision-Language-Action robotics foundation model built for practical real-world deployment, aiming to move robot learning beyond closed frontier systems, hardware-specific open models, and slow reasoning-based policies. It combines Molmo2-ER, a vision-language backbone specialized for spatial and embodied reasoning, with new open robotics datasets across low-to-medium cost platforms, including a large bimanual manipulation dataset, quality-filtered Franka data, and SO100/101 trajectories. The system also introduces three components: MolmoAct2-FAST, an open action tokenizer trained across multiple robot embodiments; a redesigned architecture for continuous control; and MolmoAct2-Think, an adaptive reasoning method that keeps geometric grounding while reducing latency. By releasing model weights, training code, action tokenization tools, and complete training data, MolmoAct2 frames open science as a path toward reproducible, accessible, and deployable robotics foundation models that the broader community can inspect, build on, and use in real-world settings.
Speaker Bio:
Haoquan:
Haoquan is an incoming CS PhD student at Stanford University, advised by Prof Fei-Fei Li as part of the Stanford Vision and Learning Lab. Previously, he obtained his BS degree from the University of Washington, where he was advised by Prof Ranjay Krishna, Prof Ali Farhadi, Prof Dieter Fox, and Prof Jenq-Neng Hwang. His research interests lie broadly in robot learning. In particular, he focuses on developing foundation models for robotic manipulation that are deployable in the real world and unlock novel capabilities.
Jiafei:
Jiafei Duan is an incoming Presidential Young Professor at the National University of Singapore, School of Computing, where he leads the MAGIC Lab. He completed his PhD in Robotics and AI at the Paul G. Allen School of Computer Science & Engineering, University of Washington, co-advised by Ranjay Krishna and Dieter Fox. His research centers on robot learning, embodied AI, and building large-scale robotics foundation models. His work has received Best Paper, Spotlight, and Oral recognitions at venues including ICLR, UR, and RSS, and has been featured in MIT Technology Review, GeekWire, VentureBeat, and Business Wire.