

Agentic Collaboration with Humans
Scaling individual LLMs has brought impressive progress, but the next frontier is scaling collaboration through multi-agent systems (MAS). Purely autonomous MAS, however, remain “closed-world” systems limited by the static knowledge of pre-trained models, which makes them brittle on novel or out-of-distribution tasks.
In this talk, we introduce Human-In-the-Loop Multi-Agent Collaboration (HILA), a principled framework that enables agents to learn when to act autonomously and when to defer to human experts. To support this, we propose Dual-Loop Policy Optimization (DLPO), which separates short-term decision optimization from long-term capability growth. The inner loop optimizes deferral decisions with cost-aware policy learning, while the outer loop converts expert feedback into continual improvements in reasoning ability.
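To make the inner loop's deferral decision concrete, here is a minimal sketch of one way a cost-aware defer-or-act rule could look. This is an illustrative assumption, not the authors' implementation: the class, the confidence-times-penalty cost model, and the fixed expert-query cost are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DeferralPolicy:
    """Hypothetical cost-aware deferral rule: defer to a human expert
    when the expected cost of acting autonomously exceeds the cost of
    asking for help. (Illustrative sketch, not the DLPO algorithm.)"""

    expert_cost: float  # assumed fixed cost per expert query

    def expected_error_cost(self, confidence: float, error_penalty: float) -> float:
        # Expected penalty if the agent acts on its own:
        # probability of being wrong times the penalty for an error.
        return (1.0 - confidence) * error_penalty

    def should_defer(self, confidence: float, error_penalty: float) -> bool:
        return self.expected_error_cost(confidence, error_penalty) > self.expert_cost


policy = DeferralPolicy(expert_cost=1.0)
# Low confidence, high stakes: expected error cost (0.4 * 5.0 = 2.0) > 1.0, so defer.
print(policy.should_defer(confidence=0.6, error_penalty=5.0))   # True
# High confidence: expected error cost (0.05 * 5.0 = 0.25) < 1.0, so act autonomously.
print(policy.should_defer(confidence=0.95, error_penalty=5.0))  # False
```

In the framework described above, the inner loop would learn such a decision rule from data rather than use a hand-set threshold, and the outer loop would use the resulting expert feedback to improve the agents' underlying reasoning.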
Results on challenging reasoning benchmarks show that HILA consistently outperforms advanced multi-agent systems, providing a foundation for scalable, collaborative, and continually improving agentic AI.
🎙 Speaker
Wei Yang is a Ph.D. student in Computer Science at the University of Southern California, advised by Jesse Thomason. His research focuses on multi-agent LLM collaboration, agentic LLM post-training, and multi-agent reinforcement learning.
He has interned at ByteDance and Tencent, and his work has been featured in MIT Technology Review. He has published in leading venues including ICLR, NeurIPS, WWW, SIGIR, MLSys, IJCAI, and TMLR, and serves as a reviewer for major conferences and journals.