


SGLang x NVIDIA Dynamo: An Evening About Inference Performance at Scale
🚀 Unlock the Future of Inference at Scale
Join us for an evening of cutting-edge insights into the strategies, frameworks, and breakthroughs that are making inference faster, more efficient, and production-ready. Hear directly from experts pushing the limits of distributed AI, open-source innovation, and next-gen optimization—and connect with the community shaping what’s next.
Presenters:
Ishan Dhanani: A member of NVIDIA's Deep Learning Algorithms team and one of the core engineers on NVIDIA Dynamo.
Baizhou Zhang: A master's student at UCSD, Baizhou is a contributor to the open-source SGLang project and has interned on the NVIDIA cuDNN team.
Qiaolin Yu: A core developer of the SGLang project.
Agenda
5:30 PM – Doors Open & Check-In
6:00 PM – Presentations
⚡ Optimizing DeepSeek on GB200 NVL72 – Explore how the SGLang project is achieving unprecedented performance with optimized kernels, prefill-decode (PD) disaggregation, and large-scale expert parallelism (EP).
🔮 SGLang Roadmap – A look back, a dive into the present, and a glimpse of where the project is headed.
🌐 Scaling the Frontier of Distributed Inference – Learn how SGLang and Dynamo are redefining what’s possible for large-scale deployment.
7:30 PM – Networking