

High-Performance Inference for Open LLMs with Modal, Qwen and SGLang
When do open models and inference engines beat proprietary solutions?
Join Modal, Qwen, and SGLang for an evening on optimizing the performance and cost of LLM inference. Our speakers will cover:
The Cold Start Problem: how can AI infrastructure enable seamless AI experiences, right from the start?
Accelerating Open Models: how do inference engines collaborate with model developers to meet performance benchmarks?
Choosing the Best Model: how should developers identify the most effective model for their use case?
With:
Charles Frye, GPU Enjoyer & Developer Advocate at Modal
Qiaolin Yu, Performance Optimization at SGLang
Nishant Agrawal, Senior Solutions Architect at Alibaba Cloud
Agenda
We're excited to bring together founders, AI engineers, and ML systems researchers for an evening of:
Demos & Lightning Talks
Community, Pizza, Drinks
Your hosts,
Modal, Qwen & SGLang