

High-Performance Inference for Open LLMs with Modal, Qwen and SGLang
When do open models and inference engines beat proprietary solutions?
Join Modal, Qwen, and SGLang for an evening on optimizing the performance and cost of LLM inference. Our speakers will cover:
The Cold Start Problem: how can AI infrastructure enable seamless AI experiences, right from the start?
Accelerating Open Models: how do inference engines collaborate with model developers to meet performance benchmarks?
Choosing the Best Model: how should developers identify the most effective model for their use case?
With:
Charles Frye, GPU Enjoyer & Developer Advocate at Modal
Qiaolin Yu, Performance Optimization at SGLang
Nishant Agrawal, Senior Solutions Architect at Alibaba Cloud
Agenda
We're excited to bring together founders, AI engineers, and ML systems researchers for an evening of:
Demos & Lightning Talks
Community, Pizza, Drinks
Your hosts,
Modal, Qwen & SGLang