Private Event

Dynamo & Dine: High-performance LLM Inference with Baseten and NVIDIA Dynamo

Hosted by Baseten
Registration
Approval Required
Your registration is subject to approval by the host.
Welcome! To join the event, please register below.
About Event

Join us for a hands-on technical workshop and Brazilian churrasco experience at Fogo de Chão.

Discover how the world's largest AI inference workloads run at lightning speed on NVIDIA Dynamo, a distributed inference serving framework.

In this 1-hour workshop, Harry Kim (NVIDIA) and Philip Kiely (Baseten) will dive deep into system-level optimizations that turbocharge LLM inference at scale, including:

  • KV-aware routing

  • KV cache offloading

  • Prefill/decode (PD) disaggregation

After the session and Q&A, stay for a churrasco lunch. Enjoy eight different meats, a fresh salad bar, and traditional sides.

If you’re an AI engineer in SF, don’t miss this technical workshop and the chance to network with peers. Lunch is on NVIDIA and Baseten!

✅ Follow Baseten on Twitter & LinkedIn
✅ Follow NVIDIA on Twitter & LinkedIn

Location
Fogo de Chão Brazilian Steakhouse
201 3rd St Suite 100, San Francisco, CA 94103, USA