

vLLM Inference Meetup: Pune, India
We are excited to invite you to the inaugural vLLM meetup in India, hosted by Red Hat in Pune.
This is your chance to connect with a growing community of vLLM users, developers, maintainers, and engineers from Red Hat. We'll dive deep into technical talks, share insights, and discuss our journey in optimizing LLM inference for performance and efficiency.
What to expect:
Technical insights
Networking with industry experts
Hands-on learning & demos
Agenda
09:30-10:00: Registration and Opening Remarks
10:00-10:30: Keynote: Turning GenAI Investments into Results: Why Inference Matters - Steve Shirkey
10:30-11:00: AI Inference - Ompragash
11:00-11:30: Intro to vLLM and its core techniques: Quantization, KV Cache, PagedAttention, and Continuous Batching - Prasad Mukhedkar
11:30-12:00: vLLM Inference Demo – NVIDIA GPU Accelerator - Suresh Gaikwad
12:00-12:30: Break
12:30-14:00: Hands-on Lab: vLLM Inference
Bring your laptop with an SSH client installed. GPU instances will be provided by the organizers.
Hosts:
[email protected]
[email protected]