

vLLM Inference Meetup in Pune
We are excited to invite you to the vLLM meetup in Pune hosted by Red Hat AI, NeevCloud, and HPE on 14 March 2026.
This meetup brings together vLLM users, developers, maintainers, and engineers to explore the latest in optimized inference. Expect in-depth technical talks, practical demonstrations, and ample time to connect with the community.
What to Expect
Technical insights
Networking with industry experts
Hands-on learning & demos
What to Bring
Your laptop with an SSH client installed (GPU instances will be provided by the organizers)
A government‑issued photo ID for venue security
Curiosity for tech insights and demos!
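As optional preparation for the hands-on workshop, the sketch below shows the shape of a request to vLLM's OpenAI-compatible chat endpoint. The host address and model name here are placeholders, not details from the organizers; the real values will be shared at the event.

```python
import json

# Placeholder values; the workshop organizers will provide the real ones.
VLLM_HOST = "http://localhost:8000"  # hypothetical GPU instance address
MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # hypothetical model name


def build_chat_request(prompt: str, max_tokens: int = 128) -> dict:
    """Build a payload for vLLM's OpenAI-compatible /v1/chat/completions route."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


if __name__ == "__main__":
    payload = build_chat_request("What is speculative decoding?")
    print(json.dumps(payload, indent=2))
```

During the workshop, a payload like this would be POSTed to `VLLM_HOST + "/v1/chat/completions"` with any HTTP client once your SSH session to the GPU instance is up.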
Agenda
01:00 - 01:10 Opening Remarks & Welcome
Rohit Sharma, Prasad Mukhedkar
01:10 - 01:30 Keynote: Why GenAI Inference Matters
Steve Shirkey (Red Hat)
01:30 - 02:00 vLLM Inference Engine – Technical Introduction & Demo
Shamsher Ansari (NeevCloud), Deepak Das (Red Hat)
02:00 - 02:30 Optimizing LLM Inference (llmcompressor & Speculators)
Rahul Tuli (Red Hat)
02:30 - 03:00 Networking Break / Snacks
03:00 - 03:25 vLLM Semantic Router – Mixture of Models
Ritesh Shah (Red Hat), Suresh Gaikwad
03:25 - 03:50 Practitioner Talk: Agentic AI on vLLM
Swapnil Shekade (HPE)
03:50 - 04:15 Open-Source Generative AI with NVIDIA Nemotron
Nimit Kothari (NVIDIA)
04:15 - 04:20 Lab Setup / Transition
04:20 - 06:00 Hands-on Workshop
All Speakers
Important information
The agenda is subject to change; we may add extra demos or lightning updates.
Registration closes 24 hours before the event. We cannot admit unregistered attendees.
Please bring a photo ID to verify your registration on arrival.
See you in Pune!
If you are building, deploying, or scaling inference, this is the room to be in.