Join the vLLM community to discuss optimizing LLM inference!
Registration
Approval Required
To join the event, please register below. Your registration is subject to host approval.
About Event

vLLM Inference Meetup in Warsaw

Organized and sponsored by JetBrains AI in collaboration with Red Hat AI and NVIDIA, this event takes place on 10 March 2026 in Warsaw, Poland.

What to Expect

  • Deep technical sessions from vLLM maintainers, committers, and teams using vLLM at scale

  • Live demos focused on real workflows (local, cloud, IDE integrations, omni-modality, Kubernetes)

  • Hands-on workshop option earlier in the day

  • Great networking with food and drinks

Who Should Attend

  • vLLM users and contributors

  • ML and infra engineers working on inference and serving

  • Platform teams running GenAI in production

  • Anyone curious about efficient inference across local, cloud, and Kubernetes

Agenda

Optional Workshop

15:30 — Doors Open (Workshop Attendees)

16:00 – 17:15 — Hands-On Workshop: Latest GenAI Compression Techniques in Practice

Meetup Program

17:15 – 17:45 — Doors Open, Check-In

17:45 – 17:55 — Welcome and Opening Remarks

Saša Zelenović, Sr. Manager of Developer Marketing & Advocacy, Red Hat AI

Damian Bogunowicz, Sr. ML Engineer, JetBrains AI

17:55 – 18:15 — Intro to vLLM and Project Update

Michael Goin, vLLM Core Committer and Principal Engineer, Red Hat AI

18:15 – 18:35 — Powering JetBrains IDE Features with AI

JetBrains AI Team

18:35 – 18:55 — Demo: Next Edit Suggestion (NES), Using vLLM to Enable Local and Cloud-Based Edits

JetBrains AI Team

18:55 – 19:15 — CPU Weights Offloading in vLLM with FlexTensor

Blazej Kubiak, Sr. Deep Learning Software Engineer, NVIDIA

19:15 – 19:25 — Coffee Break

19:25 – 19:45 — Intro to vLLM-Omni: Easy, Fast, and Cheap Omni-Modality Model Serving

Nicolò Lucchesi, vLLM Core Committer and Sr. Software Engineer, Red Hat AI

19:45 – 20:00 — Scaling LLM Inference on Kubernetes with llm-d: Fast, Cost-Efficient, Production-Ready with vLLM

Michael Goin, vLLM Core Committer and Principal Engineer, Red Hat AI

20:00 – 21:00 — Networking, Food and Drinks

Important Information

The agenda is subject to change; we may add extra demos or lightning updates.

Registration closes 24 hours before the event. We cannot admit unregistered attendees.

Please bring a photo ID to verify your registration on arrival.

See You in Warsaw

If you are building, deploying, or scaling inference, this is the room to be in. See you soon!

Location
Crowne Plaza Warsaw - the Hub by IHG
rondo Ignacego Daszyńskiego 2, 02-843 Warszawa, Poland