Join the vLLM community to discuss optimizing LLM inference!
Registration
Approval Required
To join the event, please register below. Your registration is subject to host approval.
About Event

vLLM Inference Meetup in Warsaw

Organized and sponsored by JetBrains AI in collaboration with Red Hat AI and NVIDIA, this event takes place on 10 March 2026 in Warsaw, Poland.

What to Expect

  • Deep technical sessions from vLLM maintainers, committers, and teams using vLLM at scale

  • Live demos focused on real workflows (local, cloud, IDE integrations, omni-modality, Kubernetes)

  • Hands-on workshop option earlier in the day

  • Great networking with food and drinks

Who Should Attend

  • vLLM users and contributors

  • ML and infra engineers working on inference and serving

  • Platform teams running GenAI in production

  • Anyone curious about efficient inference across local, cloud, and Kubernetes

Agenda

Optional Workshop

15:30 — Doors Open (Workshop Attendees)

16:00 – 17:15 — Hands-On Workshop: Latest GenAI Compression Techniques in Practice

Meetup Program

17:15 – 17:45 — Doors Open, Check-In

17:45 – 17:55 — Welcome and Opening Remarks

Saša Zelenović, Sr. Manager of Developer Marketing & Advocacy, Red Hat AI

Damian Bogunowicz, Sr. ML Engineer, JetBrains AI

17:55 – 18:15 — Intro to vLLM and Project Update

Michael Goin, vLLM Core Committer and Principal Engineer, Red Hat AI

18:15 – 18:35 — Powering JetBrains IDE Features with AI

JetBrains AI Team

18:35 – 18:55 — Demo: Next Edit Suggestion (NES), Using vLLM to Enable Local and Cloud-Based Edits

JetBrains AI Team

18:55 – 19:15 — CPU Weights Offloading in vLLM with FlexTensor

Blazej Kubiak, Sr. Deep Learning Software Engineer, NVIDIA

19:15 – 19:25 — Coffee Break

19:25 – 19:45 — Intro to vLLM-Omni: Easy, Fast, and Cheap Omni-Modality Model Serving

Nicolò Lucchesi, vLLM Core Committer and Sr. Software Engineer, Red Hat AI

19:45 – 20:00 — Scaling LLM Inference on Kubernetes with llm-d: Fast, Cost-Efficient, Production-Ready with vLLM

Michael Goin, vLLM Core Committer and Principal Engineer, Red Hat AI

20:00 – 21:00 — Networking, Food and Drinks

Important Information

The agenda is subject to change; we may add extra demos or lightning updates.

Registration closes 24 hours before the event. We cannot admit unregistered attendees.

Please bring a photo ID to verify your registration on arrival.

See You in Warsaw

If you are building, deploying, or scaling inference, this is the room to be in. See you soon!

Location
Crowne Plaza Warsaw - the Hub by IHG
rondo Ignacego Daszyńskiego 2, 02-843 Warszawa, Poland