Presented by

vLLM Meetups and Events

Join the vLLM community to discuss optimizing LLM inference!

Hosted By

248 Went

AI

Featured in

London Tech Week

London vLLM & llm-d Inference Meetup

Name: London vLLM & llm-d Inference Meetup
Start: 2026-06-10T17:00:00.000+01:00
End: 2026-06-10T21:00:00.000+01:00
Location: Sustainable Ventures

Sustainable Ventures

London, United Kingdom

Past Event

About Event

United Kingdom vLLM & llm-d Inference Meetup

Hosted by Red Hat AI, NVIDIA, and Stelia AI, this event takes place on 10 June 2026 in London, UK.

Join us for a deep dive into the engine room of vLLM and llm-d inference, where we will focus on the architecture, optimizations, and raw engineering required to run inference at scale.

Whether you’re looking to squeeze every last token out of your GPU cluster or you're curious about the latest commits to the vLLM and llm-d ecosystems, this is the room you want to be in.

What to Expect

Deep Technical Sessions: Hear directly from the maintainers and core committers of vLLM and llm-d.
Scale in Production: Learn from industry leaders about deploying LLMs in production.
Live Demos: Watch demos built around real-world workflows.
DGX Spark Giveaway: NVIDIA is giving away a DGX Spark at the event. You must be present to win.
Networking: Stick around for food and drinks. It’s a great chance to chat with the speakers and exchange ideas with fellow developers and engineers.

Who Should Attend

vLLM and llm-d users and contributors
ML and infra engineers working on inference and serving
Platform teams running GenAI in production
Anyone curious about efficient inference across local, cloud, and Kubernetes

Agenda (Subject to More Awesomeness)

17:00 – 17:30 — Doors Open, Check-In

17:30 – 17:40 — Welcome and Opening Remarks

Sasa Zelenovic, Sr. Technical Marketing Manager, Red Hat AI

17:40 – 18:10 — Intro to vLLM and Project Update

Michael Goin, vLLM Core Committer and Principal Engineer, Red Hat AI

18:10 – 18:30 — Accelerating AI Inference with Speculative Decoding

Eldar Kurtić, Principal Research Scientist, Red Hat AI & ISTA

18:30 – 18:50 — From Eval to Production: Building a Managed vLLM Service with llm-d

David Hughes, Chief Technology Officer, Stelia AI

18:50 – 19:10 — Model Express: From Cold Start to Hot Tokens

Ganesh Kudleppanavar, System Software Manager, NVIDIA

19:10 – 19:20 — Test, Defend, Prove: Data-Driven AI Safety

Stuart Battersby, AI Safety & Model Evaluation Architect, Red Hat AI

19:20 – 19:40 — Discussion and Q&A

19:40 – 21:00 — Networking, Food and Drinks

Important information

Registration closes 24 hours before the event. We cannot admit unregistered attendees.

Please bring a photo ID to verify your registration on arrival.

See you in London

If you are building, deploying, or scaling inference, this is the room to be in.

See you soon!

Location

Sustainable Ventures

County Hall, Belvedere Rd, London SE1 7PB, UK

County Hall, 5th Floor
