Welcome to the official MLOps Community London chapter 😎 Join us on our #london channel in the MLOps Community Slack: https://mlops.community/join/

[Hands-on Workshop] vLLM Reality Check: Understand, Deploy, Optimize, Automate

Ticket Price
£150.00
About Event

What the Workshop Will Be About:

This workshop will provide a deep dive into vLLM, a high-performance inference engine for large language models.

Participants will explore how vLLM works under the hood, how it optimizes model execution, and how to effectively deploy and manage it in production environments.


Key Points to Cover:

  • vLLM Internals - Understand the request lifecycle, how vLLM manages model families, and how GPU execution is orchestrated for optimal performance.

  • Transformer Architecture & vLLM Optimizations - Revisit transformer architecture foundations and learn how vLLM leverages techniques like continuous batching and PagedAttention to accelerate inference (see the first sketch after this list).

  • Advanced Deployment Strategies - Explore best practices for deploying vLLM across different environments, starting from single-GPU setups (see the serving sketch after this list).

  • Automated Management of Inference Engines - Discuss the shortcomings of self-managed serving, why inference providers exist, and test a deployment flow in Cast.ai.
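
To give a flavour of the hands-on portion, here is a minimal offline-inference sketch using vLLM's Python API; the model name, sampling settings, and memory fraction are illustrative placeholders rather than the workshop's exact configuration:

  # Minimal vLLM offline-inference sketch (model and settings are placeholders)
  from vllm import LLM, SamplingParams

  # vLLM batches these prompts with continuous batching and manages the KV cache
  # with PagedAttention under the hood.
  prompts = [
      "Explain PagedAttention in one sentence.",
      "What does continuous batching improve?",
  ]
  sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

  # Single-GPU setup; gpu_memory_utilization caps the memory pool used for the KV cache.
  llm = LLM(model="facebook/opt-125m", gpu_memory_utilization=0.90)

  for output in llm.generate(prompts, sampling_params):
      print(output.prompt, "->", output.outputs[0].text)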
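
And a sketch of the serving side, assuming a vLLM OpenAI-compatible server is already running locally (for example, one started with the "vllm serve" command); the endpoint, model name, and prompt are placeholders:

  # Query a locally running vLLM server through its OpenAI-compatible API
  from openai import OpenAI

  client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
  response = client.chat.completions.create(
      model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder; must match the served model
      messages=[{"role": "user", "content": "What is PagedAttention?"}],
  )
  print(response.choices[0].message.content)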


Expected Outcomes:

  • Unpack the vLLM deployment struggle

  • Benchmark performance gaps between manual and optimized configurations

  • Discover GPU optimization challenges - availability, selection, and scaling issues

  • See automated deployment solve these challenges - same benchmark, better results

  • You'll be the one running the commands, hitting walls, and discovering why leading teams automate their AI infrastructure with AI Enabler instead of building it from scratch.


Who it’s for:

AI/ML Engineers, MLOps and LLMOps practitioners, DevOps Engineers, and Platform Engineers running or planning to run models in production.


Pre-Workshop Requirements:

A laptop & basic knowledge of Kubernetes and LLMs


Meet Your Conductor:

Igor Šušić, a talented Staff Machine Learning Engineer, will be driving the masterclass.


Food, drinks and good vibes will be provided during the workshop.

Location
Hard Rock Cafe
Criterion Building, 225-229 Piccadilly, London W1J 9HR, UK