Hosted by AI Performance Engineering
All things AI performance related, including PyTorch, CUDA, and GPUs.
50 Went

NVIDIA GTC 2026 Conf Recap + Inference Disagg Prefill-Decode + RadixAttention

Zoom
Past Event
About Event

Zoom link: https://us02web.zoom.us/j/82308186562

Talk #0: Introductions and Meetup Updates

by Chris Fregly and Antje Barth

Talk #1: Inference Engines Deep Dive: Disaggregated Serving, PagedAttention, RadixAttention, and the Modern LLM Serving Stack

by Seth Weidman @ SentiLink, author of Deep Learning from Scratch (O'Reilly)

In this talk, Seth will go deep on how modern inference engines work end-to-end: disaggregated serving architectures, PagedAttention and RadixAttention, and how NVIDIA Dynamo interfaces with vLLM, SGLang, and TensorRT-LLM. He’ll also cover emerging KV-cache optimization/compression directions and practical tradeoffs for throughput, latency, and memory efficiency in production LLM systems.

Talk #2: NVIDIA GTC 2026 AI Conference Recap

by Chris Fregly, author of AI Systems Performance Engineering (O'Reilly)

In this talk, Chris will present the AI and systems highlights from the NVIDIA GTC 2026 conference (happening the prior week).

NVIDIA GTC Conference registration link:

https://www.nvidia.com/gtc/ (Use code GTC26-20 for 20% off!)

Related Links

GitHub Repo: http://github.com/cfregly/ai-performance-engineering/

O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/

YouTube: https://www.youtube.com/@AIPerformanceEngineering

Free Generative AI Course on DeepLearning.ai: https://bit.ly/gllm
