

llm-d Distributed Inference Meetup NYC
Hosted by Red Hat AI, IBM Research, and AMD, this event takes place on March 11, 2026, in New York City.
What to Expect
Deep technical sessions from llm-d maintainers, committers, and teams using AI at scale
Live demos focused on real distributed workflows
Great networking with food and drinks
Who Should Attend
ML and infra engineers working on inference and serving
Platform teams running GenAI in production
Anyone curious about efficient inference across local, cloud, and Kubernetes
Meetup Agenda
*Agenda is subject to change (and to more awesomeness)
4:30pm — Doors Open, Check-In
5:15pm — Intro to llm-d for Open Source Distributed Inference & Project Update
Carlos Costa, Distinguished Engineer, IBM Research
5:35pm — Distributed LLM Serving on AMD with llm-d
Liz Li, Product Application Engineer, AMD
Kenny Roche, Sr. Manager Software Development, ML Framework, AMD
5:55pm — The Path to Intelligent Routing: Lessons Learned Scaling Wide-EP and Mixture-of-Experts (MoE) Models
Nili Guy, R&D Manager, AI Infrastructure - Hybrid Cloud, IBM Research
Tyler Smith, Chief Architect - Inference Engineering, Red Hat AI
6:15pm — KV-Cache Wins You Can See: Prefix-Cache Scheduling, Offloading, and Scaling with llm-d
Maroon Ayoub, Staff Research Scientist, IBM Research
6:35pm — Q&A and Open Discussion
7:00pm — Pizza and Networking 🍕 🤝
8:30pm — Event Ends and Doors Close
Important Information
Registration closes 24 hours before the event. We cannot admit unregistered attendees.
Please bring a photo ID to verify your registration on arrival.
See you in New York City
If you are building, deploying, or scaling inference, this is the room to be in. See you soon!