105 Going

Webinar: Scaling Diffusion Models in Production

Hosted by Simplismart AI & Daksh Goel
Zoom
About Event

Diffusion models are in production everywhere now: image generation, video synthesis, avatars, creative tooling, and document processing.

But the engineering reality behind those demos is messier than anyone talks about publicly.

Latency spikes. GPU costs that compound with every model update. Pipelines that weren't designed for real user load. Cold starts that quietly kill conversion. Infrastructure that works fine at 100 requests and falls apart at 10,000.

Most teams are solving this in isolation, through incidents, over-provisioning, and trial and error.

This webinar changes that.

Join us for a live panel with engineering leaders & researchers who are actively running diffusion workloads in production. No slide decks. No vendor pitches. Just a moderated, practitioner-level conversation about what's actually working - and what isn't.

What we'll cover:

  • Architecting inference pipelines for spiky, bursty diffusion workloads

  • GPU cost reality: what optimization actually looks like beyond the theory

  • Multi-tenancy in generative workloads - isolation, scheduling, fairness

  • Latency vs. quality tradeoffs and how to communicate them to product teams

  • Observability for diffusion: the metrics that actually matter

  • Where diffusion infrastructure is heading as models get heavier

Moderated by Bharatratna Puli, GTM at Simplismart, who works daily with teams scaling inference across clouds, hyperscalers, and data centers.

Panelists:
- Karthik Kumar, Senior AI Researcher, Tavus
- Rahul Deora, AI Lead, Fynd
More panelists to be announced soon.

Register now!
