Webinar: Scaling Diffusion Models in Production

Hosted by Simplismart AI & Daksh Goel

Zoom

About Event

Diffusion models are in production everywhere now: image generation, video synthesis, avatars, creative tooling, document processing.

But the engineering reality behind those demos is messier than anyone talks about publicly.

Latency spikes. GPU costs that compound with every model update. Pipelines that weren't designed for real user load. Cold starts that quietly kill conversion. Infrastructure that works fine at 100 requests and falls apart at 10,000.

Most teams are solving this in isolation, through incidents, over-provisioning, and trial and error.

This webinar changes that.

Join us for a live panel with engineering leaders & researchers who are actively running diffusion workloads in production. No slide decks. No vendor pitches. Just a moderated, practitioner-level conversation about what's actually working - and what isn't.

What we'll cover:

- Architecting inference pipelines for spiky, bursty diffusion workloads
- GPU cost reality: what optimization actually looks like beyond the theory
- Multi-tenancy in generative workloads - isolation, scheduling, fairness
- Latency vs. quality tradeoffs and how to communicate them to product teams
- Observability for diffusion: the metrics that actually matter
- Where diffusion infrastructure is heading as models get heavier

Moderated by Bharatratna Puli, GTM at Simplismart, who works daily with teams scaling inference across clouds, hyperscalers, and data centers.

Panelists:
- Karthik Kumar, Senior AI Researcher, Tavus
- Rahul Deora, AI Lead, Fynd

More panelists to be announced soon.

Register now!
