96 Went

True Serverless Inference with Sub-Second Cold Starts

Hosted by Prashanth Manohar
Zoom
Past Event
About Event

Cold starts aren’t solved. They’re just hidden behind pre-warmed GPUs.

We’ll show how we:

  • restore large models in under a second

  • run multiple models on a single GPU

  • use vLLM with InferX’s snapshot-based runtime

Live demo + Q&A
