Presented by
Modal
AI infrastructure that developers love

High Performance LLM Inference in Production

Virtual
About Event

The era of actually open AI is here. We’ve spent the past year helping leading organizations deploy open models and inference engines in production at scale.

Hosted by Charles Frye, this live session will walk you through:

  1. The three types of LLM workloads: offline, online, and semi-online

  2. The challenges engineers face, and our recommended solutions for controlling cost, latency, and throughput

  3. How you can implement those solutions on our cloud platform
