
How to scale inference for agents

Hosted by Thiyagarajan M & Vamshi Ambati
San Jose, California
About Event

Every agent interaction triggers 10, 20, even 50 LLM calls at the serving layer. Latency stacks across reasoning steps. Reliability drops as per-call failures compound. Most teams are inference-blind here, shipping agents without seeing what's happening underneath.
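To make the compounding concrete, here is a rough back-of-the-envelope sketch. The 99% per-call success rate and the 0.8 s per-call latency are illustrative assumptions, not measurements. The function names are hypothetical, and the model assumes calls are independent and run strictly in sequence.

```python
def run_success_rate(per_call_success: float, calls_per_run: int) -> float:
    """Chance that every call in one agent run succeeds (independent calls assumed)."""
    return per_call_success ** calls_per_run


def run_latency_seconds(per_call_latency: float, calls_per_run: int) -> float:
    """End-to-end latency when calls run one after another (no parallelism assumed)."""
    return per_call_latency * calls_per_run


if __name__ == "__main__":
    # Assumed figures: 99% per-call success, 0.8 s per call.
    for calls in (10, 20, 50):
        success = run_success_rate(0.99, calls)
        latency = run_latency_seconds(0.8, calls)
        print(f"{calls} calls: {success:.1%} run success, {latency:.0f} s latency")
```

Under these assumptions, a run of 50 calls completes cleanly only about 60% of the time, even though each individual call looks reliable.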

This roundtable is about seeing things clearly.

Topics we'll cover

  • Agentic inference economics. What to measure, what to optimize, what to leave alone.

  • Model routing in production. Large for planning, small for execution, and where it breaks.

  • Context window explosion across agent steps. KV-cache, summarization, memory architectures.

  • Orchestration at production load. What survives real traffic vs. what works in demos.

Facilitator

Vamshi Ambati · ML Leader, Omniva Neo Cloud (acq Predera), prev VISA. CMU PhD in AI. https://www.linkedin.com/in/vamshiambati/

Host

Thiyagarajan M · PeakInference Forum & Founder, Kalmantic Labs

https://www.linkedin.com/in/thiyagarajan/


Details

  • Date: Apr 24, 2026

  • Time: 2 hours + break

  • Location: San Francisco (shared on confirmation)

  • Size: 15 participants

  • Format: Invite-only roundtable.


Who this is for

You're running agents in production. You have engineers working on inference or orchestration. You've hit latency or reliability compounding firsthand.

This won't be useful if you're still prototyping or haven't shipped agents to real users.


To apply

  1. What does your agent architecture look like today?

  2. How many agent runs per day? Average LLM calls per run?

  3. Top 3 inference challenges right now

  4. One thing you want to walk away knowing

Organized by PeakInference Forum. To learn more, visit peakinference.org
