

Agent Evals: The Truth Machine w/ Composio, Braintrust, Fireworks, and Replit
How leading AI teams test, trace, and trust agents in production
This is the go-to panel for builders who care about evaluating agents that are headed into production, not just building impressive demos.
Join Fireworks, Braintrust, Replit, and Composio for a conversation on agent evals, reliability, observability, and the new infrastructure needed to make AI agents trustworthy in the real world.
Time & Location
Time: Tuesday, June 23rd, 6:30 PM - 8:30 PM
Location: Notion HQ, SF
Featuring
Ankur Goyal, CEO of Braintrust
Soham Ganatra, CEO of Composio
Dima Dzhulgakov, CTO of Fireworks AI
Cohosts
Braintrust
Braintrust is the AI observability and evals platform for teams building production AI systems. It helps builders trace real-world behavior, run evaluations, and catch regressions before they reach users.
Composio
Composio is the agent action harness, empowering agents to take actions across 1,000+ apps and 10,000+ tools, spanning APIs, auth, and real-world workflows. It helps developers move agents from isolated demos to systems that can actually take reliable action.
Fireworks AI
Fireworks AI provides fast, scalable infrastructure for serving, fine-tuning, and deploying generative AI models. For agent builders, Fireworks helps teams optimize model performance, latency, and reliability in production environments.
Replit
Replit is an AI-powered software creation platform where developers and non-developers can build, ship, and iterate on apps quickly. Its work on coding agents makes it a key voice in understanding how agents plan, write code, use tools, and recover from mistakes in real-world development.