

2026: Evals Are The New QA
Building Reliable AI: Evals, Ground Truth, and Production Readiness
Description
We’re hosting a small, curated gathering of AI and engineering leaders in San Francisco.
This will be a focused presentation + panel discussion on how enterprise teams are building evaluation infrastructure for production AI systems.
We’ll cover:
Designing ground truth for real-world use cases
Building eval systems that actually reflect production behavior
Maintaining data quality at scale
Lessons learned from deploying and iterating on AI systems in production
It’s a small room of practitioners working on similar problems, with time for open discussion and conversation after the panel.
Who this is for
Senior SWEs / ICs and leaders working on applied AI / ML systems
Teams building or scaling LLM-powered products
Folks thinking deeply about evals, reliability, and data quality
Note
This is an event with limited capacity. Requests to join are reviewed to keep the room highly relevant for attendees.