

Defining Agent Quality for AI Engineers & PMs
Most AI agent projects don't fail because the technology doesn't work. They fail because teams have no systematic way to define, measure, or prove quality before production finds the gaps.
So it's time to fix that.
In this webinar, Manouk Draisma (LangWatch CEO) and Rogerio Chaves (LangWatch CTO) will walk through a practical framework for defining agent quality from the ground up — before a single eval is written.
By the end, you'll understand:
Why shipping on intuition creates failures that are hard to reproduce and harder to fix
How to specify agent behavior before writing code — what it should do, what it must never do, and how it should handle edge cases
Who owns quality in practice and what the collaboration between engineers, domain experts, and product actually looks like
How to build an evaluation framework using deterministic, model-based, and domain-specific grading (a small illustrative sketch follows this list)
How to set a launch threshold that gives your team confidence to ship
How production failures should continuously sharpen your quality definition over time
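To give a flavor of what those three grading styles can look like in code, here is a minimal, hypothetical Python sketch. It is not LangWatch's API and not the framework presented in the webinar; every function name, rule, and threshold below is an illustrative assumption.

```python
# Hypothetical sketch of three grading styles; none of these names are LangWatch APIs.

def grade_with_llm(prompt: str) -> float:
    """Placeholder for an LLM-as-judge call; wire this to your provider's SDK."""
    raise NotImplementedError

def deterministic_grade(response: str) -> bool:
    # Deterministic: an exact, repeatable rule the agent must never break,
    # e.g. never promise a refund outside the documented policy.
    return "guaranteed refund" not in response.lower()

def model_based_grade(question: str, response: str) -> float:
    # Model-based: an LLM judge scores a quality with no exact rule,
    # e.g. helpfulness on a 0-to-1 scale.
    prompt = f"Question: {question}\nAnswer: {response}\nScore helpfulness from 0 to 1:"
    return grade_with_llm(prompt)

def domain_specific_grade(response: str) -> bool:
    # Domain-specific: a rule only your domain experts can write,
    # e.g. any answer that mentions pricing must cite the current price list.
    text = response.lower()
    return "pricing" not in text or "price list" in text

if __name__ == "__main__":
    answer = "Our pricing includes a guaranteed refund on every plan."
    print("deterministic:", deterministic_grade(answer))      # False: forbidden promise
    print("domain-specific:", domain_specific_grade(answer))  # False: no price list cited
```

A launch threshold could then be expressed on top of graders like these, for example "every deterministic and domain-specific check passes, and the model-based score stays above an agreed bar on the eval set" (the exact bar is something your team would set, not a prescription).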
Whether you're building your first production agent, struggling with regressions after every model update, or trying to establish a quality baseline your team can actually trust — this session will give you a concrete process to move from intuition to systematic quality.