

Presented by
Braintrust
Braintrust is the AI observability platform for shipping quality AI products.
188 Going
Online workshop: Build evals from real production data
About Event
Production traces capture where your AI falls short and what users are trying to do. Building evals from that data is how you catch failures earlier and make better calls about what ships next.
In this session, Amanda Gilbert shows how to take the patterns Braintrust surfaces automatically, turn them into a labeled eval dataset, and run the same workflow every time a new pattern shows up.
What you'll learn
How Braintrust groups production traces into named failure patterns you can act on
How to filter a failure cluster into a labeled eval dataset
How to write an eval that targets a specific failure pattern and validate that the fix held
How to run a repeatable diagnosis-to-eval workflow in Braintrust