Presented by
Braintrust
Braintrust is the AI observability platform for shipping quality AI products.
188 Going

Online workshop: Build evals from real production data

Zoom
About Event

Production traces capture where your AI falls short and what users are trying to do. Building evals from that data is how you catch failures earlier and make better calls about what ships next.

In this session, Amanda Gilbert shows how to take the patterns Braintrust surfaces automatically, turn them into a labeled eval dataset, and run the same workflow every time a new pattern shows up.

What you'll learn

  • How Braintrust groups production traces into named failure patterns you can act on

  • How to filter a failure cluster into a labeled eval dataset

  • How to write an eval that targets a specific failure pattern and validate that the fix held

  • How to run a repeatable diagnosis-to-eval workflow in Braintrust
