Presented by
Braintrust
The evals and observability platform for building reliable AI agents.
Measure what matters: Intro to AI evals for common use cases

Zoom
About Event

This session will break down the basics of AI evals through practical examples suitable for both AI engineers and PMs.

We'll work through how to evaluate three common use cases:

  • Customer support agent: "Is my AI support agent ready for customers?"

  • Content/code generation: "How do I know if my AI's output is actually good?"

  • Prompt optimization and model testing: "Which version of my AI setup works better?"

We'll also cover how to write good scoring functions and manage datasets. No prior evaluation experience is required, and the approaches are framework- and model-agnostic, so they work with any AI application.
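To preview the kind of scoring function the session covers, here is a minimal, hedged sketch in plain Python. It is not Braintrust-specific: the function names (`exact_match`, `keyword_coverage`) and signatures are illustrative assumptions, but they follow the common pattern of a scorer that takes a model's output (plus some reference) and returns a score between 0 and 1.

```python
# Illustrative scoring functions (hypothetical names, not a specific SDK's API).
# A scorer typically compares the model's output against a reference
# and returns a float in [0.0, 1.0].

def exact_match(output: str, expected: str) -> float:
    """1.0 if the output matches the expected answer, ignoring case and whitespace."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def keyword_coverage(output: str, required_keywords: list[str]) -> float:
    """Fraction of required keywords that appear in the output (case-insensitive)."""
    if not required_keywords:
        return 1.0
    found = sum(1 for kw in required_keywords if kw.lower() in output.lower())
    return found / len(required_keywords)

# Example: scoring a support-agent reply against required talking points.
score = keyword_coverage(
    "You can reset your password from the account settings page.",
    ["reset", "password", "settings"],
)
```

Simple deterministic scorers like these work for checks with clear right answers; for open-ended outputs (e.g. content generation), the session's use cases call for richer judgments, such as LLM-based grading.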
