

Eval Engineering for AI Developers - Lesson 3: Failure Analysis
Learn Eval Engineering in this free, 5-part, hands-on course.
90% of AI agents never make it to production. The biggest reason is that the AI engineers building these apps don't have a clear way to evaluate whether their agents are doing what they should do, or to use the results of that evaluation to fix them.
In this course, you will learn all about evals for AI applications. You'll start with some out-of-the-box metrics and the basics of evals, then move on to observability for AI apps, analyzing failure modes, defining custom metrics, and finally using all of these across your whole SDLC.
This will be hands-on, so be prepared to write some code, create some metrics, and do some homework!
In this third lesson, you will:
Learn the process for finding failures in your AI applications
Build out rubrics for identifying failure cases (see the sketch after this list)
Learn how to group failure cases into themes that can be used for building evals
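To make the rubric idea concrete, here is a minimal sketch of what a rubric-driven grader could look like in Python, assuming you have the OpenAI SDK installed and an API key set. The rubric criteria, the grade_trace helper, and the model name are all hypothetical illustrations, not the lesson's actual code.

```python
# A minimal, hypothetical sketch of a failure-analysis rubric.
# The criteria, trace format, and helper names are illustrative only,
# not the code used in the lesson.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A rubric is just a named set of yes/no questions to ask about each trace.
RUBRIC = {
    "wrong_tool_call": "Did the agent call a tool that does not match the user's request?",
    "hallucinated_fact": "Does the response state something unsupported by the retrieved context?",
    "ignored_instruction": "Did the agent ignore an explicit instruction in the prompt?",
}

def grade_trace(trace: str) -> dict[str, bool]:
    """Ask an LLM judge each rubric question about one trace and record pass/fail."""
    results = {}
    for failure_mode, question in RUBRIC.items():
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Answer strictly YES or NO."},
                {"role": "user", "content": f"Trace:\n{trace}\n\nQuestion: {question}"},
            ],
        )
        answer = response.choices[0].message.content.strip().upper()
        results[failure_mode] = answer.startswith("YES")
    return results
```

Grading a batch of traces this way gives you a count per failure mode, which is the raw material for grouping failures into the themes you'll later turn into evals.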
Prerequisites:
A basic knowledge of Python
Access to an OpenAI API key
A free Galileo account (we will be using Galileo as the evals platform)
Future lessons
Lesson 4: https://luma.com/x2ztpa4f
Lesson 5: https://luma.com/esoi6izo