Presented by LangWatch

Automating Evals & Scenarios with Claude Code, Skills & LangWatch MCP

About Event

Building reliable AI products requires more than prompts and models—it requires systematic testing, evaluation, and iteration. In this webinar, Rogeiro and Sergio will walk through how teams can automate evaluations and scenario testing using Claude Code, Skills, and LangWatch MCP to build more robust AI systems.

The session will explore how both engineers and product managers can design, automate, and scale evaluation workflows that continuously validate AI behavior across real-world scenarios. You'll see how developer tools like Claude Code and Skills can be combined with LangWatch MCP to simulate agent behavior, test edge cases, and monitor performance before releasing AI features to users.
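
As a rough illustration of the kind of workflow the session covers, the sketch below shows an automated scenario test for an agent. It is a hypothetical example, not the LangWatch API or the exact setup shown in the webinar: run_agent is a stand-in for the agent under test, and the scenarios are placeholders for the simulated user behavior and edge cases you would define.

# Hypothetical sketch of an automated scenario test for an AI agent.
# run_agent and the scenario data are placeholders, not the LangWatch API;
# the webinar covers driving this kind of loop with Claude Code, Skills,
# and the LangWatch MCP server.
import pytest


def run_agent(user_message: str) -> str:
    """Stand-in for the agent under test; replace with a real agent call."""
    if "#" in user_message:
        return "Your refund has been started. Next, you'll receive a confirmation email."
    return "Could you share your order number so I can look into the refund?"


SCENARIOS = [
    # (scenario name, simulated user message, fragments the reply must contain)
    ("refund_happy_path", "I want a refund for order #1234", ["refund"]),
    ("edge_case_missing_order", "Refund me", ["order number"]),
]


@pytest.mark.parametrize("name,user_message,must_contain", SCENARIOS)
def test_agent_scenario(name, user_message, must_contain):
    reply = run_agent(user_message)
    for fragment in must_contain:
        assert fragment.lower() in reply.lower(), (
            f"scenario {name!r}: expected {fragment!r} in the agent reply"
        )

Running this with pytest turns each scenario into a repeatable check that can be re-run on every change, which is the kind of loop the speakers automate further with Claude Code and LangWatch MCP.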

Together, Rogeiro and Sergio will demonstrate practical approaches for turning evaluation from a manual process into an automated system that supports faster experimentation, safer releases, and better AI product quality.

What you'll learn

  • How to automate AI evaluations using Claude Code and LangWatch

  • How to design realistic scenarios and simulations for testing AI systems

  • How engineers can integrate evals into their development workflow

  • How product managers can define evaluation criteria and measure AI performance (a minimal sketch of such criteria follows this list)

  • How LangWatch MCP enables scalable testing and monitoring of AI behavior
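
To make the evaluation-criteria point concrete, here is one hypothetical way criteria could be expressed as simple, declarative checks that run on every agent response. The names and thresholds are illustrative only, not LangWatch's built-in evaluators.

# Hypothetical sketch: evaluation criteria as declarative checks on agent replies.
# Criterion names and thresholds are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    name: str
    check: Callable[[str], bool]


CRITERIA = [
    Criterion("mentions_next_step", lambda reply: "next" in reply.lower()),
    Criterion("reasonable_length", lambda reply: len(reply.split()) <= 200),
    Criterion("no_internal_ids_leaked", lambda reply: "ticket-id" not in reply.lower()),
]


def evaluate(reply: str) -> dict[str, bool]:
    """Score one agent reply against every criterion."""
    return {criterion.name: criterion.check(reply) for criterion in CRITERIA}


if __name__ == "__main__":
    sample = "Your refund has been started. Next, you'll receive a confirmation email."
    print(evaluate(sample))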

Whether you're building AI-powered products, managing AI features, or developing agent-based systems, this session will provide practical insights into creating repeatable evaluation workflows that help teams ship AI with confidence.
