

Datadog Workshop: Evaluating and Iterating on AI Agents
Join Datadog for a hands-on workshop on how to test and monitor LLM applications and agentic systems. In this three-hour session, we will cover the full lifecycle of development, from early prototyping through monitoring in production.
The main focus will be on experimentation, where you will learn how to run structured tests, compare prompt and model parameters, and understand trade-offs in cost, latency, and reliability.
Participants will work through the key stages of the LLM development workflow:
Monitor: Trace prompts, responses, and agent steps to see how systems behave. Track latency, tokens, and errors to find bottlenecks (a minimal instrumentation sketch follows this list).
Iterate: Test prompts, models, and configurations to compare accuracy, cost, and performance, using real traces from production.
Evaluate: Run built-in and custom checks to catch hallucinations, drift, unsafe responses, PII leaks, and injection attempts.
Optimize: Select the best setup by weighing accuracy, speed, cost, and security across models and prompts.
Unify: Connect LLM and agent behavior to backend services, infra, and user activity for full-stack visibility.
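To give a feel for the monitoring lab, here is a minimal sketch of the kind of instrumentation it builds on, assuming Datadog's LLM Observability SDK in the ddtrace Python library (LLMObs and its decorators). The app name, model, and helper function below are illustrative placeholders, not the workshop's exact code, and the SDK interface may differ from this sketch.

```python
# Minimal sketch: tracing an agent workflow and an LLM call with Datadog
# LLM Observability (ddtrace). App and model names are placeholders.
from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.decorators import llm, workflow

# Enable LLM Observability; assumes DD_API_KEY / DD_SITE are set in the
# environment or a local Datadog Agent is available.
LLMObs.enable(ml_app="support-agent")

@llm(model_name="gpt-4o-mini", model_provider="openai")
def answer(question: str) -> str:
    # Call your model provider here; this stand-in keeps the sketch runnable.
    response = "(model response placeholder)"
    # Attach the prompt and completion to the LLM span for the trace view.
    LLMObs.annotate(input_data=question, output_data=response)
    return response

@workflow
def handle_ticket(ticket_text: str) -> str:
    # Each decorated function becomes a span, so latency, token counts,
    # and errors for every step land in one end-to-end trace.
    return answer(ticket_text)

if __name__ == "__main__":
    handle_ticket("How do I reset my password?")
```

Traces like this are what the monitoring lab inspects to find latency bottlenecks, token-heavy calls, and failed agent steps.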
You’ll leave with practical techniques to debug issues, validate improvements, prevent regressions, and make evidence-based design choices that improve reliability, cost efficiency, and user experience across LLM applications and agentic workflows.
Who should attend:
This workshop is designed for AI engineers, ML practitioners, data scientists, and developers building and running LLM applications and agentic systems.
Details:
Date/Time: Doors open at 5:30 PM; content runs 6:00 – 9:00 PM
Format: Lecture + hands-on labs (bring your laptop)
Level: Advanced, builder-focused
Food & beverages provided
Schedule
5:30 PM – Doors open for food, beverages, & conversations
6:00 PM – Welcome, intro to LLM apps and agentic systems
6:30 PM – Hands-on lab: Monitoring and troubleshooting agentic AI applications
7:15 PM – Break for food, beverages, & conversations
7:30 PM – Hands-on lab: Evaluating and optimizing model configurations
8:15 PM – ML engineer case study, Q&A
8:45 PM – Wrap up, resources, and next steps
Speakers
Charles Jacquet
Charles Jacquet is a Product Manager at Datadog, where he leads LLM and AI Agent Evaluation. He focuses on helping teams run experiments, build datasets, and assess the reliability and safety of AI applications. Before Datadog, Charles was a machine learning engineer and later a product manager at Gorgias, where he launched an LLM-powered QA system. His background bridges applied ML and product development, giving him a practical understanding of the challenges in building and scaling AI systems.