

Agentic + AI Observability Meetup SF
Join us for the Agentic + AI Observability meetup on Tuesday, February 17, from 5pm to 8pm PST at the Databricks SF office: an evening focused on agentic architectures and AI observability, covering how to design, ship, and monitor AI agents that actually work in production.
This meetup is built for engineers, ML practitioners, and AI startup founders who are already experimenting with agents (or planning to) and want to go deeper into the tech. We’ll cover real-world patterns, failure modes, and tooling for building reliable agentic systems in the broader open-source ecosystem.
Whether you’re at an early-stage startup or an established company, if you care about getting AI agents into production and keeping them healthy, this meetup is for you.
Why you should attend
See real architectures: Learn how teams are designing agentic systems on top of data/feature platforms, retrieval, and tools, not just calling a single LLM endpoint.
Learn how to observe what agents are doing: Go beyond logs and dashboards to structured traces, evals, and metrics that help you understand and improve agent behavior over time.
Get hands-on with MLflow and observability tools: Watch live demos of MLflow, tracing integrations, and evaluation workflows for agentic systems (see the short tracing sketch after this list).
Connect with other builders: Meet engineers, founders, and practitioners working on similar problems, swap patterns, and find collaborators and potential hires.
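Curious what those structured traces look like? Here is a minimal sketch, assuming MLflow's tracing API (the @mlflow.trace decorator and mlflow.start_span); the retrieval and LLM helpers are hypothetical stubs, not code from the talks.

```python
# Minimal tracing sketch: one trace per question, with retrieval and
# generation recorded as child spans. retrieve_docs/call_llm are stubs.
import mlflow

mlflow.set_experiment("agent-observability-demo")

def retrieve_docs(question: str) -> list[str]:
    return ["doc about MLflow tracing"]  # stand-in for a real retriever

def call_llm(question: str, docs: list[str]) -> str:
    # stand-in for a real LLM call
    return f"Answer to {question!r} grounded in {len(docs)} doc(s)"

@mlflow.trace  # captures inputs, outputs, timing, and nested spans
def answer_question(question: str) -> str:
    with mlflow.start_span(name="retrieval") as span:
        docs = retrieve_docs(question)
        span.set_attributes({"num_docs": len(docs)})
    with mlflow.start_span(name="generation"):
        return call_llm(question, docs)

print(answer_question("How do I trace an agent?"))
# The resulting trace is browsable in the MLflow UI under the experiment above.
```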
Agenda
5:00pm: Registration/Mingling
6:00pm: Welcome Remarks by Jules Damji, Staff Developer Advocate, Databricks
6:15pm: Talk #1 - Building Trustworthy, High-Quality AI Agents with MLflow
6:45pm: Talk #2 - Evaluating AI in Production: A Practical Guide
7:15pm: Mingling with bites + dessert
8:00pm: Night Ends
Speakers
Staff Software Engineer, Databricks
Head of Data & Product Growth, Braintrust
Session Descriptions
Building Trustworthy, High-Quality AI Agents with MLflow
Building trustworthy, high-quality agents remains one of the hardest problems in AI today. Even as coding assistants automate parts of the development workflow, evaluating, observing, and improving agent quality is still manual, subjective, and time-consuming.
Teams spend hours “vibe checking” agents, labeling outputs, and debugging failures. But it doesn’t have to be this slow or tedious. In this session, you’ll learn how to use MLflow to automate and accelerate agent observability for quality improvement, applying proven patterns to deliver agents that behave reliably in real-world conditions.
Key Takeaways and Learnings
Understand the agent development lifecycle and where observability fits into it
Use MLflow's key components across the development lifecycle to improve observability: tracking and debugging, evaluation with MLflow judges, and a prompt registry for versioning
Select the right judges from a suite of 60+ built-in and custom MLflow judges for evaluation, and use Judge Builder for automatic evaluation (a small evaluation sketch follows this list)
Use the MLflow UI to compare and understand evaluation scores and metrics
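For a taste of what that evaluation loop can look like, here is a minimal sketch using mlflow.evaluate with one of MLflow's built-in LLM-judged metrics (answer_relevance). The tiny dataset is made up for illustration, a judge model API key must be configured, and exact argument names can vary across MLflow versions; treat it as a sketch rather than the speaker's exact workflow.

```python
# Sketch of LLM-as-judge evaluation over a static dataset of agent outputs.
# The data is illustrative; answer_relevance calls out to a judge model,
# so an OpenAI (or other configured) API key is required.
import mlflow
import pandas as pd
from mlflow.metrics.genai import answer_relevance

eval_df = pd.DataFrame(
    {
        "inputs": ["What does MLflow tracing capture?"],
        "outputs": ["It records spans with inputs, outputs, and latency."],
    }
)

with mlflow.start_run():
    results = mlflow.evaluate(
        data=eval_df,
        predictions="outputs",               # column holding the agent's answers
        extra_metrics=[answer_relevance()],  # built-in LLM judge
    )
    print(results.metrics)  # aggregate scores, also visible in the MLflow UI
```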
Evaluating AI in Production: A Practical Guide
Evaluations are essential for shipping reliable AI products, but many teams struggle to move beyond manual testing. In this talk, I'll walk through how to build a production-ready evaluation framework — from choosing the right metrics and creating effective test cases to setting up continuous evaluation pipelines that catch issues before your users do. You'll walk away with practical patterns you can apply right away.
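As a rough preview of that framework (not the speaker's or any vendor's API), here is a tiny, framework-agnostic sketch of a continuous evaluation gate: fixed test cases, a scoring function, and a threshold that fails the run before a regression reaches users. Every name in it is a hypothetical placeholder.

```python
# Hypothetical, framework-agnostic sketch of a continuous evaluation gate.
# Swap run_agent/score_case for your own agent call and metric (or a hosted
# eval platform); the point is the structure: cases -> scores -> pass/fail.
import sys

TEST_CASES = [
    {"input": "Summarize our refund policy", "must_mention": "30 days"},
    {"input": "Which plans support SSO?", "must_mention": "Enterprise"},
]

def run_agent(prompt: str) -> str:
    # Placeholder for your real agent / LLM call.
    return f"(stub answer for: {prompt})"

def score_case(case: dict, output: str) -> float:
    # Simple keyword check; in practice this could be an LLM judge or exact match.
    return 1.0 if case["must_mention"].lower() in output.lower() else 0.0

def main(threshold: float = 0.9) -> None:
    scores = [score_case(c, run_agent(c["input"])) for c in TEST_CASES]
    accuracy = sum(scores) / len(scores)
    print(f"eval accuracy: {accuracy:.2f} over {len(scores)} cases")
    if accuracy < threshold:
        sys.exit(1)  # fail the CI job so the regression never ships

if __name__ == "__main__":
    main()
```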