BLISS x Semrush Workshop: Operating Agentic AI in Production

Hosted by BLISS Berlin

Berlin, Germany

About Event

We are excited to invite you to an interactive workshop hosted by BLISS x Semrush and led by Saeid Nobakht, Lead AI Engineer at Semrush, focused on bringing agentic AI into production.

Title: Operating Agentic AI in Production: Evaluation, Observability, and Reliability
📅 Date: 16th April
🕕 Time: 18:00
📍 Location: Marchstrasse [Full location at registration]

The session will last around 2 hours, followed by a networking session with Semrush and fellow AI enthusiasts (and free pizza 🍴😋🍕!). Bring your laptop!

Please arrive a bit early to get settled :)

Schedule:

17:45 - Doors Open
18:00 - Start of Workshop
20:00 - Networking + Catering

Abstract: Building reliable agentic systems requires more than evaluating individual LLM outputs; it demands a systems-level approach to testing, observability, and operations. This session covers how to design evaluation frameworks for multi-step, tool-using agents; how to instrument and trace agent decisions for debugging and audit; and how to recognize and respond to common production failure modes. A real-world case study ties the concepts together, with an optional hands-on segment where participants analyze a sample trace to identify evaluation gaps and operational risks.

Session Outline

The Production Reality of Agentic Systems
- Why evaluating LLM outputs alone is insufficient for evaluating the whole system
- Challenges of multi-step reasoning, tool usage, and distributed systems
- Common failure modes observed in real deployments
Evaluation Framework for Agentic Systems
- Task-level success metrics and end-to-end completion rates
- Tool invocation correctness and parameter extraction
- Policy and safety violation detection
- Latency and cost constraints
- Regression testing for agent workflows
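
To give a feel for the evaluation topics above before the session, here is a minimal Python sketch. It is not taken from the workshop material: the `AgentRun` and `ToolCall` schema, the metric names, and the `regressions` helper are all assumptions made for illustration. It computes an end-to-end completion rate, a strict tool-call accuracy (same tools, same order, same parameters), and a latency/cost budget rate, then compares the report against a stored baseline as a simple regression check.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """A single tool invocation with its extracted parameters (hypothetical schema)."""
    name: str
    args: dict


@dataclass
class AgentRun:
    """One recorded agent run for a test case (hypothetical schema)."""
    task_id: str
    completed: bool  # did the agent finish the task end to end?
    tool_calls: list[ToolCall] = field(default_factory=list)
    latency_s: float = 0.0
    cost_usd: float = 0.0


def evaluate(runs, expected_calls, max_latency_s, max_cost_usd):
    """Aggregate task-level metrics over a batch of runs.

    expected_calls maps task_id -> the list of ToolCall objects we expect.
    """
    n = len(runs)
    if n == 0:
        raise ValueError("no runs to evaluate")
    completed = sum(r.completed for r in runs)
    # Dataclass equality compares name and args, so this is a strict match.
    tools_ok = sum(r.tool_calls == expected_calls[r.task_id] for r in runs)
    within_budget = sum(
        r.latency_s <= max_latency_s and r.cost_usd <= max_cost_usd for r in runs
    )
    return {
        "completion_rate": completed / n,
        "tool_accuracy": tools_ok / n,
        "within_budget_rate": within_budget / n,
    }


def regressions(report, baseline, tolerance=0.02):
    """Return the metric names that dropped more than `tolerance` below baseline."""
    return sorted(m for m, old in baseline.items() if report[m] < old - tolerance)
```

In a CI job, a non-empty result from `regressions` could fail the build. A strict tool-call match is the simplest choice; real workflows may need a looser comparison, for example one that ignores call order.
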
Observability and Tracing
- Designing structured agent traces
- Capturing prompts, tool calls, outputs, and decision points
- Reconstructing agent decisions for audit and debugging
- Detecting loop behavior and model drift
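
As a rough illustration of the tracing topics above, the following Python sketch records each agent step as a structured event and scans a trace for repeated tool calls. The `TraceEvent` fields, the `kind` values, and the `detect_tool_loop` threshold are assumptions for this example, not the format used in the workshop.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class TraceEvent:
    """One step in an agent run (hypothetical schema)."""
    run_id: str
    step: int
    kind: str      # e.g. "prompt", "tool_call", "output", "decision"
    payload: dict
    ts: float = field(default_factory=time.time)

    def to_json(self) -> str:
        # One JSON object per line is easy to ship to a log pipeline.
        return json.dumps(asdict(self), sort_keys=True)


def detect_tool_loop(events, max_repeats=3):
    """Return the (tool, args) signature that appears in more than
    `max_repeats` consecutive tool calls, or None if there is no such run.

    Events of other kinds between the tool calls are ignored.
    """
    last_sig = None
    streak = 0
    for event in events:
        if event.kind != "tool_call":
            continue
        sig = (
            event.payload.get("tool"),
            json.dumps(event.payload.get("args"), sort_keys=True),
        )
        if sig == last_sig:
            streak += 1
        else:
            last_sig, streak = sig, 1
        if streak > max_repeats:
            return sig
    return None
```

Serializing the arguments with `sort_keys=True` makes two calls with the same parameters in a different key order count as identical.
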
Incident Patterns and Operational Playbooks
- Runaway tool loops
- Partial failures and compensating actions
- Permission mismatches
- Vendor rate limits and retry strategies
- Safe fallback and graceful degradation
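
One common way to handle vendor rate limits is retrying with exponential backoff and jitter, then degrading to a fallback once retries are exhausted. The Python sketch below shows that pattern under stated assumptions: `RateLimitError` is a hypothetical stand-in for whatever exception a vendor SDK raises, and `call_with_retry` with its default values is illustrative, not a playbook from the workshop.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a vendor's rate-limit error."""


def call_with_retry(fn, *, max_attempts=4, base_delay_s=0.5, max_delay_s=8.0,
                    fallback=None, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with capped exponential backoff.

    If every attempt fails, return fallback() when a fallback is given
    (graceful degradation); otherwise raise RateLimitError.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                break  # no point sleeping after the final attempt
            cap = min(max_delay_s, base_delay_s * 2 ** attempt)
            # "Full jitter": a random delay up to the cap spreads out retries
            # from many clients that hit the limit at the same moment.
            sleep(random.uniform(0, cap))
    if fallback is not None:
        return fallback()
    raise RateLimitError(f"gave up after {max_attempts} attempts")
```

The bounded `max_attempts` is also what keeps a retry policy from turning into a runaway loop of its own. The `sleep` parameter exists so tests can inject a recorder instead of actually waiting.
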
Case Study Walkthrough
- How the evaluation was implemented in a real-world workflow
- What broke in production and why
- How observability improvements increased reliability

Who is this event for?

This workshop is open to anyone who is curious about agentic AI and wants to build with it. If you have a technical background and can work with Python, you're good to go!

We are BLISS e.V., Berlin’s AI community connecting like-minded individuals passionate about machine learning and data science. Our BLISS Workshops connect students and young professionals with industry partners, offering an inside look into how machine learning is applied in real-world settings - from research and development to deployment.

BLISS Website: https://bliss.berlin

BLISS YouTube: https://www.youtube.com/@bliss.ev.berlin
