Observability & Evaluation for LLM Apps and Agentic AI with Langfuse
Shipping an LLM app is easy. Knowing whether it's actually working is hard.
Unlike traditional software, LLM systems and agentic pipelines can fail silently — running without errors while producing wrong or degraded answers. With agents making multi-step decisions across tools and APIs, a single bad output can cascade through your entire system before anyone notices.
This hands-on workshop shows you how Langfuse gives your team the visibility and control to ship AI with confidence. You'll go from a blind prototype to a fully observable system — seeing exactly what your app costs, where quality drops, and which prompt changes actually improve things.
For agentic systems, Langfuse traces every step of an agent's decision-making. When something goes wrong, you'll know exactly why — rather than losing hours to guesswork. The Prompt Playground and LLM-as-a-judge evals mean faster iteration with less manual effort.
What you'll learn
How to instrument LLM apps and agentic pipelines with Langfuse traces
How to measure cost, latency, and quality across your system
How to run structured evals — including LLM-as-a-judge — to know if your changes are actually working
How tools like Cursor and Claude Code can tap into Langfuse mid-build and keep iterating without waiting on you
Prerequisites
Python 3.10+ and a laptop. An OpenAI API key is optional. Langfuse account (free tier) set up on the day.
This workshop is included with your DataEngBytes Sydney 2026 conference ticket. If you don't have a ticket yet, grab one at dataengbytes.com/2026/sydney.
About your host
Muhammad Ali is an AI Engineer and Solutions Architect at ClickHouse, and the Langfuse Lead for the APJ region. Over the past three years he has designed AI applications for Apple, Atlassian, and Amazon, with a focus on the AI development lifecycle and on making LLM systems observable and reliable. He was previously Principal Analytics Tech Lead (APJ) at AWS.
Organiser: DataEngBytes
Part of: DataEngBytes Sydney 2026, July 28–29, The Collider, 477 Pitt Street, Tech Central
Host: Muhammad Ali (ClickHouse / Langfuse APJ)