Presented by

Cloud Shuttle is a Sydney-based data and AI consultancy founded by Peter Hanssens. Known for rapid delivery, vendor-agnostic advice, and deep hands-on expertise, Cloud Shuttle serves SMBs, startups.

Hosted By

AI

W2: Mastering AI/LLM Evaluations

Name: W2: Mastering AI/LLM Evaluations
Start: 2026-06-23T09:00:00.000+10:00
End: 2026-06-23T17:00:00.000+10:00
Location: Stone & Chalk Tech Central

Past Event

About Event

You've shipped an AI feature. How do you know it's working? How do you catch regressions before your users do? How do you compare two prompt versions without guessing?

Most teams skip evaluation entirely. It's why so many AI systems feel unreliable. This workshop fixes that.

What we'll cover:

→ Why standard metrics (BLEU, ROUGE, accuracy) break for generative AI — and what to use instead

→ Building golden datasets: how to collect, curate, and version your test cases

→ Programmatic evals vs LLM-as-a-Judge — when to use each and how to combine them

→ Writing judge prompts that align with human judgment

→ A/B testing prompts and models at production scale

→ Catching regressions before they reach users

→ Offline vs online evaluation, tracing, and production monitoring

→ Tools: DeepEval, LangSmith patterns, Braintrust-style logging — all accessible, no enterprise budget required
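
To make the first few topics concrete, here is a minimal sketch of a programmatic eval run against a golden dataset, with a simple regression check between two prompt versions. It is an illustration only, not workshop material: `GoldenCase`, `pass_rate`, `is_regression`, and the two stand-in "models" are hypothetical names, and real LLM calls are replaced by plain functions so the example runs anywhere.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenCase:
    """One versioned test case (hypothetical structure)."""
    prompt: str
    must_contain: str  # substring a correct answer is assumed to include


def pass_rate(model: Callable[[str], str], cases: list[GoldenCase]) -> float:
    """Fraction of golden cases whose output contains the expected substring."""
    if not cases:
        raise ValueError("golden dataset is empty")
    passed = sum(
        case.must_contain.lower() in model(case.prompt).lower() for case in cases
    )
    return passed / len(cases)


def is_regression(baseline: float, candidate: float, tolerance: float = 0.0) -> bool:
    """True when the candidate scores worse than the baseline by more than tolerance."""
    return candidate < baseline - tolerance


# Stand-ins for two prompt versions (assumption: a real setup would call an LLM here).
def prompt_v1(question: str) -> str:
    answers = {"Capital of Australia?": "Canberra", "2 + 2?": "4"}
    return answers.get(question, "I don't know")


def prompt_v2(question: str) -> str:
    answers = {"Capital of Australia?": "Sydney", "2 + 2?": "The answer is 4"}
    return answers.get(question, "I don't know")


GOLDEN = [
    GoldenCase("Capital of Australia?", "Canberra"),
    GoldenCase("2 + 2?", "4"),
]

baseline = pass_rate(prompt_v1, GOLDEN)
candidate = pass_rate(prompt_v2, GOLDEN)

if is_regression(baseline, candidate):
    print(f"Regression: pass rate fell from {baseline:.0%} to {candidate:.0%}")
```

A substring check like this is the cheapest kind of programmatic eval. It suits answers with a single verifiable fact; open-ended outputs are where an LLM-as-a-Judge step would typically take over.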

You'll leave with:

A working eval suite — built during the workshop, directly applicable to your own AI project. Judge prompt templates, dataset structures, and a checklist for production evals.

Who this is for:

AI engineers, product teams shipping AI features, ML practitioners, and anyone who needs to know their AI system actually works reliably. Basic familiarity with LLM concepts helpful. Coding not required for most sessions — Colab notebooks provided.

📅 Tuesday 23 June 2026 🕘 9:00am – 5:00pm

📍 Stone & Chalk Tech Central, Haymarket

🎟 Early bird pricing closes 22 May — price increases after.

💬 DataEngBytes member? Use code DEB10 at checkout for 10% off.

Bring a laptop. Lunch and materials included.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 🗓 PART OF AI MASTERY WEEK — 22–26 JUNE

Mon 22 → W1: Prompt Engineering Mastery

Tue 23 → W2: Mastering AI/LLM Evaluations ← you're here

Wed 24 → W3: Building Practical AI Agents

Thu 25–Fri 26 → W4: Enterprise AI Architecture

🎁 Book all four and save 15%: AI Mastery Week Bundle

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Hosted by Peter Hanssens, founder of Cloud Shuttle and DataEngBytes — ANZ's largest data engineering community conference.

Location

Stone & Chalk Tech Central

Level 1/477 Pitt St, Haymarket NSW 2000, Australia
