EvalOps unfiltered #2: Evaluating LLM-based applications

Start: 2026-06-17T18:00:00.000+02:00
End: 2026-06-17T20:00:00.000+02:00
Location: Berlin, Germany

About Event

LLM applications and agents can behave differently after the slightest prompt tweak, context change, or input variation. If you're building anything real with LLMs, you already know the outputs can surprise you, and not in a good way. That's why you test. Full stop.

EvalOps Unfiltered is a practical event series for AI teams tackling the real-world challenges of evaluating LLM and agentic applications. Focused on the emerging field of EvalOps, it goes beyond benchmarks to address unpredictable model behavior, adversarial risks, and production readiness. In each session, speakers share what worked for them and what they learned the hard way, followed by breakout discussions and honest conversations about what truly works when deploying LLM apps and agents.

What to expect (on 17 June 2026):
- Doors open at 17:30.
- Event starts at 18:00.

🔧 Lightning talks from four builders presenting their evaluation & testing challenges:

- Matthias König, Founder & CEO @ KvJ Consulting: "How to build AI testing guidance that actually works (and is compliant)?"
- Rouven Glauert, Founder @ Lelia (prev. Senior Applied Scientist @ Parloa): "You rerun your simulations - your results don't hold. Now what?"
- Mirko Knaak, AI Leader and AI Advisor @ IAV GmbH: "AI drives outcomes – Safety & Security ensure they last"
- Neha Khalwadekar, AI Product Manager @ LCA: "Schrödinger's Eval: It Passes and Fails Until a User Opens It"

🧠 Breakout sessions where you'll dig deep into one challenge, discuss solutions, and share experiences with fellow builders.

🍺 Drinks & 🌮 snacks while the conversations continue!

No panels, no pitches: just builders sharing what's actually broken and collaborating on what might work. This isn't about theory. It's about the unglamorous, critical work of making LLM & agentic applications reliable for the real world.

Location: Berlin, Germany; more details upon registration.

Target Audience:

- AI engineers wrestling with evaluation pre-release
- Technical leads managing LLM-powered products
- Data scientists designing and fine-tuning LLM-based applications
- Product owners responsible for delivering reliable LLM-driven features

Please note: Attending the event is only possible upon confirmed registration.

Rhesis AI (www.rhesis.ai) proudly hosts this event in collaboration with AI NATION (https://www.ai-nation.de).
