

AI in Production Meetup @ GetYourGuide
⚡Call for Lightning Talks! We’re opening the stage for short, 5-minute lightning talks. Working on something exciting, experimenting with a new idea, or have lessons learned to share? Bring it along and take the mic — just let us know on arrival or reach out in advance.
💡 Before registering for this event, please read the note about event recording and photography at the end of the event details.
👉 Want to stay connected with us and get more insights about GetYourGuide? Check out our Linktree for all the essentials: dive into our tech blog, explore career opportunities, and respond to our speaker call for future events.
Welcome to the 9th AI in Production Meetup 🚀
What does it really take to run AI at scale—beyond the demos, the benchmarks, and the hype? Join us for an evening of honest, practical insights from teams shipping AI products to millions of users every day.
This edition zooms in on two of the hardest problems in large-scale discovery: ranking and evaluation. First, we’ll unpack how GetYourGuide tackles extreme item cold-start in a two-sided marketplace, balancing exploration, exposure, and revenue under tight latency budgets. Then, Delivery Hero will show how they use LLMs-as-judges to evaluate search relevance across 70 countries, dozens of languages, and millions of queries per day.
This meetup is built for ML engineers, data scientists, and AI practitioners working on production systems—but anyone curious about applied AI is more than welcome. Come for the talks, stay for the pizza, drinks, and conversations with people building AI that real users depend on.
Talk #1: Solving Marketplace Cold Start at Scale with Ranking
Theodore Meynard, Data Science Manager @ GetYourGuide, LinkedIn
Cold start cripples two‑sided marketplaces: new items lack behavioral signals and social proof, so ranking models under‑expose them, which delays the very signals needed to rank them well. This talk shares our journey to break this loop at GetYourGuide, a marketplace for travel experiences. We evolved our exploration/activation framework over the past three years with three complementary interventions: guaranteed exposure at strategic positions, a real‑time reranker to allocate that exposure efficiently under tight latency budgets, and guardrail boosting for unactivated items when primary assessment slots are empty.
The talk is a pragmatic case study: we’ll show how experiment‑led exploration shaped the system over the last 3 years. We will share what worked, what did not, and how we managed trade-offs between short-term revenue and long-term marketplace health. Attendees will leave with a blueprint for safely accelerating early traction in their own marketplaces, combining learning‑to‑rank with exposure guarantees without sacrificing overall business health.
Talk #2: Judging Search at Global Scale: Using LLMs to Evaluate Relevance Across Languages and Cultures
Aleksandra Kovachev, Senior Data Science Manager @ Delivery Hero, LinkedIn
Imagine the diversity of the data we work with at Delivery Hero. Every day, we deliver millions of orders across 70 countries on 4 continents, giving us a unique view into global food and grocery habits in dozens of languages. As the parent company of brands such as Talabat (MENA), Foodpanda (APAC), and Woowa (Korea), we operate many central services from Berlin. Search is one of the most impactful, driving a significant share of our GMV.
Ensuring highly relevant search results is critical, yet measuring relevancy in a fast-changing, multilingual environment is challenging. Manual evaluations have proven slow, inconsistent, and dependent on local expertise. With recent advances in LLMs—their broad knowledge, language coverage, and scalability—we shifted to using LLMs as judges of search results relevancy.
In this talk, we’ll show how we built a scalable LLM-as-a-Judge system for continuous relevancy evaluation, enhanced with human alignment, failure-mode detection, and feedback loops. This setup enables accurate relevance reporting across millions of search results for restaurants, shops, and groceries. We’ll dive into the challenges, key learnings, and frameworks behind the system. Our broader goal is to extend LLM-as-a-Judge to additional use cases—such as query tagging, classification, and translation—improving every step of the user’s search journey.
Schedule of the Meetup:
18:00 Doors open + Open Networking
*Please note that doors will be closing at 18:50.
18:30 Introduction to GetYourGuide
18:40 Talk #1: Solving Marketplace Cold Start at Scale with Ranking
19:10 Break
19:30 Lightning Talk Session
19:40 Talk #2: Judging Search at Global Scale: Using LLMs to Evaluate Relevance Across Languages and Cultures
20:15 Raffle (5 min) + Open Networking
21:00 End
ℹ️ The meetup begins at 18:00. Doors will remain open until 18:50, after which entry won’t be possible. We appreciate your understanding and look forward to welcoming you!
Answers for the curious:
Who should join the meetup?
This meetup is dedicated to anyone who works with, studies or is interested in ML and AI. It's a fun and informal environment where we can learn from each other's experiences.
Can I bring a plus one?
Sure, that's totally fine. Just make sure to register them as well.
Can I bring a dog?
Unfortunately, we don't allow dogs at this event.
For any questions, please don't hesitate to contact us.
Important note: Please be advised that this event will be recorded and photographed, and we will have a photographer on-site. If you prefer not to be included in any recordings or photographs, please do not hesitate to let us know during the event. Your comfort and privacy are important to us.