

Real-World Challenges in Geospatial Machine Learning
Join us at TomTom’s Amsterdam office on Thursday, July 9, for an evening dedicated to the practical challenges of building geospatial machine learning systems.
Machine learning is a powerful tool, but applying it to real-world geospatial data presents a unique set of challenges. In this meetup, we are bringing together research and engineering perspectives to discuss how to overcome these hurdles, design robust pipelines, and maintain data quality in a rapidly changing world.
Agenda
17:30 - 18:25: Welcome with food and drinks!
18:25 - 18:30: TomTom’s intro
18:30 - 19:00: Talk 1: Designing ML systems for a continuously changing world, or how does Automated Driving really work, by Ahmed Boudissa
19:00 - 19:35: Talk 2: 85 million matches, thousands of catches: active learning at production scale, by Haris Iqbal
19:35 - 19:45: Break
19:45 - 20:15: Talk 3, by Ioanna Micha & Nathalie Dees
20:15 - 21:00: Networking / drinks
Talk 1: Designing ML systems for a continuously changing world, or how does Automated Driving really work, by Ahmed Boudissa
Building machine learning systems is one thing. Building systems that remain accurate, fresh, and scalable while the real world changes every day is another challenge entirely.
In this talk, we'll explore how TomTom turns billions of raw observations from satellites, survey vehicles, connected devices, and onboard sensors into lane-level map products used by automated driving systems. Rather than focusing on a single model, we'll look at the broader system design challenges involved in combining multiple data sources, maintaining data quality, and keeping machine learning pipelines operating at global scale.
You'll see how foundation models help automate map production, how multi-source pipelines continuously update map content, and what it takes to transform fragmented real-world data into a reliable product that can be trusted by both drivers and automated systems.
What you'll learn:
• Why large-scale, real-world ML systems are fundamentally different from controlled benchmarks
• How multiple data sources can be combined into a single production workflow
• How machine learning and computer vision automate complex mapping tasks
• The challenges of maintaining freshness and quality at global scale
• Why production system design matters as much as model performance
Talk 2: 85 million matches, thousands of catches: active learning at production scale, by Haris Iqbal
Many machine learning problems look straightforward until you encounter the edge cases. The first 80% of accuracy often comes easily. The remaining 20% is where the most interesting engineering decisions begin.
In this talk, we'll explore a large-scale matching problem from digital map-making: connecting traffic signs to the correct road segments. While simple nearest-neighbour approaches solve most cases, real-world data introduces ambiguity, inconsistencies, and edge cases that quickly expose the limitations of naive solutions.
Using this problem as a case study, we'll examine how model selection, feature engineering, and active learning can be combined to build scalable systems that perform reliably in production. You'll see why we chose tree-based models over more complex alternatives, how active learning helped reduce labeling effort across 85 million candidate matches, and how model failures became one of our most valuable sources of improvement.
What you'll learn:
• Why seemingly simple ML problems become challenging at scale
• When classical machine learning can outperform more complex approaches
• How to design features and select models under real-world scalability constraints
• How active learning can dramatically reduce labeling effort
• Practical techniques for debugging and improving production ML systems
Talk 3: TBD, by Ioanna Micha & Nathalie Dees
TBD
Directions:
📍 De Ruijterkade 154, 1011 AC - Amsterdam
TomTom’s office is just a 10- to 15-minute walk (or a quick bus ride) from Amsterdam Central Station.