Presented by
LILT

Roundtable Dinner: AI Benchmarking Across Languages hosted by AI Circle & LILT

Register to See Address
New York, NY
About Event

A Deep Dive with Spence Green, LILT CEO, and AI Circle on Park Avenue.

As AI transitions from static chatbots to autonomous agents capable of multi-step reasoning and tool use, we have hit a critical wall: the English-Centric Evaluation Gap. Most multilingual benchmarks today are built on "translated" versions of English datasets, a process that introduces noise, hallucinations, and "translationese" that make tasks impossible for even the most capable agents to solve.

In this session, Spence will lead a technical discussion on AI benchmarking across languages and how LILT is redefining what it means to benchmark agentic performance at the frontier.

What We’ll Discuss:

  • The "Fluent yet Broken" Paradox: Why a translation can be grammatically perfect yet functionally flawed if tool behaviors, locale conventions, or cultural contexts are lost.

  • GAIA-v2-LILT: A breakdown of how re-auditing the GAIA benchmark recovered an average of +20.7 percentage points in measured performance—proving that current "capability gaps" are often just measurement errors.

  • Terminal-Bench & tau(3)-bench: Evaluating agentic coding and multi-turn customer support conversations in non-English environments.

  • Functional and Cultural Alignment: What are the key requirements and pitfalls when transforming English benchmarks into other languages?


The Experience

We are pairing a seasonal, seafood-forward menu designed by Executive Chef Andy Kitko with a structured technical "Engagement."

6:00 PM The Warm Up - Cocktails, arrivals, and networking.
6:30 PM The Thesis - Opening note on AI benchmarking.
7:00 PM Dinner - A curated roundtable. "Bouncers" will be served with each course to drive deep-dive debate.

About LILT:
LILT is the only AI-native multilingual solution for frontier AI data and enterprise localisation. We help make your data and content multilingual: faster, more accurately, securely, and at scale. Specialising in language-grounded alignment and multimodal evaluation, we provide research-grade expertise to govern AI systems. Unlike crowdsourced options, our curated expert network and continuous quality calibration provide high-fidelity signals to build reliable models ready for global deployment. Learn more about LILT.

AI Circle is a community of practitioners across research and deployment who are advancing the frontier of AI. Our members come from organisations ranging from frontier AI-native startups to large enterprises. We host events at our chapters in SF, NYC, Seattle, London, Paris, MIT and Stanford. Learn more about AI Circle.
