AI Evals for Government

GovTech Deutschland

Berlin, Germany

Past Event

About Event

AI evaluations have become a critical foundation for the responsible use of AI in government, but existing approaches are struggling to keep up. From questions of validity and robustness to the gap between benchmarks and real-world public sector needs, AI evals are under increasing pressure.

In this event, co-hosted by the Centre for Digital Governance at the Hertie School, GovTech Deutschland, and the Weizenbaum Institute, researchers and practitioners explore the practical role AI evaluations play in government today, what is going wrong with current evaluation practices, and how those practices can evolve to meet the strict requirements of the public sector.

Schedule

Doors open: 18:00

Program start: 18:30
- Presentation: Möve: An LLM benchmark for the German Public Sector – Thilo Michael, Senior Data Scientist at Bundesdruckerei
- Fireside Chat: AI Evals in Crisis? With Kenneth Enevoldsen – Researcher and First Author of the "Massive Multilingual Text Embedding Benchmark"
- Q&A & Discussion

Program end: 20:00

Location

GovTech Deutschland

Max-Urich-Straße 3, 13355 Berlin, Germany

2nd floor, Town Hall
