AI Evaluations - A Technical Primer Workshop
This workshop is part of the "How to Think about Tech? The Case of 'AI Safety'" study group, initiated by fellowship candidates of the 2025/2026 Introduction to Political Technology course. It is open to faculty and fellowship candidates only; external guests by request.
AI companies release new models constantly, often claiming each is "safer" than the last. But how do they actually know? What tests do they run? What gets measured and evaluated?
This workshop provides a technical primer on AI evaluations (evals) - the practice of testing AI systems for capabilities, safety, and fairness. We'll explore:
What evaluations are and why they matter for AI governance
Different types of evaluations: capability, alignment, safety, and societal impact assessments
How evals work technically: constructing datasets, defining metrics, and designing evaluation protocols (see the short sketch after this list)
Examples that demonstrate the mechanics in practice
(Optional) A deep dive into a real evaluation framework to see how it feeds into deployment decisions
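To make the mechanics concrete ahead of the session, here is a minimal sketch of the three ingredients named above: a hand-built dataset, an exact-match metric, and a simple evaluation protocol. The tiny dataset and the `toy_model` stand-in are invented for illustration; real evals use much larger datasets, model APIs, and more careful metrics.

```python
# A minimal, illustrative eval loop: a tiny dataset, an exact-match metric,
# and a protocol that scores any model exposed as a prompt -> answer callable.
# The dataset and `toy_model` below are invented for illustration only.

from typing import Callable, List, Tuple

# "Dataset": pairs of (prompt, reference answer).
DATASET: List[Tuple[str, str]] = [
    ("What is 2 + 2?", "4"),
    ("Name the capital of France.", "Paris"),
    ("Is it safe to mix bleach and ammonia? (yes/no)", "no"),
]

def exact_match(prediction: str, reference: str) -> bool:
    """Metric: case-insensitive exact match after trimming whitespace."""
    return prediction.strip().lower() == reference.strip().lower()

def run_eval(model: Callable[[str], str]) -> float:
    """Protocol: query the model on every prompt and report mean accuracy."""
    scores = [exact_match(model(prompt), reference) for prompt, reference in DATASET]
    return sum(scores) / len(scores)

if __name__ == "__main__":
    # Stand-in "model" so the script runs end to end without any API access.
    def toy_model(prompt: str) -> str:
        canned = {"What is 2 + 2?": "4", "Name the capital of France.": "Paris"}
        return canned.get(prompt, "I don't know")

    print(f"Accuracy: {run_eval(toy_model):.2f}")  # 0.67 on this toy dataset
```

Even in this toy form, the core design questions already appear: which prompts belong in the dataset, whether exact match is the right metric for open-ended answers, and what accuracy threshold would justify a deployment decision.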
We'll close with a hands-on activity: designing evaluations for a specific scenario to surface the tradeoffs between what we can measure and what actually matters.
This workshop pairs with the AI Safety study group discussion of AI Evals on Feb 22 (https://luma.com/357o5glx), which examines the governance and political implications of evaluation frameworks.