Evaluating AI in Practice: Bridging Statistical Rigor, Sociotechnical Insights, and Ethical Boundaries
Hosted by Eval Eval, UK AISI, and UC San Diego (UCSD)
We’re excited to announce the Evaluating AI in Practice workshop, taking place on December 8, 2025, in San Diego.
This full-day event will explore how to evaluate AI systems responsibly and effectively by bridging three essential dimensions:
Statistical Methods – Techniques to quantify uncertainty, aggregate evaluation data, and estimate latent model capabilities while ensuring reliability and validity.
Sociotechnical Perspectives – Understanding task selection, societal impacts, and the implications of evaluation results for downstream applications.
Evaluating Evaluation Results – Translating evaluation outcomes into meaningful insights about model capabilities, risks, and potential downstream impacts.
The workshop will feature a keynote by Stella Biderman, followed by interactive sessions designed to connect technical methods with broader ethical and social considerations in AI evaluation.
Further details, including the full program, exact location, and additional speakers, will be announced soon!
Space is limited; please register at your earliest convenience!
