

(POSTPONED TO NOVEMBER 14) F2025 - Technical Paper Reading Group Week 4 - Mechanistic Interpretability
UBC AI Safety Technical Paper Reading Group
UBC AI Safety has launched a biweekly technical paper reading group focused on cutting-edge AI safety research.
Sessions will engage with recent papers across topics including mechanistic interpretability, AI control, scalable oversight, capability evaluation, and failure mode identification. The group emphasizes critical analysis and discussion.
Session 4: Mechanistic Interpretability
Location: IKB 265
Mechanistic interpretability research seeks to develop techniques for reverse-engineering the computational circuits learned by neural networks. In this session, we'll sample papers from the frontier of the field, covering recent successes and open challenges. We'll ask: Can this scale? Does it matter? What would "enough" interpretability look like? Dinner will be provided!
Prereading:
This primer covers everything you need to know for this session. We highly recommend reading it before attending!
Who Should Attend:
Meetings are open to anyone interested in technical AI safety research. No prior experience is required, but participants with a working knowledge of AI safety and machine learning concepts will get the most out of the discussions. If you're unsure whether you have sufficient background, check out this preparation document, which lists resources on the topics you should be familiar with to engage fully with the material.