

AI Safety Technical Course - Week 2 - Linear Probes
This lecture will cover probing and representations in Transformers.
In this lecture, you will:
Understand what linear probes are
See why model outputs are not enough
Explore “truth as geometry”
Connect probes to real safety problems
Understand the limits of linear probes
To receive the certificate, you must complete the notebooks and attend this lecture in person.
However, we will stream the lecture online for those who cannot attend in person.
Google Meet Link:
meet.google.com/rux-ehdx-ewb
Drift 23 is accessible through the library.
We'll be serving pizzas and snacks during the lecture.
After the lecture, you're invited to join us for drinks on the house.
The course material is inspired by ARENA, a leading UK program in AI safety.