

Presented by
BlueDot Impact
AI Safety Evals - Paper Reading Club
About Event
Join us for the final author presentation in our situational awareness theme. Joshua Fonseca Rivera will present his paper, Steering Awareness: Detecting Activation Steering from Within, which examines LLMs' ability to detect injected steering vectors and what that ability implies for safety research.
Each week, a speaker presents for up to 20 minutes, followed by 40 minutes of discussion. RSVP to join, sign up to present, or contact us at [email protected] with questions. Everyone is welcome!