We’re building the workforce needed to safely navigate AGI.

Contact: team@bluedot.org

BlueDot Impact

Training large language models on narrow tasks can lead to broad misalignment, one of the foundational studies of emergent misalignment, a counterintuitive and fascinating finding in alignment research.

Every week, someone will present for up to 20 minutes, followed by 40 minutes of discussion. RSVP to join, or contact us at evalsreadinggroup@gmail.com with questions. Everyone is welcome!

AI Safety Evals - Paper Reading Club