We’re building the workforce needed to safely navigate AGI.

Contact: team@bluedot.org

BlueDot Impact

This week we have our second author presentation in a row! Hongye Cao will present his research on SafeDialBench: A Fine-Grained Safety Evaluation Benchmark for Large Language Models in Multi-Turn Dialogues with Diverse Jailbreak Attacks, a step beyond most safety benchmarks, which focus on single jailbreak attacks in single-turn dialogs.

Every week, someone presents for up to 20 minutes, followed by 40 minutes of discussion. RSVP to join, or contact us at evalsreadinggroup@gmail.com with questions. Everyone is welcome!

AI Safety Evals - Paper Reading Club