Presented by
BlueDot Impact
We’re building the workforce needed to safely navigate AGI.
Contact: [email protected]

AI Safety Evals - Paper Reading Club

Zoom
Past Event
About Event

This week we have our second author presentation in a row! Hongye Cao will present his research on "SafeDialBench: A Fine-Grained Safety Evaluation Benchmark for Large Language Models in Multi-Turn Dialogues with Diverse Jailbreak Attacks". The benchmark goes a step beyond most safety benchmarks, which focus on single jailbreak attacks in single-turn dialogues.

Every week, someone presents for up to 20 minutes, followed by 40 minutes of discussion. RSVP to join, sign up to present, or contact us at [email protected] with questions. Everyone is welcome!
