G-Body Meeting 10/22
Hosted By
Cornell AI Alignment
A community of students and researchers conducting research and outreach to mitigate risks from advanced AI systems.
18 Went
Registration
Past Event
About Event

This week we will read and discuss Anthropic's paper on Agentic Misalignment: https://www.anthropic.com/research/agentic-misalignment.

In the paper's simulated scenarios, Claude and other models blackmailed a company executive to avoid being shut down, and in another setting left the executive trapped in a room with lethal oxygen levels and temperature. This empirical study provides a compelling introduction to AI safety challenges and to what misaligned AI systems could look like in practice.

This meeting will be led by Jonathan Chang ([email protected]), a second-year PhD student in Applied Mathematics researching AI alignment, and Jinzhou Wu ([email protected]), a sophomore researching AI alignment and an alum of CAIA's Intro to Alignment Fellowship.

Location
Bowers Hall, Room 324 (Jaeger Meeting Room, new CIS building)