Presented by
Trajectory Labs

AI Safety Thursday: Chain-of-Thought Monitoring for AI Control

About Event

Modern reasoning models do a lot of thinking in natural language before producing their outputs. Can we catch misbehavior in our LLMs and interpret their motivations simply by reading these chains of thought?

In this talk, Rauno Arike and Rohan Subramani will give an overview of research areas in chain-of-thought monitorability and AI control, and discuss their recent research on the usefulness of chain-of-thought monitoring for ensuring that LLM agents pursue only the objectives their developers intended.

Event Schedule
6:00 to 6:30 pm - Food & Networking
6:30 to 7:30 pm - Main Presentation & Questions
7:30 to 9:00 pm - Breakout Discussions

If you can't make it in person, feel free to join the live stream starting at 6:30 pm via this link.

Location
30 Adelaide St E 12th floor
Toronto, ON M5C 3G8, Canada
Enter the main lobby of the building and let the security staff know you are here for the AI event. You may need to show your RSVP on your phone. You will be directed to the 12th floor where the meetup is held. If you have trouble getting in, give Georgia a call at 519-981-0360.