Presented by
Trajectory Labs

Could Frontier Labs’ Internal Agents Already Go Rogue?

About Event

Could an AI company’s internal coding agents create a “rogue deployment”, a set of agents running without human knowledge or permission? In February and March 2026, METR, the organization behind the time horizon graph, conducted a pilot of a process to assess just that. Anthropic, Google DeepMind, Meta, and OpenAI gave us access to their most capable internal LLMs and a wide range of non-public information. We concluded that, while internal agents plausibly had the means, motive, and opportunity to start small rogue deployments, they didn’t have the means to avoid human detection indefinitely.

METR researcher Thomas Broadley explains the process, the six key facts that informed our conclusion, and how we expect risk to evolve over the next few months.

You can watch a livestream of the talk here.

Location
30 Adelaide St E
Toronto, ON M5C 3G8, Canada
Enter the main lobby of the building and let the security staff know you are here for the AI event. You may need to show your RSVP on your phone. You will be directed to the 12th floor where the meetup is held. If you have trouble getting in, give Georgia a call at 519-981-0360.