

Could Frontier Labs’ Internal Agents Already Go Rogue?
Could an AI company’s internal coding agents create a “rogue deployment”: a set of agents running without human knowledge or permission? In February and March 2026, METR, the organization behind the time horizon graph, piloted a process for assessing exactly that. Anthropic, Google DeepMind, Meta, and OpenAI gave us access to their most capable internal LLMs and a wide range of non-public information. We concluded that, while internal agents plausibly had the means, motive, and opportunity to start small rogue deployments, they didn’t have the means to avoid human detection indefinitely.
METR researcher Thomas Broadley explains the process, the six key facts that informed our conclusion, and how we expect risk to evolve over the next few months.
You can watch a livestream of the talk here.