Presented by
BlueDot Impact
We’re building the workforce needed to safely navigate AGI.
Contact: [email protected]

AI x Cyber Reading Group

Zoom
Past Event
About Event

This week we are discussing a very recent paper from UK AISI on measuring how far AI agents can go in realistic, multi-step cyber attack scenarios. Instead of toy tasks, the paper drops models into simulated enterprise and ICS environments and evaluates how well they can execute long attack chains end-to-end. The results show clear progress with scaling and newer models, but also highlight major limitations: performance is still partial and highly dependent on token budgets, and the test environments had no active defenses or defenders. In other words, it’s a great step toward realism, but still far from representing real-world operations. As you read, it’s worth thinking about what the biggest gaps are before capabilities become operationally meaningful. Are we "there" yet? If not, what would it take, and how would we know?

Link to paper: https://www.aisi.gov.uk/research/measuring-ai-agents-progress-on-multi-step-cyber-attack-scenarios
