

AI Safety Thursday: Beyond Adversarial Robustness - Rethinking Sociopolitical Safety in AI Systems
Adversarial robustness remains a key concern in AI safety, with many interventions focused on limiting models' ability to assist with harmful or criminal tasks. But how do LLMs behave in sociopolitical contexts, especially when faced with ambiguity?
Punya Syon Pandey will discuss research on accidental vulnerabilities induced by fine-tuning, and introduce new methods to measure sociopolitical robustness, highlighting broader implications for safe societal integration.
Event Schedule
6:00 pm to 6:30 pm - Food and introductions
6:30 pm to 7:30 pm - Presentation and Q&A
7:30 pm to 9:00 pm - Open discussions
If you can't make it in person, feel free to join the live stream starting at 6:30 pm via this link.
This is part of our weekly AI Safety Thursdays series. Join us in examining questions like:
How do we ensure AI systems are aligned with human interests?
How do we measure and mitigate potential risks from advanced AI systems?
What does safer AI development look like?