

Technical AI Safety Demo Day
Join us for a showcase of the top projects from BlueDot's Technical AI Safety Project Sprint.
We'll kick off with lightning talks (2–3 minutes each), in which each presenter gives a high-level overview of their project and findings. Then we'll break into smaller groups, so you can choose which project to explore further. In each group, the presenter will give a deeper dive into their work and lead a discussion and Q&A.
Projects
Mechanistic Analysis of Chain-of-Thought Faithfulness — Victor Ashioya
Persona Explorer: Using SAEs to Explore Personas and Drift in LLMs — Asude Demir
Sustained Gradient Alignment Mediates Subliminal Learning in a Multi-Step Setting — Chayanon Kitkana
Do models like DeepSeek-R1 Know When They're Being Evaluated? — Kaushik Srivatsan
Interpreting Latent Reasoning in the Depth-Recurrent Transformer — Tristan von Busch