

Foresight's AI Salon: Georg Lange | Tracing LLMs' thoughts to detect deception, intent, and misalignment
Join Foresight Institute and Georg Lange for his session, "Tracing LLMs' thoughts to detect deception, intent, and misalignment."
Schedule
6 - 7 pm: Drinks & Social
7 - 8 pm: Talk, Q&A, & Breakouts
8 - 9 pm: Mingle & Hang
Speaker Bio: "I’m an independent researcher working on Mechanistic Interpretability for LLMs. I work on circuit tracing, dictionary learning, and automated interpretability to create technology broadly useful for detecting deception, intent, and misalignment.
I was a MATS scholar and worked with Alex Makelov and Neel Nanda on Sparse Autoencoders and Distributed Alignment Search for feature detection and subspace activation patching. Previously, I was an MSc AI student at the University of Amsterdam, where I worked on brain-like interpretable spatiotemporal Computer Vision models, supervised by Prof Iris Groen and Amber Brands.
Further, I was a Fulbright student at the Graduate Center, CUNY, where I worked on Reinforcement Learning, Decision Making, and Reward Sensitization and conducted fiber photometry experiments in the Nucleus Accumbens of mice, supervised by Prof Jeff Beeler."
Salon discussions are held under the Chatham House Rule (don’t connect people to ideas when discussing them outside of this workshop).
Foresight’s AI Nodes offer grant funding, local compute, and community hubs for AI for Science and Safety projects in Berlin and the Bay Area.