Presented by
The Commons
The Commons Member Calendar • If you would like to join The Commons, apply to be a member at https://www.thesfcommons.com/
16 Going

AI Research Circle [members and +1s]

About Event

About the AI Research Circle

The AI Research Circle is a community gathering at The Commons where we explore and discuss AI research papers together. You don’t need to be a researcher—just bring curiosity and an interest in the field.

Each session, we choose a paper, break it down into plain language, and dive into open conversation. The goal is to make cutting-edge ideas accessible, spark thoughtful debate, and connect across disciplines.

Session Details

Paper: Persona Vectors: Monitoring and Controlling Character Traits in Language Models (Chen et al., 2025) 

Large language models often present as a single “assistant” persona—but in practice, their personality can shift in surprising (and sometimes undesirable) ways due to prompting, fine-tuning, or training data. This paper introduces persona vectors: linear directions in a model’s activation space that correspond to specific character traits like evil, sycophancy, and hallucination propensity.

We’ll use this paper as a jumping-off point to talk about what “personality” even means for LLMs, how linear directions emerge in activation space, and what this implies for alignment, safety, and tooling for model developers.

Reading

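Primary (please read if you can): Persona Vectors: Monitoring and Controlling Character Traits in Language Models (Chen et al., 2025)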

Who should join

Anyone interested in:

  • How “personality” and “traits” show up in LLM behavior

  • Mechanistic-ish tools for interpreting and steering models

  • Practical alignment questions around fine-tuning, safety, and data pipelines

No formal background in interpretability or alignment required—we’ll aim to keep things intuitive and conversational while still engaging for people who read a lot of papers.

Location
550 Laguna St, San Francisco + Full Studio