

AI Research Circle: AI Alignment, Unpacked
When Anthropic built Claude, they didn't just write a list of rules. They wrote a constitution.
The detailed 84-page document covers everything from safety priorities to how Claude should handle moral disagreements. But you don't need to read it cover to cover: the structure is clear, the writing is plain, and even skimming the table of contents raises real questions about what we're encoding into AI systems.
This session, we'll use Claude's constitution as our anchor text and explore the broader landscape of AI alignment:
The constitution itself. What's actually in it? What assumptions does it encode? What's missing?
Rules vs. dispositions. Is the current framing too rule-based? What would it look like to free an agent of bad dispositions rather than constrain it into good behavior?
Constitutional AI as a technique. How does it work under the hood, and how does it compare to other alignment approaches like RLHF?
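If you want a flavor of the mechanics before the session, here's a minimal sketch of the supervised phase of Constitutional AI as described in Bai et al. (2022): the model drafts a response, critiques its own draft against a principle, then revises. The `generate` function, the prompts, and the principle below are illustrative placeholders, not Anthropic's actual implementation.

```python
# Minimal sketch of the critique-and-revise loop from Constitutional AI
# (Bai et al., 2022). `generate` stands in for any chat-model call; the
# prompts and principle are illustrative, not Anthropic's actual ones.

def generate(prompt: str) -> str:
    """Placeholder for a call to a language model."""
    raise NotImplementedError("plug in your model API here")

PRINCIPLE = "Choose the response that is most helpful, honest, and harmless."

def constitutional_revision(user_prompt: str, n_rounds: int = 1) -> str:
    draft = generate(user_prompt)
    for _ in range(n_rounds):
        # 1. Ask the model to critique its own draft against the principle.
        critique = generate(
            f"Principle: {PRINCIPLE}\n"
            f"Response: {draft}\n"
            "Identify ways the response violates the principle."
        )
        # 2. Ask it to rewrite the draft in light of that critique.
        draft = generate(
            f"Principle: {PRINCIPLE}\n"
            f"Critique: {critique}\n"
            f"Original response: {draft}\n"
            "Rewrite the response to better satisfy the principle."
        )
    # The revised (prompt, response) pairs become supervised fine-tuning
    # data; a later RL phase uses AI preference labels instead of human
    # ones (hence "RLAIF").
    return draft
```

The comparison with RLHF largely comes down to where the preference signal originates: RLHF trains a reward model from human comparisons, while Constitutional AI's RL phase substitutes AI-generated comparisons guided by the constitution's principles.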
Co-facilitated by Anup Gosavi and Emily Hough-Kovacs.
Pre-read: Anthropic's Claude Constitution: https://www.anthropic.com/constitution
Optional supplemental reading:
Bai et al., "Constitutional AI: Harmlessness from AI Feedback" (2022): https://arxiv.org/abs/2212.08073
Dung & Mai, "AI Alignment Strategies from a Risk Perspective" (2024)
Esther An, "Towards Principled AI Alignment: An Evaluation and Augmentation of Inverse Constitutional AI" (2025)
About the AI Research Circle
A community gathering at The Commons where we explore AI research together. No research background required. Just curiosity. Each session, we pick a paper or topic, break it down, and open it up for discussion. The goal: make cutting-edge ideas accessible and spark conversation across disciplines.