Private Event

Unsupervised Learning 2.0 @ Constellation

Hosted by Isaac Dunn, Sasha Berezhnoi & Henry Sleight
Berkeley, California
Past Event
About Event

We're hosting Unsupervised Learning again — a day to explore important recent developments in frontier AI safety. (You will be doing the unsupervised learning, not the models!)

For confirmed speakers so far, see the schedule below.

Everything is optional, so drop in and out of sessions as you find useful. We’ll also have plenty of time for 1:1 meetings between people working on similar projects, and space to catch up on work and relax.

This is an invite-only event for researchers from frontier AI companies and some independent research orgs. Know someone who'd like to come? They can request an invite on Luma.

Register here soon to secure your spot. Late RSVPs are welcome, though we may not be able to accommodate everyone – reach out to [email protected] if you have questions.

Schedule (All Optional)

9:00 — Doors open for breakfast, conversations, and co-working

11:30 — Lightning talks on the future of misalignment

Including lightning talks from:

  • Buck Shlegeris, Redwood Research

  • Alex Turner, Google DeepMind

  • Holden Karnofsky, Anthropic

  • Hjalmar Wijk, METR

12:30 — Lunch

1:30 — Optional research discussions

Opt-in short talks and discussion sessions throughout the afternoon — with space for conversations, 1:1s, co-working, and relaxing.

  • Stress testing deliberative alignment for anti-scheming training

    • Jenny Nitishinskaya, OpenAI

  • Q&A on alignment evaluation and misalignment safety cases

    • Sam Bowman, Anthropic

  • Q&A with John Schulman (Thinking Machines)

  • AI resilience

    • Wojciech Zaremba, OpenAI

  • What should the spec be for very capable models?

    • Jason Wolfe, OpenAI

    • Joe Carlsmith, Anthropic

  • Natural misalignment from reward hacking in production RL

    • Evan Hubinger, Anthropic

  • Extrapolating METR's agent time horizon graph

    • Nikola Jurkovic, METR

  • Improving alignment by shaping generalization

    • Sam Marks, Anthropic

  • Preventing reward-compatible misalignment: randomisation games for reinforcement learning

    • Jacob Pfau, UK AISI

  • A mainline plan for mitigating misalignment risk

    • Ryan Greenblatt, Redwood Research

  • Believe it or not: how deeply do LLMs believe implanted facts?

    • Stewart Slocum, xAI (work done at Anthropic)

6:00 — Dinner

You’re welcome to stay as long as you like for dinner and more conversations.

Where?

Sessions will be hosted in Constellation’s workspace in Downtown Berkeley. Meals, snacks, drinks, and workspace amenities (workstations, call booths, and meeting rooms) are available to all attendees.

The space will be open from 9:00 until late for conversations and co-working between sessions. Accommodations for specialized workspace needs are available on request (contact Isaac, [email protected]).

Who Should Attend?

We welcome people who work on frontier AI safety, or who work at a frontier AI company and are interested in safety. We value thoughtful critics and varied perspectives as an important part of a robust truth-seeking culture – asking challenging questions helps shape productive discussions.

“Very cool people, good vibes, great conversations”

— Adrien Ecoffet, OpenAI

What is Constellation? 

We are an independent, nonprofit research center hosting over 200 individual researchers and organizations working on frontier problems in AI safety. You'll connect with many of our affiliates throughout the event, both as speakers and fellow attendees.

Interested in working from Constellation for a day? Just let Isaac know the date you’ll be coming!

Contact Information

For questions, requests, or more information, email Isaac: [email protected]
