

AI Safety Fellowship
Curious about why aligning superhuman AI systems is one of the hardest open problems in computer science? Join us for a 4-week technical reading group exploring the core arguments and unsolved challenges of AI alignment.
Format: Every Thursday from March 19 to April 16, 18:30–20:00 (no session on April 9, Easter break). All reading is done on-site during the session; there is no homework. We read for 40 minutes, then dive into a structured technical discussion. Free dinner provided.
What we'll cover:
— Week 1: Why alignment is fundamentally different from debugging
— Week 2: Specification gaming and the limits of RLHF
— Week 3: Inner alignment, mesa-optimizers, and deceptive alignment
— Week 4: Scalable oversight and weak-to-strong generalization
Core text: The AI Safety Atlas (CeSIA)
Who this is for: EPFL/UNIL students (BSc/MSc). Most participants have a technical background, but everyone is welcome. No prior AI safety knowledge is needed, though we assume you're comfortable with ML basics (reward functions, optimization, training loops).
Commitment: This is a 4-session fellowship. Please sign up only if you can attend at least three of the four sessions.
Dates: March 19 · March 26 · April 2 · April 16