Trevor Lohrbeer
invites you to join

HackTalk: Jan Batzner - Towards Evaluating Sycophancy in AI Systems

Zoom
Past Event
About Event

AI Manipulation Hackathon - HackTalk

Building tools to mitigate AI Manipulation alongside 500+ builders globally.

The Talk

When do LLMs become too friendly? And what exactly are we measuring: "agreeableness bias", "alignment", or "manipulation"? Sycophancy has received a lot of attention but remains a major research challenge.

Jan Batzner works on the critical distinction between beneficial personalization and harmful sycophantic model behavior. Drawing on his work on measurement methodologies, Jan explains what more rigorous evaluation infrastructure requires. Before researchers can claim manipulation scores and mitigation approaches, the validity of current model evaluations needs to be improved.

The Speaker

Jan Batzner is a PhD Candidate in Computer Science at the Technical University of Munich and a researcher at the Weizenbaum Institute Berlin. A Junior Member of the Munich Center for Machine Learning and a Columbia University graduate, Jan co-chairs the EvalEval coalition. His work focuses on sycophancy and developing more rigorous model evaluation infrastructure.

Why this matters

If a model is designed to please at all costs, factuality and helpfulness suffer. This talk offers a quantitative tour of model sycophancy evaluations to date: Jan will show what we currently know about sycophancy and which gaps remain in our evaluation approaches.

Hosted by Apart Research
