Trevor Lohrbeer
invites you to join

HackTalk: Jan Batzner - Towards Evaluating Sycophancy in AI Systems

Zoom
Past Event
About Event

AI Manipulation Hackathon - HackTalk

Building tools to mitigate AI Manipulation alongside 500+ builders globally.

The Talk

When do LLMs become too friendly? And what exactly are we measuring: "agreeableness bias", "alignment", or "manipulation"? Sycophancy has received a lot of attention but remains a major research challenge.

Jan Batzner works on the critical distinction between beneficial personalization and harmful sycophantic model behavior. Drawing on his work on measurement methodologies, Jan explains what more rigorous evaluation infrastructure requires. Before researchers can claim manipulation scores and mitigation approaches, the validity of current model evaluations needs to be improved.

The Speaker

Jan Batzner is a PhD Candidate in Computer Science at the Technical University of Munich and a researcher at the Weizenbaum Institute Berlin. A Junior Member of the Munich Center for Machine Learning and a Columbia University graduate, Jan co-chairs the EvalEval coalition. His work focuses on sycophancy and developing more rigorous model evaluation infrastructure.

Why this matters

If a model is designed to please at all costs, factuality and helpfulness suffer. This talk offers a quantitative tour of model sycophancy evaluations to date: Jan will show what we currently know about sycophancy and which gaps remain in our evaluation approaches.

Hosted by Apart Research
