

HackTalk: Jan Batzner - Towards Evaluating Sycophancy in AI Systems
AI Manipulation Hackathon - HackTalk
Building tools to mitigate AI Manipulation alongside 500+ builders globally.
The Talk
When do LLMs become too friendly? What are we measuring: "agreeableness bias", "alignment", or "manipulation"? Sycophancy has received a lot of attention but remains a major research challenge.
Jan Batzner works on the critical distinction between beneficial personalization and harmful sycophantic model behavior. Drawing from his work on measurement methodologies, Jan explains what more rigorous evaluation infrastructure requires. Before researchers can credibly report manipulation scores or propose mitigation approaches, the validity of current model evaluations must improve.
The Speaker
Jan Batzner is a PhD Candidate in Computer Science at the Technical University of Munich and a researcher at the Weizenbaum Institute Berlin. A Junior Member of the Munich Center for Machine Learning and a Columbia University graduate, Jan co-chairs the EvalEval coalition. His work focuses on sycophancy and developing more rigorous model evaluation infrastructure.
Why this matters
If a model is designed to please at all costs, factuality and helpfulness suffer. This talk provides a quantitative tour of model sycophancy evaluations to date. Jan will show what we know about sycophancy today and which gaps remain in our evaluation approaches.
Hosted by Apart Research