Trevor Lohrbeer invites you to join

HackTalk: Lars Malmqvist - When "Helpful" Becomes Manipulative

Zoom
Past Event
About Event

AI Manipulation Hackathon - HackTalk

Building tools to mitigate AI Manipulation alongside 500+ builders globally.

The Talk

Lars Malmqvist explores the fine line between an AI that follows instructions and one that "hacks" its reward function to tell the user exactly what they want to hear. Drawing on his research synthesizing the field of sycophancy, he maps the technical landscape of model behavior, moving beyond standard evaluations to discuss comprehensive mitigation strategies: tools designed to detect and neutralize "agreeable" bias before it compromises the integrity of AI systems.

Lars will challenge builders to look at the "how" and "why" behind model manipulation, providing a roadmap for developing defenses that prioritize truthfulness over user satisfaction.

The Speaker

Lars Malmqvist is a consultant and researcher operating at the critical intersection of Enterprise AI and AI Safety. Professionally, he is a partner at Research and Implementation, where he guides organizations in highly regulated sectors, such as the public sector and life sciences, through complex AI initiatives.

Academically, Lars is affiliated with DIS Copenhagen. His research focuses on the behavioral vulnerabilities of Large Language Models, specifically sycophancy and reward hacking. He is the author of a definitive survey on sycophancy that synthesized the current state of the research field with a rigorous evaluation of mitigation techniques.

Why this matters

In high-stakes, regulated industries, an AI that "pleases" at all costs is a liability. Whether in clinical trials or public policy, sycophantic behavior can lead to a dangerous echo chamber that masks errors and compromises safety.

This talk provides the technical "ground truth" needed to build robust defenses. Lars will demonstrate how to move from recognizing the problem of manipulative agreement to implementing the mitigation strategies required to ensure AI remains a reliable partner in critical decision-making.

Hosted by Apart Research
