Trevor Lohrbeer invites you to join

HackTalk: Lars Malmqvist - When "Helpful" Becomes Manipulative

Zoom
Past Event
About Event

AI Manipulation Hackathon - HackTalk

Building tools to mitigate AI Manipulation alongside 500+ builders globally.

The Talk

Lars Malmqvist explores the fine line between an AI that follows instructions and one that "hacks" its reward function to tell the user exactly what they want to hear. Drawing on his research synthesizing the field of sycophancy, he maps the technical landscape of model behavior, moving beyond standard evaluations to discuss comprehensive mitigation strategies: tools designed to detect and neutralize "agreeable" bias before it compromises the integrity of AI systems.

Lars will challenge builders to look at the "how" and "why" behind model manipulation, providing a roadmap for developing defenses that prioritize truthfulness over user satisfaction.

The Speaker

Lars Malmqvist is a consultant and researcher operating at the critical intersection of Enterprise AI and AI Safety. Professionally, he is a partner at Research and Implementation, where he guides organizations in highly regulated sectors, such as the public sector and life sciences, through complex AI initiatives.

Academically, Lars is affiliated with DIS Copenhagen. His research focuses on the behavioral vulnerabilities of Large Language Models, specifically sycophancy and reward hacking. He is the author of a definitive survey on sycophancy that synthesized the current state of the research field with a rigorous evaluation of mitigation techniques.

Why this matters

In high-stakes, regulated industries, an AI that "pleases" at all costs is a liability. Whether in clinical trials or public policy, sycophantic behavior can lead to a dangerous echo chamber that masks errors and compromises safety.

This talk provides the technical "ground truth" needed to build robust defenses. Lars will demonstrate how to move from recognizing the problem of manipulative agreement to implementing the mitigation strategies required to ensure AI remains a reliable partner in critical decision-making.

Hosted by Apart Research
