

Model-Based Soft Maximization of Suitable Metrics of Long-Term Human Power
Jobst Heitzig – Senior Mathematician & AI Safety Designer
Power is a key concept in AI safety: power-seeking as an instrumental goal, sudden or gradual disempowerment of humans, and the power balance in human-AI interaction and international AI governance. At the same time, power, understood as the ability to pursue diverse goals, is essential for wellbeing.
This talk explores the idea of promoting both safety and wellbeing by explicitly forcing AI agents to empower humans and to manage the power balance between humans and AI agents in a desirable way. Using a principled, partially axiomatic approach, we design a parametrizable and decomposable objective function that represents an inequality- and risk-averse long-term aggregate of human power. It takes into account humans' bounded rationality and social norms and, crucially, considers a wide variety of possible human goals. By design, an agent that fully maximized this metric would be "guaranteed" (relative to the world model used) not to disempower humanity (in the sense of the definition of "power" used). Still, we propose to only softly maximize the metric, to account for model error and for aspects of power not captured by it.
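To make the shape of such an objective concrete, here is a minimal Python sketch, an illustrative assumption rather than the paper's exact construction: per-human power is modelled as a generalized (power) mean of goal-attainment values with exponent below one, which penalizes unevenness across goals, and the population aggregate applies a second such mean across humans, which makes it inequality-averse. The names `human_power` and `aggregate_power` and the parameters `risk_aversion` and `inequality_aversion` are hypothetical.

```python
import numpy as np

def human_power(goal_values: np.ndarray, risk_aversion: float) -> float:
    """Aggregate one human's ability to achieve a diverse menu of goals.

    goal_values[g] is an estimate, taken from a world model, of how well
    the human could achieve goal g if they chose to pursue it (e.g. an
    attainment probability in [0, 1]).  A generalized mean with exponent
    below 1 makes goals the human can barely achieve pull the aggregate
    down more strongly than easy goals push it up.
    Assumes 0 < risk_aversion < 1 so the exponent stays positive.
    """
    xi = 1.0 - risk_aversion
    return float(np.mean(goal_values ** xi) ** (1.0 / xi))

def aggregate_power(per_human_power: np.ndarray, inequality_aversion: float) -> float:
    """Inequality-averse aggregate over humans, using the same device."""
    eta = 1.0 - inequality_aversion
    return float(np.mean(per_human_power ** eta) ** (1.0 / eta))

# Toy example: two humans, three candidate goals each.
attainment = np.array([[0.9, 0.8, 0.7],
                       [0.4, 0.3, 0.2]])
per_human = np.array([human_power(row, risk_aversion=0.5) for row in attainment])
print(aggregate_power(per_human, inequality_aversion=0.5))
```

With exponents below one, raising the weaker human's attainment values increases the aggregate more than improving the stronger human's, which is the inequality-averse behaviour described above; a long-term, risk-averse version would additionally aggregate such snapshots over time.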
We derive algorithms that compute this metric from a given world model by backward induction, or approximate it via a form of multi-agent reinforcement learning. We exemplify the consequences of (softly) maximizing this metric in a variety of paradigmatic situations and describe the instrumental sub-goals it is likely to imply. Our cautious assessment is that softly maximizing suitable aggregate metrics of human power might constitute a beneficial objective for agentic AI systems, one that is safer than direct utility-based objectives.
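For the backward-induction route, the following sketch rests on assumptions of mine, not the paper's algorithm: a finite world model `P[s, a, s']`, a success indicator per candidate goal, and a softmax model of the human's bounded rationality. It propagates goal-attainment probabilities backwards from a finite horizon; one such vector per goal and per human supplies the inputs of an aggregate like the one sketched above.

```python
import numpy as np

def goal_attainment(P: np.ndarray, success: np.ndarray,
                    horizon: int, beta: float) -> np.ndarray:
    """Backward induction of attainment probabilities for one candidate goal.

    P[s, a, s2] -- transition probabilities of the given world model
    success[s]  -- 1.0 if state s counts as attaining the goal, else 0.0
    beta        -- inverse temperature of the human's softmax policy,
                   modelling bounded rationality (beta -> inf: optimal play)
    Returns V[s], the probability that a boundedly rational human starting
    in state s attains the goal within `horizon` steps (success is
    treated as absorbing).
    """
    V = success.astype(float)                          # values at the horizon
    for _ in range(horizon):
        Q = P @ V                                      # Q[s, a] = E[V(s') | s, a]
        logits = beta * (Q - Q.max(axis=1, keepdims=True))
        pi = np.exp(logits)
        pi /= pi.sum(axis=1, keepdims=True)            # softmax (human) policy
        V = np.maximum(success, (pi * Q).sum(axis=1))  # stay at 1 once attained
    return V

# Toy world model: 2 states, 2 actions; action 1 drifts toward state 1.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
success = np.array([0.0, 1.0])                         # goal: reach state 1
print(goal_attainment(P, success, horizon=5, beta=3.0))
```

The multi-agent reinforcement learning variant mentioned above would learn these quantities from sampled interaction instead of enumerating states, and one simple reading of "soft" maximization is for the AI agent to sample its own actions from a softmax over the aggregate metric rather than taking an argmax.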
Paper to read: https://arxiv.org/abs/2508.00159
Guaranteed Safe AI seminars
The monthly seminar series on Guaranteed Safe AI brings together researchers to advance the field of building AI with high-assurance quantitative safety guarantees.