Safe Learning Under Irreversible Dynamics via Asking for Help – Benjamin Plaut

Guaranteed Safe AI Seminars

Zoom

Past Event

About Event

Safe Learning Under Irreversible Dynamics via Asking for Help
Benjamin Plaut – Postdoc at CHAI studying guaranteed-safe AI

Most online learning algorithms with formal guarantees essentially rely on trying all possible behaviors, which is problematic when some errors cannot be recovered from. Instead, we allow the learning agent to ask for help from a mentor and to transfer knowledge between similar states. We show that this combination enables the agent to learn both safely and effectively. Under standard online learning assumptions, we provide an algorithm whose regret and number of mentor queries are both sublinear in the time horizon for any Markov Decision Process (MDP), including MDPs with irreversible dynamics. Conceptually, our result may be the first formal proof that it is possible for an agent to obtain high reward while becoming self-sufficient in an unknown, unbounded, and high-stakes environment without resets.
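To make the idea of asking for help and transferring knowledge between similar states concrete, here is a minimal toy sketch. It is not the algorithm from the paper and carries none of its guarantees. The 1-D state space, the `mentor_policy` function, the `radius` similarity threshold, and the class and function names are all assumptions made for illustration. The agent queries the mentor only when no previously queried state lies within `radius` of the current one; otherwise it copies the stored mentor action.

```python
import random


def mentor_policy(state: float) -> int:
    # Hypothetical mentor (an assumption for this sketch): the action it
    # recommends is 1 for states at or above 0.5, and 0 otherwise.
    return 1 if state >= 0.5 else 0


class AskForHelpAgent:
    """Toy agent: ask the mentor in unfamiliar states, else reuse its advice."""

    def __init__(self, radius: float):
        self.radius = radius      # similarity threshold (assumed, not from the paper)
        self.memory = []          # list of (queried_state, mentor_action) pairs
        self.queries = 0          # number of mentor queries made so far

    def act(self, state: float) -> int:
        if self.memory:
            # Find the previously queried state closest to the current one.
            near_state, near_action = min(
                self.memory, key=lambda pair: abs(pair[0] - state)
            )
            if abs(near_state - state) <= self.radius:
                # Transfer knowledge from the similar state.
                return near_action
        # No similar state seen yet: ask the mentor and remember the answer.
        self.queries += 1
        action = mentor_policy(state)
        self.memory.append((state, action))
        return action


def run(horizon: int, radius: float = 0.05, seed: int = 0) -> int:
    """Run the agent on uniformly random states in [0, 1); return query count."""
    rng = random.Random(seed)
    agent = AskForHelpAgent(radius)
    for _ in range(horizon):
        agent.act(rng.random())
    return agent.queries


if __name__ == "__main__":
    for horizon in (100, 1_000, 10_000):
        print(horizon, run(horizon))
```

In this toy setting the queried states are pairwise more than `radius` apart, so the number of queries is bounded by a constant that depends on `radius` and not on the horizon, which is one simple way for the query count to be sublinear in time. Note that copied actions can disagree with the mentor near the 0.5 boundary; controlling that kind of error is exactly where the paper's assumptions and analysis are needed.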

Paper to read: https://arxiv.org/abs/2502.14043

Guaranteed Safe AI seminars

The monthly seminar series on Guaranteed Safe AI brings together researchers to advance the field of building AI with high-assurance quantitative safety guarantees.

Presented by

Guaranteed Safe AI Seminars

Monthly seminars on Guaranteed Safe AI R&D. https://www.horizonevents.info/guaranteedsafeaisem…
