Presented by
AISAR Seminars
12 Went

AISAR Seminar - Joar Skalse - The Theoretical Foundations of Reward Learning

Zoom
Past Event
About Event

🎤 Speaker: Joar Skalse – PhD @ University of Oxford | Director @ DEDUCTO

📖 Title: The Theoretical Foundations of Reward Learning

Abstract: In this talk, I will give an overview of my research on building a theoretical foundation for the field of reward learning, including my main motivations for pursuing this research and some of my core results.

This research agenda involves answering questions such as: What is the right method for expressing goals and instructions to AI systems? How similar must two different goal specifications be in order to not be hackable? What is the right way to quantify the differences and similarities between different goal specifications in a given specification language? What happens if you execute a task specification that is not close to the “ideal” specification? Which specification learning algorithms are guaranteed to converge to a good specification? How sensitive are these specification learning algorithms to misspecification? If we have a bound on the error in a specification (under some metric), can we devise safe methods for optimising it?

Find more details at: https://www.lesswrong.com/s/TEybbkyHpMEB2HTv3
