

Rubrics as Reward: RL Beyond Verifiable Domains | Scale AI x YAMLRG x EF
Rubrics as Reward: Reinforcement Learning Beyond Verifiable Domains
How do we train AI systems when there is no clear ground truth? Reinforcement learning from human feedback (RLHF) has pushed the field forward, but it struggles in messy domains where “correctness” is ambiguous.
In this session, Aidan Davies (Solution Engineer at Scale AI) and Matthew Spaul (Strategist at Scale AI) will explore how rubrics as reward can move reinforcement learning beyond verifiable domains. Together, they’ll share technical insights from building applied systems and strategic lessons from working with governments and enterprises adopting AI in real-world environments.
This event is part of Yet Another Machine Learning Reading Group (YAMLRG.com), a community of ML engineers, researchers, and hobbyists who meet regularly to discuss and debate new ideas in AI, and Entrepreneurs First, the world’s leading talent investor, which backs exceptional individuals to build globally important technology companies from scratch.