Name: RL Verifiers by Yash More (Cerebras)
Start: 2025-09-17T18:00:00.000-04:00
End: 2025-09-17T21:00:00.000-04:00
Location: Studio 535

Subscribe to event notifications at 535toronto.substack.com

Studio 535

Yash More (Cerebras) will give an overview of RLVR and verifiers.

Reinforcement Learning with Verifiable Rewards (RLVR) replaces noisy human feedback with deterministic signals that make verification robust. In this talk, we will examine what makes a reward verifiable, how credit can be assigned effectively through process and outcome supervision, and the algorithms that enable RLVR to scale and shape the reasoning capabilities of LLMs.

Use the back entrance by the parking lot; our door has a 535 sticker on it.