NICE AI Talk

NICE Talk 172 invites Bingxiang He (@HBX_hbx), a PhD student at Tsinghua University (@TsinghuaNLP), to present "Three Frontiers of Scalable RL for LLMs."

🤠 Can RL advance model capabilities without any supervised signals?

🧐 Three Frontiers, One Map: Charting the Feasible Region of Scalable RL Matters More Than Inventing Another Trick

both led to significant performance degradation.

can paradoxically shrink, or even reverse, student gains.

https://github.com/PRIME-RL/TTRL/tree/urlvr-dev