

Test-time Recursive Thinking: Self-Improvement without External Feedback
Speaker: Yufan Zhuang, PhD student at UCSD
Host: Haolun Wu, PhD student at Mila & McGill
“Models can self-improve without training”
Key findings are counterintuitive:
🧐 The Test-time Recursive Thinking (TRT) framework enables LLMs to self-improve their reasoning through iterative knowledge accumulation, combining strategic rollout generation, self-verification-based solution selection, and contrastive failure analysis, all without external supervision.
😎 TRT yields substantial accuracy gains: open-source models reach 100% on AIME benchmarks, while closed-source models improve by 10.4 to 14.8 percentage points on LiveCodeBench’s hardest problems through self-generated test execution and adaptive exploration strategies.
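To make the loop described above concrete, here is a rough, hypothetical sketch in Python of what a TRT-style iteration might look like. The `generate` and `verify` callables, prompts, and round/rollout counts are placeholders of my own, not the paper's actual interface or implementation.

```python
def trt_solve(problem, generate, verify, max_rounds=3, num_rollouts=4):
    """Illustrative TRT-style loop (assumptions, not the paper's code).

    generate(prompt) -> candidate solution (str), sampled from the model.
    verify(problem, solution) -> (passed: bool, feedback: str), e.g. the model
        checking its own work or executing self-written tests.
    """
    notes = []  # self-generated knowledge accumulated across rounds
    for _ in range(max_rounds):
        context = "\n".join(notes)
        prompt = f"{context}\nProblem: {problem}\nSolve step by step."

        # 1) Rollout generation: sample several candidate solutions.
        rollouts = [generate(prompt) for _ in range(num_rollouts)]

        # 2) Self-verification-based selection: keep candidates the model
        #    itself judges correct; no external supervision is used.
        results = [(sol, *verify(problem, sol)) for sol in rollouts]
        passed = [sol for sol, ok, _ in results if ok]
        if passed:
            return passed[0], notes

        # 3) Contrastive failure analysis: distill the failed attempts into
        #    notes that steer the next round's generations.
        for sol, _, feedback in results:
            notes.append(f"Lesson from failed attempt: {feedback}")
    return None, notes
```

The key idea the sketch tries to capture is that all feedback is self-generated: verification and failure analysis come from the model itself, and the accumulated notes are what improves later rounds.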
paper: https://arxiv.org/pdf/2602.03094