
Scaling Latent Reasoning via Looped Language Model

Hosted by NICE AI Talk
About Event

Welcome to NICE talk 122! (This talk will be given in Chinese.)

Youtube livestream: https://youtube.com/live/xdpM8IS8Hqg

This time, we will explore Looped Language Models.

Current large language models rely primarily on explicit text generation (such as Chain-of-Thought, CoT) for reasoning, but this approach defers the cultivation of reasoning capabilities to the post-training phase and fails to fully leverage the massive pretraining data. We will introduce our latest work, Ouro, a family of Looped Language Models (LoopLM) that integrates reasoning directly into the pretraining phase. Its core innovations are: (1) iterative computation in latent space, (2) learned depth allocation through entropy regularization, and (3) large-scale training on 7.7 trillion tokens. Despite having only 1.4B and 2.6B parameters, the Ouro models match the performance of state-of-the-art 12B-parameter models across a wide range of benchmarks. Our synthetic-data experiments reveal that Ouro's advantage stems not from greater knowledge capacity but from stronger knowledge-manipulation capability. We will also discuss how LoopLM produces reasoning traces that are more aligned with its final outputs, and how this direction opens up new possibilities for model scaling in the reasoning era.
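
For readers curious how innovations (1) and (2) might look in code, below is a minimal, hypothetical PyTorch sketch of a looped language model: a single weight-tied transformer block applied repeatedly in latent space, plus a per-loop exit gate whose depth distribution can be entropy-regularized. All names (LoopedBlock, LoopedLM, exit_gate, depth_entropy) and hyperparameters are illustrative assumptions for this sketch, not Ouro's actual implementation.

```python
import torch
import torch.nn as nn

class LoopedBlock(nn.Module):
    """One shared transformer block, applied repeatedly in latent space."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Causal mask: True marks positions a token may NOT attend to.
        T = h.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=h.device), 1)
        x = self.ln1(h)
        a, _ = self.attn(x, x, x, attn_mask=mask, need_weights=False)
        h = h + a
        return h + self.mlp(self.ln2(h))

class LoopedLM(nn.Module):
    """Weight-tied recurrence: the same block is looped rather than stacking
    distinct layers, so depth comes from iteration, not parameter count."""

    def __init__(self, vocab_size=32000, d_model=256, n_heads=4, max_loops=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.block = LoopedBlock(d_model, n_heads)  # shared across all loops
        self.exit_gate = nn.Linear(d_model, 1)      # scores "exit after this loop"
        self.lm_head = nn.Linear(d_model, vocab_size)
        self.max_loops = max_loops

    def forward(self, tokens: torch.Tensor):
        h = self.embed(tokens)
        exit_scores = []
        for _ in range(self.max_loops):
            h = self.block(h)  # iterate in latent space
            exit_scores.append(self.exit_gate(h).squeeze(-1))
        # Per-token distribution over exit depths (softmax over the loop axis).
        p_exit = torch.softmax(torch.stack(exit_scores, dim=-1), dim=-1)
        return self.lm_head(h), p_exit

def depth_entropy(p_exit: torch.Tensor) -> torch.Tensor:
    """Entropy of the exit-depth distribution; used as a regularizer so the
    learned depth allocation does not collapse onto a single loop count."""
    return -(p_exit * torch.log(p_exit + 1e-9)).sum(-1).mean()
```

In this sketch, training would subtract a weighted depth_entropy term from the language-modeling loss (e.g., loss = cross_entropy - lam * depth_entropy(p_exit)) so the model keeps exploring different loop depths, while at inference tokens whose cumulative exit probability crosses a threshold could stop looping early.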

Speaker: Ruijie Zhu

Ruijie Zhu is a third-year PhD student at UCSC whose research centers on LLM efficiency and scalability, in particular model architectures that break through the limitations of the standard Transformer. His early work explored linear attention mechanisms and recurrent sequence modeling; his current focus is scaling latent reasoning, achieving high performance with lower computational overhead.

Host: Wenyue Hua

Wenyue Hua is a senior researcher at Microsoft Research, AI Frontiers. Her research interests lie in large language models and their applications, such as LLM-based agents, multi-agent systems, generative recommender systems, and LLM reasoning. She cares about the decision-making ability, safety, and efficiency of LLM-based agents.
