Presented by
90/30 Club
We meet weekly in-person to talk about new ML papers! Come and join the discussion!

Week 45: Attention Residuals: Rethinking Information Flow in LLMs

About Event

Paper Link

This paper introduces Attention Residuals (AttnRes), a novel architectural mechanism by the Kimi Team designed to rethink how information flows in modern Large Language Models (LLMs). The central challenge addressed in the work is that standard residual connections with PreNorm accumulate all layer outputs using fixed unit weights. This uniform aggregation causes uncontrolled hidden-state growth with depth, progressively diluting each layer's unique contribution. To overcome this limitation, the authors replace this fixed accumulation with a softmax attention mechanism over preceding layer outputs, allowing each layer to selectively aggregate earlier representations using learned, input-dependent weights.
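For those skimming before the discussion, here is a minimal PyTorch sketch of the core idea as described above: instead of accumulating `x = x + layer_out` with fixed unit weights, each layer scores all preceding outputs against its own output and aggregates them with softmax weights. The class name `AttnResidual`, the query/key scoring, and the tensor shapes are our assumptions for illustration, not the Kimi Team's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttnResidual(nn.Module):
    """Sketch of an attention residual: replaces the fixed unit-weight
    residual sum with softmax attention over earlier layer outputs.
    Hypothetical design, not the paper's exact formulation."""

    def __init__(self, d_model: int):
        super().__init__()
        # Assumed: score each cached layer output against the current one.
        self.query = nn.Linear(d_model, d_model, bias=False)
        self.key = nn.Linear(d_model, d_model, bias=False)

    def forward(self, layer_out: torch.Tensor,
                history: list[torch.Tensor]) -> torch.Tensor:
        # history: outputs of all earlier layers (and the embedding),
        # each of shape (batch, seq, d_model).
        stack = torch.stack(history + [layer_out], dim=2)    # (B, T, L, D)
        q = self.query(layer_out).unsqueeze(2)               # (B, T, 1, D)
        k = self.key(stack)                                  # (B, T, L, D)
        # Scaled dot-product scores over the "layer" axis.
        scores = (q * k).sum(-1) / stack.shape[-1] ** 0.5    # (B, T, L)
        weights = F.softmax(scores, dim=-1)                  # learned, input-dependent
        # Weighted aggregation in place of the fixed x + layer_out sum.
        return (weights.unsqueeze(-1) * stack).sum(dim=2)    # (B, T, D)
```

Because the softmax weights sum to 1, the aggregated state is a convex combination of layer outputs rather than an ever-growing sum, which is how the mechanism counters the uncontrolled hidden-state growth described above.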

Join us at Mox!

🔎 Analyzed Papers: discussion at 20:00; optional quiet reading from 19:00 to 20:00.

Location
1680 Mission St
San Francisco, CA 94103, USA
4th Floor @ Mox SF (moxsf.com)