

AI Reading Club — Attention is not Explanation
Our next AI Reading Club session is confirmed, and this time we’ll read and discuss Attention is not Explanation (2019).
This paper takes us into one of the most important questions in AI interpretability: when a model highlights certain words through attention, does that actually tell us why it made a decision?
Attention is often presented as something intuitive and reassuring — a way to “see” what a model is focusing on. But this paper invites us to slow down and look more carefully. It challenges a widely shared assumption and opens the door to a deeper conversation about what meaningful explanation in machine learning really looks like.
Why this paper?
Because it helps us move from surface-level interpretability to a more rigorous understanding of model behavior. It is a great paper for anyone interested in transformers, explainability, trust, and the limits of what model visualizations can really tell us.
Together, we’ll explore:
why attention is often treated as explanation
why that assumption may be misleading
what the paper shows through its experiments (a toy sketch of the core idea follows after this list)
what “faithful explanation” really means
why these questions still matter for today’s AI systems
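To make the discussion concrete ahead of time, here is a minimal NumPy sketch of one way attention can fail as a faithful explanation: when two tokens carry near-identical hidden states, very different attention maps can produce the same pooled output, and hence the same prediction. This is an invented illustration, not the paper's actual experimental setup; the hidden states and scores below are made up for the example.

```python
import numpy as np

# Toy hidden states for 5 tokens (dimension 4). Tokens 0 and 2 are
# deliberately given identical representations; real encoders often
# produce highly correlated hidden states like this.
H = np.array([
    [0.9, -0.2, 0.4, 0.1],
    [0.1,  0.8, -0.5, 0.3],
    [0.9, -0.2, 0.4, 0.1],   # same as token 0
    [-0.6, 0.2, 0.7, -0.4],
    [0.3,  0.3, 0.1, 0.5],
])

def attend(scores, H):
    """Softmax-normalize raw scores into attention weights, then pool H."""
    w = np.exp(scores - scores.max())
    w = w / w.sum()
    return w, w @ H

# "Explanation" A: the model appears to focus on token 2.
w_a, ctx_a = attend(np.array([0.0, 0.0, 4.0, 0.0, 0.0]), H)

# Counterfactual B: the weight is shifted to token 0 instead.
w_b, ctx_b = attend(np.array([4.0, 0.0, 0.0, 0.0, 0.0]), H)

print("weights A:", w_a.round(3))
print("weights B:", w_b.round(3))
# The two attention maps disagree about which token "mattered",
# yet the pooled context vectors (and any downstream prediction
# computed from them) are identical.
print("max difference in context vectors:", np.abs(ctx_a - ctx_b).max())
```

In the paper itself, Jain and Wallace go further: on trained models they search for counterfactual attention distributions and find that very different ones often leave predictions essentially unchanged, which is exactly the kind of finding we will unpack together.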
Paper:
Attention is not Explanation (Jain & Wallace, NAACL 2019)
https://arxiv.org/abs/1902.10186
As always, the format will be simple and welcoming:
a short introduction to the paper
an open group discussion
space for questions, ideas, and different perspectives
Whether you work directly in AI, are curious about interpretability, or simply enjoy learning through discussion, you are very welcome to join.