

Diffusion Model Meetup & Paper Reading — Attention Is All You Need & the Transformer Model Architecture
TL;DR
In this session, we’ll walk through one of the most important papers in modern AI — Attention Is All You Need — the 2017 paper behind the transformer architecture that powers models like ChatGPT.
We’ll break down the paper step by step — understanding what “attention” means, why it changed the field, and how transformers are built. No code. Minimal math. Just a clear, intuitive walkthrough of the ideas and architecture that reshaped machine learning.
This session is part of our ongoing Diffusion Model Paper Reading Group, a friendly, online community across NY, SF, Toronto, and Boston — open to anyone curious about AI.
👌 Learning Requirements
You’ll be fine as long as you’re:
Curious about how transformer models actually work
Comfortable skimming a paper and engaging in discussion
Open to learning visually and conceptually (no coding or deep math required)
🗓 Schedule
First 60 min:
We’ll walk through the “Attention Is All You Need” paper — focusing on:
The motivation and innovation behind the transformer
What “attention” really means in plain language
Key building blocks: self-attention, multi-head attention, positional encoding, and residual connections
How these concepts form the backbone of today’s GenAI models
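No code is required for the session itself, but for readers who want a concrete anchor beforehand: the "attention" in the paper boils down to a single formula, softmax(QKᵀ/√d_k)·V. Below is a minimal, dependency-free Python sketch of scaled dot-product attention with made-up toy numbers (the variable values are illustrative, not from the paper):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    Q, K, V are lists of vectors (lists of floats)."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Score each key against this query, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        # Turn scores into attention weights that sum to 1
        weights = softmax(scores)
        # Output is a weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Toy example: 3 tokens, vectors of dimension 2 (hypothetical values)
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(attention(Q, K, V))
```

Each output row is a convex combination of the value vectors, weighted by how well the query matches each key — that is the whole trick. Multi-head attention simply runs several of these in parallel with different learned projections.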
Final 30 min:
Open discussion and Q&A — a space to clarify what still feels fuzzy and prepare together for next week’s session on Diffusion Transformers.
If you’re planning to attend next week’s Diffusion Transformers session, register here:
https://luma.com/lr2qvveq
📚 Pre-Class Learning
📄 Paper: Attention Is All You Need
https://papiers.ai/1706.03762
Pick one video based on your level of curiosity:
Easy: 3Blue1Brown – Attention in Transformers, Step-by-Step (26 min)
Medium: Yannic Kilcher – Attention Is All You Need (27 min)
Advanced: Andrej Karpathy – Stanford CS25: Introduction to Transformers (1 hr 11 min)
👥 Speakers
Led by master’s and PhD students in AI, IBM AI consultants, and CTOs of award-winning AI startups — all experienced in helping learners deeply understand transformer architecture.
The highlight of this session is clarity — by the end, you’ll understand how and why transformers work, once and for all.
🧠 About the Diffusion Model Reading Group & Bootcamp
A peer-led, 5‑month learning journey for engineers, students, researchers, and builders exploring diffusion model architectures and modern AI.
No ML background required — just curiosity
2–4 hours/week with paper readings, discussions, and final projects
A supportive community that includes people working in the industry