Self-Attention, Step by Step
Queries, keys, and values.
What Is Self-Attention?
In self-attention, every word looks at all the words in the same sentence, itself included, and builds a richer, context-aware version of itself as a weighted mix of them. 🔍
Three Roles per Word
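Each word is projected into three vectors: a query, a key, and a value. The query says what the word is looking for, the key says what the word offers, and the value is the content it passes along. A word's query is compared with every key to score relevance, and those scores decide how much of each value goes into the word's new representation.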
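To make the three roles concrete, here is a minimal NumPy sketch of one self-attention pass. It is an illustration under stated assumptions, not a reference implementation: the projection matrices `W_q`, `W_k`, and `W_v` are random stand-ins for learned weights, the sizes are arbitrary, and the softmax with division by the square root of the key size is the common scaled dot-product recipe, which this lesson has not spelled out.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum first so the exponentials stay numerically stable.
    shifted = x - x.max(axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # Project every word vector into its three roles.
    Q = X @ W_q  # queries: what each word is looking for
    K = X @ W_k  # keys: what each word offers
    V = X @ W_v  # values: the content each word passes along

    # Compare every query with every key (assumed scaled dot-product scoring).
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)

    # Turn each row of scores into weights that sum to 1.
    weights = softmax(scores, axis=-1)

    # Each word's new representation is a weighted mix of all values.
    return weights @ V, weights

# Hypothetical toy setup: 4 words, 8-dimensional embeddings, 3-dimensional roles.
rng = np.random.default_rng(0)
n_words, d_model, d_k = 4, 8, 3
X = rng.normal(size=(n_words, d_model))    # stand-in word embeddings
W_q = rng.normal(size=(d_model, d_k))      # stand-in for learned weights
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

output, weights = self_attention(X, W_q, W_k, W_v)
print(output.shape)   # (4, 3): one context-aware vector per word
print(weights.shape)  # (4, 4): how much each word attends to every word
```

Row `i` of `weights` shows how strongly word `i` attends to each word in the sentence, including itself, and row `i` of `output` is that word's context-aware vector.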