Attention visualised
How does a language model detect relationships between words?
In the sentence:
"The cat sits on the sofa, because she is tired."
How does the model know that "she" refers to the cat?
The animation shows step by step how modern transformer models compute relationships between words: from tokens via embeddings to Q/K/V vectors and attention weights.
This makes visible how context and meaning emerge from a short sequence of simple mathematical operations.
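The pipeline named above (tokens, embeddings, Q/K/V vectors, attention weights, V-mixing) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code behind the animation: the embeddings and the projection matrices `W_q`, `W_k`, `W_v` are random stand-ins for learned weights, the sizes `d_model` and `d_k` are chosen arbitrarily, and the usual scaled dot-product formulation without any masking is assumed. Because nothing here is trained, the strongest match it prints is not meaningful; in a trained model these weights are what let "she" attend to "cat".

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

tokens = ["The", "cat", "sits", "on", "the", "sofa", ",",
          "because", "she", "is", "tired", "."]
d_model = 8  # embedding width (assumed, kept tiny for readability)
d_k = 4      # width of the query/key/value vectors (assumed)


def random_matrix(rows, cols):
    """Stand-in for learned weights: random numbers, not trained values."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Tokens -> embeddings: one vector of length d_model per token.
embeddings = random_matrix(len(tokens), d_model)

# Embeddings -> Q/K/V: three separate linear projections.
W_q, W_k, W_v = (random_matrix(d_model, d_k) for _ in range(3))
Q = matmul(embeddings, W_q)
K = matmul(embeddings, W_k)
V = matmul(embeddings, W_v)

# Attention scores for one query token: scaled dot product with every key.
query_index = tokens.index("she")
scores = [sum(q * k for q, k in zip(Q[query_index], key)) / math.sqrt(d_k)
          for key in K]

# Scores -> attention weights.
weights = softmax(scores)

# V-mixing: the output for "she" is the weighted sum of all value vectors.
mixed = [sum(w * value[j] for w, value in zip(weights, V))
         for j in range(d_k)]

strongest = max(range(len(tokens)), key=lambda i: weights[i])
print(f"Query token: {tokens[query_index]}")
print(f"Strongest match: {tokens[strongest]} (weight {weights[strongest]:.3f})")
```

The division by the square root of `d_k` keeps the dot products from growing with the vector width, which would otherwise push the softmax towards a single dominant weight.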
Animation: the attention mechanism, showing the data flow from tokens to attention scores, with V-mixing as the closing didactic step.
Step 1 — The sentence
An example sentence, split into individual words.
