Steven Broschart
Attention visualised

How does a language model detect relationships between words?

In the sentence:

"The cat sits on the sofa, because she is tired."

How does the model know that "she" refers to the cat?

The animation shows, step by step, how modern transformer models compute relationships between words: from tokens, via embeddings, to query, key, and value (Q/K/V) vectors and finally to attention weights.

This makes visible how context and meaning emerge from a handful of mathematical operations.
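The pipeline the animation walks through can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the code behind the animation: the sentence is split into words rather than real subword tokens, the embeddings and the projection matrices `W_q`, `W_k`, `W_v` are random stand-ins for learned parameters, and only a single attention head without masking is shown. Because nothing here is trained, the row for "she" will not necessarily point at "cat"; in a trained model, the learned weights are what produce that link.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # Project every embedding into a query, a key, and a value vector.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    # Score: how well each query matches each key (scaled dot product).
    scores = Q @ K.T / np.sqrt(d_k)
    # Attention weights: each row is a probability distribution over tokens.
    weights = softmax(scores, axis=-1)
    # V-mixing: blend the value vectors according to the attention weights.
    output = weights @ V
    return weights, output

# Assumption: word-level split; real models use subword tokenizers.
tokens = ["The", "cat", "sits", "on", "the", "sofa", ",",
          "because", "she", "is", "tired", "."]

# Assumption: random embeddings and projections stand in for trained ones.
rng = np.random.default_rng(0)
d_model, d_k = 8, 4
embeddings = rng.normal(size=(len(tokens), d_model))
W_q = rng.normal(size=(d_model, d_k))
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

weights, output = self_attention(embeddings, W_q, W_k, W_v)

# Inspect which tokens the query token "she" attends to.
query = tokens.index("she")
for tok, w in zip(tokens, weights[query]):
    print(f"{tok:>8s}  {w:.3f}")
```

Each row of `weights` sums to 1, so the printed column can be read as the share of attention that "she" gives to every token in the sentence, including itself.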

Animation: the attention mechanism, with V-mixing (blending the value vectors according to the attention weights) as the closing didactic step

Attention: How a language model decides which words matter when understanding another word.
[Interactive animation: data flow from tokens to attention scores. Step 1, "The sentence": an example sentence, split into individual words. Legend: query token, strongest match, score.]