Steven Broschart
Compute power visualised

Why do language models need GPUs?

Modern language models consist of billions of parameters, and in a standard dense model practically all of them take part in the computation for every single token.

The animation reveals why classical CPUs quickly hit their limits — and why GPUs, with their thousands of compute units, have become so decisive for neural networks.

Real benchmark data lets you compare how much text different systems produce in the same amount of time.

Animation: Why does AI need GPUs - with selectable hardware

Where the compute load lives · Every connection in the network is a multiplication. They are independent of each other — so they can run in parallel. The only question is: across how many cores?
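To make the independence claim concrete, here is a minimal sketch in plain Python. It assumes a tiny, made-up layer (the `weights` and `inputs` values and the function names are purely illustrative, not taken from the animation) and computes each neuron's output once sequentially and once spread across worker threads. Both orders give the same result, which is the property a GPU exploits. Python threads do not actually speed this up, so the example demonstrates order independence only, not performance.

```python
from concurrent.futures import ThreadPoolExecutor

def neuron_output(weights_row, inputs):
    # One multiplication per connection, followed by a sum.
    return sum(w * x for w, x in zip(weights_row, inputs))

def layer_sequential(weights, inputs):
    # Compute the neurons one after another.
    return [neuron_output(row, inputs) for row in weights]

def layer_parallel(weights, inputs, workers=4):
    # Hand each neuron to a worker; no neuron needs another neuron's result.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: neuron_output(row, inputs), weights))

# Hypothetical toy layer: 4 neurons, 3 inputs each (12 connections).
weights = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, 1, 0]]
inputs = [1, 0, 2]

sequential_result = layer_sequential(weights, inputs)
parallel_result = layer_parallel(weights, inputs)
print(sequential_result)                     # [7, 16, 25, 0]
print(sequential_result == parallel_result)  # True
```

Because `pool.map` returns results in input order, the parallel version matches the sequential one exactly, no matter which worker finishes first.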
Note: Apple Silicon is actually an integrated system that combines CPU and GPU on one chip. It is listed here under the GPU category because its GPU, accessed through Apple's Metal API, handles the LLM computation.
Phase 1 — The task
CPU · a few strong cores
A CPU has strong cores. They work in parallel — but there are far too few of them for the data volumes of a language model.
GPU · thousands of small cores
A GPU has thousands of smaller cores. Same principle as the CPU — only with hundreds of times more parallel compute units.
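The effect of the core count can be sketched with simple arithmetic. The snippet below is a rough illustration under loudly stated assumptions: every number in it is a hypothetical placeholder rather than benchmark data, each unit is assumed to perform one multiplication per round, and clock speed, memory bandwidth and vector instructions are ignored entirely.

```python
def rounds_needed(multiplications, parallel_units):
    # Ceiling division: how many rounds until every multiplication is done
    # if each unit handles exactly one multiplication per round.
    return (multiplications + parallel_units - 1) // parallel_units

# Hypothetical placeholder values, chosen only for illustration.
multiplications = 1_000_000  # independent multiplications for one step
cpu_units = 8                # "a few strong cores"
gpu_units = 4_000            # "thousands of small cores"

cpu_rounds = rounds_needed(multiplications, cpu_units)
gpu_rounds = rounds_needed(multiplications, gpu_units)

print(cpu_rounds)                # 125000
print(gpu_rounds)                # 250
print(cpu_rounds // gpu_rounds)  # 500
```

In this toy model the ratio of rounds simply mirrors the ratio of parallel units. Real systems differ, since a single CPU core is faster than a single GPU core and memory access often becomes the bottleneck.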
What do the CPU and GPU write in 10 seconds?