Compute power visualised

Why do language models need GPUs?

Modern language models consist of billions of parameters, and nearly all of them enter the computation again for every single token the model produces.
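To make "every parameter, every token" concrete, here is a minimal sketch of the operation behind most of that work: multiplying a weight matrix by an input vector. The function name `matvec` and the tiny 2×3 matrix are illustrative assumptions, not taken from the animation. Real models do the same thing with matrices holding millions of weights per layer.

```python
def matvec(weights, x):
    """Multiply a weight matrix (a list of rows) by a vector.

    Also counts the multiplications, to show that each weight
    is used exactly once per input vector.
    """
    multiplications = 0
    result = []
    for row in weights:
        total = 0.0
        for w, xi in zip(row, x):
            total += w * xi          # one multiplication per weight
            multiplications += 1
        result.append(total)
    return result, multiplications


# Hypothetical toy layer: 2 outputs, 3 inputs, so 6 weights in total.
weights = [[0.5, -1.0, 2.0],
           [1.5, 0.0, -0.5]]
x = [1.0, 2.0, 3.0]

y, count = matvec(weights, x)
print(y)      # [4.5, 0.0]
print(count)  # 6
```

The count equals the number of weights. As a rough rule of thumb, under the assumption of a dense model and ignoring smaller terms, a model with billions of parameters therefore needs on the order of billions of multiplications per token.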

The animation reveals why classical CPUs quickly hit their limits - and why GPUs, with their thousands of compute units, have become so decisive for neural networks.

Real benchmark data lets you compare how much text different systems produce in the same amount of time.
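The comparison itself is simple arithmetic: throughput in tokens per second multiplied by a fixed time window. The sketch below assumes a 10-second window, and the two throughput values are placeholders chosen only for illustration. They are not the benchmark data used in the animation.

```python
def tokens_in_window(tokens_per_second, seconds=10.0):
    """Number of whole tokens a system produces in the given time window."""
    return int(tokens_per_second * seconds)


# Placeholder throughputs for illustration only, NOT measured values.
systems = {
    "example CPU": 4.0,
    "example GPU": 60.0,
}

for name, tps in systems.items():
    print(f"{name}: {tokens_in_window(tps)} tokens in 10 s")
```

With real measurements substituted for the placeholders, the same calculation gives the amount of text each system writes in the same time.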

Model

CPU

GPU

Note: Apple Silicon is actually an integrated system that combines CPU and GPU on one chip. It is listed here under the GPU category because its GPU, accessed through Metal, handles the LLM computation.


Phase 1 - The task

CPU · a few strong cores

A CPU has a few powerful cores. They do work in parallel, but there are far too few of them for the volume of data a language model has to process.

GPU · thousands of small cores

A GPU has thousands of smaller cores. The principle is the same as on the CPU, only with hundreds of times more compute units working in parallel.
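A deliberately simplified model shows why the number of parallel units matters: if every unit performs one multiplication per round, the number of rounds is the workload divided by the unit count, rounded up. The core counts and the workload below are hypothetical. The model also treats every unit as equally fast, which understates the CPU, whose individual cores are considerably stronger than a single GPU core.

```python
def rounds_needed(n_multiplications, parallel_units):
    """Rounds required if each unit does one multiplication per round."""
    # Integer ceiling division: a partially filled last round still counts.
    return (n_multiplications + parallel_units - 1) // parallel_units


CPU_CORES = 8        # hypothetical: a few strong cores
GPU_CORES = 4096     # hypothetical: thousands of small cores
WORK = 1_000_000     # hypothetical number of multiplications

print(rounds_needed(WORK, CPU_CORES))  # 125000
print(rounds_needed(WORK, GPU_CORES))  # 245
```

Even if each CPU core were many times faster per round, the gap in rounds is large enough that the GPU would still finish first in this toy model.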

Multiplications

CPU per token

GPU per token

What do the CPU and GPU write in 10 seconds?

Model output:



Animation: Why does AI need GPUs - with selectable hardware