IV — What Makes an LLM → Chapter 15
FROM SYSTEMS TO FRONTIER ML

The GPT architecture, end to end

Decoder-only stack, the residual stream as a highway, logits → sampling. Encoders vs decoders vs encoder-decoder — and why decoder-only won.

§1 The decoder-only stack — Llama 2 7B, end to end §2 Encoder vs decoder vs encoder-decoder §3 Sampling — from logits to tokens

← ALL CHAPTERS