Speculative Execution

Speculative execution is the technique of carrying out work before the processor is certain the work will be needed, on the bet that it usually will be. The most common case is branch speculation: when the processor predicts the direction of a conditional branch, it does not merely fetch down the predicted path, it actually executes those instructions. If the prediction proves correct, the results are kept and the work is already done; if it proves wrong, the speculative results are discarded as if they never happened, and the processor restarts down the correct path. Speculation turns a branch predictor from a fetch hint into a genuine performance engine, because it lets the machine fill its execution units with useful work across branch boundaries.

Hennessy and Patterson’s “Computer Architecture: A Quantitative Approach” treats speculation as the natural extension of branch prediction and out-of-order execution: a reorder buffer holds each speculative result until the branch it depends on is resolved, at which point the result is either committed in program order or squashed. Because the architectural state is updated only at retirement, a correct implementation guarantees that wrong-path instructions leave no architecturally visible effect on registers or memory. This in-order commit on top of out-of-order, speculative execution is what reconciles aggressive performance with the simple sequential model the programmer sees.
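
The commit-or-squash behavior of a reorder buffer can be sketched as a toy software model. This is only an illustration under simplifying assumptions: a single outstanding branch, a fixed-size buffer, and invented names (Core, RobEntry, rob_issue, rob_resolve_branch, rob_commit) that do not correspond to any real processor or to the textbook's notation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_REGS 4
#define ROB_SIZE 8

/* One in-flight instruction whose result is waiting to be committed. */
typedef struct {
    int  dest_reg;     /* architectural register written at commit */
    int  value;        /* result already computed, held in the buffer */
    bool speculative;  /* true while it depends on an unresolved branch */
} RobEntry;

typedef struct {
    int      regs[NUM_REGS];     /* architectural state, written only at commit */
    RobEntry entries[ROB_SIZE];  /* entries[0] is the oldest instruction */
    size_t   count;
} Core;

/* Execute an instruction: record its result in the buffer, not in regs. */
bool rob_issue(Core *c, int dest_reg, int value, bool speculative) {
    assert(dest_reg >= 0 && dest_reg < NUM_REGS);
    if (c->count == ROB_SIZE) {
        return false;  /* buffer full, the instruction must wait */
    }
    c->entries[c->count++] = (RobEntry){dest_reg, value, speculative};
    return true;
}

/* Resolve the (single, assumed) outstanding branch. A correct prediction
 * makes speculative entries ordinary; a wrong one squashes them. */
void rob_resolve_branch(Core *c, bool prediction_correct) {
    size_t kept = 0;
    for (size_t i = 0; i < c->count; i++) {
        if (c->entries[i].speculative && !prediction_correct) {
            continue;  /* wrong-path entry: dropped without touching regs */
        }
        c->entries[i].speculative = false;
        c->entries[kept++] = c->entries[i];
    }
    c->count = kept;
}

/* Commit in program order from the head, stopping at the first entry that
 * is still speculative. Returns the number of instructions committed. */
size_t rob_commit(Core *c) {
    size_t n = 0;
    while (n < c->count && !c->entries[n].speculative) {
        c->regs[c->entries[n].dest_reg] = c->entries[n].value;
        n++;
    }
    for (size_t i = n; i < c->count; i++) {
        c->entries[i - n] = c->entries[i];
    }
    c->count -= n;
    return n;
}
```

In this model a speculative entry can sit in the buffer indefinitely without changing regs, and a misprediction removes it before rob_commit ever sees it, which is the property the paragraph above describes.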

The crucial subtlety, and the source of a major class of security flaws, is that “no visible effect” was only ever true of architectural state. Speculatively executed instructions do leave microarchitectural traces, most importantly in the cache. A speculative load can pull data into the cache, and even after the speculative work is discarded, that cache state persists: a later access to the same line completes measurably faster, so the trace can be detected by timing.
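
The gap between architectural and microarchitectural state can be made concrete with a deliberately tiny model. The sketch below is an assumption-laden illustration, not a measurement: the Machine type, the eight-line cache, and the 4-cycle and 200-cycle latencies are all invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINES 8
#define HIT_CYCLES  4    /* illustrative latency, not a measured value */
#define MISS_CYCLES 200  /* illustrative latency, not a measured value */

typedef struct {
    int  reg;                       /* architectural state: one register */
    bool line_cached[CACHE_LINES];  /* microarchitectural state */
} Machine;

/* Access a cache line and report the modeled latency. A miss fills the
 * line, so the next access to it is a hit. */
int load_latency(Machine *m, size_t line) {
    assert(line < CACHE_LINES);
    if (m->line_cached[line]) {
        return HIT_CYCLES;
    }
    m->line_cached[line] = true;
    return MISS_CYCLES;
}

/* Model a load that executes speculatively and is then squashed. */
void speculative_load_then_squash(Machine *m, size_t line, int loaded_value) {
    (void)load_latency(m, line);            /* the cache fill really happens */
    int speculative_result = loaded_value;  /* held outside architectural state */
    (void)speculative_result;               /* squash: m->reg is never written */
    /* Nothing here undoes m->line_cached[line]; that is the leak. */
}
```

After a squashed load of some line, the register is unchanged, yet load_latency for that line reports a hit while an untouched line reports a miss. A real attacker observes the same difference with a timer instead of a return value.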

This is precisely the mechanism that the 2018 Spectre paper, “Spectre Attacks: Exploiting Speculative Execution” by Kocher and colleagues, weaponized. The attack mistrains the branch predictor so that the victim speculatively executes code that reads secret data and uses it to index a memory array; the discarded speculation leaves a cache footprint from which a timing side channel recovers the secret. Because speculation past branches is built into nearly every high-performance processor from Intel, AMD, and ARM, the vulnerability was effectively industry-wide.
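
The victim-side code pattern is short enough to show. The sketch below is modeled on the bounds-check-bypass example usually associated with the paper; the array sizes, contents, and the 4096-byte stride are assumptions chosen for illustration, and no attacker code is included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed spacing so that each possible byte value selects a distinct
 * region of array2, and therefore a distinct cache line. */
#define PROBE_STRIDE 4096

uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
size_t  array1_size = 16;
uint8_t array2[256 * PROBE_STRIDE];

/* Architecturally safe: an out-of-range x never reaches the loads.
 * Microarchitecturally, if the branch is predicted taken while the
 * comparison is still being evaluated, both loads may run speculatively
 * with an out-of-range x, and the second load's address depends on the
 * byte that the first load read. */
uint8_t victim_function(size_t x) {
    uint8_t y = 0;
    if (x < array1_size) {
        y = array2[array1[x] * PROBE_STRIDE];
    }
    return y;
}
```

Nothing in this function is a bug in the conventional sense, and every call returns the value the C semantics require. The leak exists only in which line of array2 ends up cached after a mispredicted run.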

Spectre and the related Meltdown attack forced a reassessment of a decades-old assumption that microarchitectural state could be ignored for security. Mitigations range from compiler-inserted speculation barriers to hardware changes that limit what can be speculatively accessed, but the underlying tension remains: speculation is fundamental to processor performance, and constraining it costs speed.
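
One software mitigation of the kind mentioned above is to clamp an index with arithmetic rather than trusting the branch alone, so that even a wrong-path execution reads an in-bounds element. The following is a minimal sketch, assuming both the index and the size fit in the lower half of the size_t range; the names index_mask, read_clamped, and table are hypothetical. A production version would also need a compiler barrier or vendor intrinsic so the optimizer cannot prove the mask redundant and remove it, which is omitted here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns all-ones when index < size and zero otherwise, using only
 * arithmetic on the values (no branch in the source).
 * Assumption: index and size are both at most SIZE_MAX / 2. */
size_t index_mask(size_t index, size_t size) {
    const unsigned top_bit = (unsigned)(sizeof(size_t) * CHAR_BIT - 1);
    /* size - 1 - index wraps to a value with the top bit set exactly
     * when index >= size, so the complement's top bit is 1 only for
     * in-bounds indices. */
    size_t in_bounds = (~(index | (size - 1 - index)) >> top_bit) & 1u;
    return (size_t)0 - in_bounds;
}

uint8_t table[16];
size_t  table_size = 16;

/* Bounds-checked read whose index is also masked. If the branch is
 * mispredicted, the mask is computed from the real x and forces the
 * speculative access to element 0 instead of an attacker-chosen address. */
uint8_t read_clamped(size_t x) {
    assert(x <= SIZE_MAX / 2);
    if (x < table_size) {
        x &= index_mask(x, table_size);
        return table[x];
    }
    return 0;
}
```

The mask adds a few dependent arithmetic instructions to every access, a small instance of the trade-off described above: the safer path is also the slower one.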
