A 1.3B InstructGPT was preferred over the 175B GPT-3

fact

In the InstructGPT paper, “Training language models to follow instructions with human feedback” (Ouyang et al., 2022), the authors report that, in human evaluations on their prompt distribution, labelers preferred outputs from the 1.3-billion-parameter InstructGPT model over outputs from the 175-billion-parameter GPT-3, even though InstructGPT had more than 100 times fewer parameters. The result suggests that fine-tuning a model with human feedback can matter more for perceived output quality than raw parameter count.
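As a quick check on the size gap described above, the ratio can be worked out from the two parameter counts given in the text. This is only an illustrative sketch: the variable names are made up for the example, and the counts are the rounded figures quoted here, not exact values.

```python
# Parameter counts as quoted in the text, in billions (rounded figures).
instructgpt_params_b = 1.3
gpt3_params_b = 175.0

# How many times larger GPT-3 is than the 1.3B InstructGPT model.
ratio = gpt3_params_b / instructgpt_params_b

print(f"GPT-3 has about {ratio:.0f}x as many parameters as the 1.3B InstructGPT")
```

With these rounded figures the ratio comes out a little above 100, which is consistent with the "100x fewer parameters" shorthand.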

Sources

PRIMARY https://arxiv.org/abs/2203.02155

Last verified June 6, 2026
