InstructGPT's human feedback came from about 40 contractors

fact

In the InstructGPT paper, “Training language models to follow instructions with human feedback,” OpenAI reports that it “hired a team of about 40 contractors on Upwork and through ScaleAI” to produce the demonstration and comparison data used to align the model. These labelers wrote example responses and ranked model outputs. The paper notes that they were selected through a screening test that measured their sensitivity to harmful content and their agreement rate with researchers. The authors acknowledge that this small, mostly English-speaking group was “clearly not representative of the full spectrum of people who will use and be affected by our deployed models.” The training method that ChatGPT later adopted thus rested, in this paper, on the judgments of roughly forty people.

Sources

PRIMARY https://arxiv.org/abs/2203.02155

Last verified June 7, 2026

