EleutherAI announced GPT-NeoX-20B, a 20-billion-parameter autoregressive language model, in February 2022; the weights became available for download under an Apache 2.0 license on February 9, and a full paper followed in April. At the time of its release it was, in EleutherAI’s words, the largest publicly accessible pretrained general-purpose autoregressive language model, a notable achievement for a grassroots collective that had begun as a Discord server in 2020.
GPT-NeoX-20B was trained, in partnership with CoreWeave, on the Pile, the 825 GiB openly documented dataset EleutherAI had assembled. The team released the weights along with the training and evaluation code under a permissive license. The model proved a particularly strong few-shot reasoner: it gained more from five-shot evaluation than comparably sized GPT-3 and FairSeq models did.
The release continued EleutherAI’s project of building open alternatives to the closed models coming out of large commercial labs. Coming just before Meta’s OPT-175B and BigScience’s BLOOM, GPT-NeoX-20B was part of a wave that put capable large language models into public hands and demonstrated that a volunteer-driven group could operate at the frontier of open model scale.