Meta releases OPT-175B with a public training logbook

In May 2022, Meta AI released OPT (Open Pre-trained Transformer), a suite of decoder-only language models ranging from 125 million to 175 billion parameters. The flagship, OPT-175B, matched GPT-3 in scale and, Meta said, performed comparably while incurring about one-seventh of GPT-3's carbon footprint during development. The smaller models were released for download, while access to the OPT-175B weights was granted to researchers on request under a noncommercial license, an unusually open move for a model of that size at the time.

The release was notable as much for its transparency as its scale. Alongside the weights and code, Meta published a logbook documenting the day-to-day reality of training a 175-billion-parameter model: hardware failures, loss spikes, restarts, and the engineering judgment calls made along the way. That candor was rare in a field where large training runs were usually described only after they succeeded.

OPT-175B, followed two months later by BigScience’s BLOOM, opened the era of GPT-3-scale open models. Together they let outside researchers probe how very large language models behave without going through a commercial API, and helped set expectations for documenting training rather than only reporting final benchmarks.
