arXiv

arXiv is the open-access preprint repository that has become the de facto publishing channel for AI research. Its about page states plainly that “arXiv was founded by Paul Ginsparg in 1991” and describes it as “a curated research-sharing platform open to anyone.” It began at Los Alamos as a way for physicists to circulate papers instantly, was hosted for decades by Cornell University, and now holds more than three million scholarly articles across subjects including computer science, mathematics, and statistics.

In machine learning, arXiv reshaped the rhythm of the field. Rather than waiting months for a conference or journal to accept a paper, researchers post to arXiv the day a result is ready, and the community reads, cites, and builds on it immediately. The Transformer paper “Attention Is All You Need” appeared as an arXiv preprint in June 2017; ResNet, BERT, GPT-3, and many other landmark results also circulated as arXiv preprints first. Conference acceptance then functions as a later stamp of peer review on work the field has often already absorbed.

This speed comes with a trade-off: arXiv applies light moderation, not peer review, so preprints can contain errors or overstated claims that only later scrutiny corrects. The platform is now establishing itself as an independent non-profit after its long partnership with Cornell.

Why business readers should care: arXiv is where you can read the primary source behind an AI headline, free of charge and often weeks before the press coverage, which makes it the single most useful place to check what a new model or method really claims.

Sources

Last verified June 7, 2026