Aggregated Residual Transformations (ResNeXt)

“Aggregated Residual Transformations for Deep Neural Networks” was submitted to arXiv in November 2016 by Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He, then at UC San Diego and Facebook AI Research. It introduced the architecture known as ResNeXt and, with it, a new axis along which to scale networks.

Conventional wisdom held that you made a network more capable by making it deeper or wider. ResNeXt added a third axis the authors called cardinality: the number of parallel transformation paths inside a block. Each block sends its input through many branches of identical topology, each of which projects the input down to a low-dimensional embedding and transforms it, and then sums the branch outputs. The design is borrowed in spirit from the split-transform-merge strategy of the Inception module, but made uniform and repeatable rather than hand-tuned per layer. The striking finding was that, at a matched compute and parameter budget, raising cardinality improved accuracy, and that when capacity was increased, adding cardinality was more effective than going deeper or wider. It achieved this with a simpler, more regular structure.
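To make the idea of a fixed budget concrete, the short Python sketch below compares the weight counts of a conventional bottleneck block and an aggregated block. It assumes the block template described in the paper (256 input and output channels, a bottleneck width of 64 for the conventional block, and 32 branches of width 4 for the aggregated one). The function names are illustrative, and biases and normalization parameters are ignored.

```python
def bottleneck_params(channels: int, width: int) -> int:
    """Weights in a 1x1 -> 3x3 -> 1x1 residual bottleneck (biases ignored)."""
    reduce = channels * width          # 1x1 conv: channels -> width
    spatial = 3 * 3 * width * width    # 3x3 conv: width -> width
    expand = width * channels          # 1x1 conv: width -> channels
    return reduce + spatial + expand


def aggregated_params(channels: int, cardinality: int, branch_width: int) -> int:
    """Weights in `cardinality` parallel branches, each a narrow bottleneck."""
    return cardinality * bottleneck_params(channels, branch_width)


# Assumed configuration: 256 channels, width 64 vs. 32 branches of width 4.
resnet_block = bottleneck_params(256, 64)
resnext_block = aggregated_params(256, 32, 4)
print(resnet_block, resnext_block)  # 69632 70144
```

Both blocks come to roughly 70,000 weights, so under these assumptions any accuracy difference between them comes from how the capacity is arranged rather than how much of it there is.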

ResNeXt placed second in the 2016 ImageNet classification challenge (ILSVRC 2016) and became a widely used backbone, helped by an equivalent grouped-convolution formulation that made the many-branch design efficient on real hardware. It also reframed an architectural debate, suggesting that how a network divides its work, not just how big it is, is a first-class design lever.
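The equivalence between the many-branch form and the grouped form can be checked on a toy example. The sketch below is a deliberate simplification, not the paper's implementation: it looks at a single spatial position, so each convolution reduces to a matrix multiply; it omits nonlinearities and normalization; and its sizes, helper names, and random integer weights are all hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    """Small integer weights so the comparison below is exact."""
    return [[rng.randint(-3, 3) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_out, cardinality, d = 8, 8, 4, 2  # toy sizes, chosen for illustration

# Each branch holds (reduce, transform, expand) weights.
branches = [
    (rand_matrix(d, n_in, rng), rand_matrix(d, d, rng), rand_matrix(n_out, d, rng))
    for _ in range(cardinality)
]
x = [rng.randint(-3, 3) for _ in range(n_in)]

# Multi-branch form: run every path separately, then sum the outputs.
branch_outputs = [matvec(e, matvec(t, matvec(r, x))) for r, t, e in branches]
multi_branch = [sum(values) for values in zip(*branch_outputs)]

# Grouped form: one wide reduce, a per-group transform, one wide expand.
wide_reduce = [row for r, _, _ in branches for row in r]  # (C*d) x n_in
hidden = matvec(wide_reduce, x)
grouped_hidden = []
for g, (_, t, _) in enumerate(branches):
    # Each group sees only its own slice of the hidden channels.
    grouped_hidden += matvec(t, hidden[g * d:(g + 1) * d])
# Concatenate the branches' expand weights column-wise: n_out x (C*d).
wide_expand = [sum((e[i] for _, _, e in branches), []) for i in range(n_out)]
grouped = matvec(wide_expand, grouped_hidden)

assert multi_branch == grouped
```

The grouped form replaces many small operations with three larger ones, which is the shape of work that hardware and deep learning libraries tend to handle efficiently.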

For a business reader, ResNeXt is another reminder that AI progress is not only about scale: identifying a new, cleanly tunable dimension of a model can yield better results without simply spending more.

Last verified June 7, 2026