PyTorch

PyTorch is an open-source deep learning framework first released by Facebook AI Research in 2016. It grew out of Torch, an older scientific computing framework with a numeric core written in C and a scripting layer in Lua, and rebuilt that core around Python and a NumPy-like tensor library. The result was a framework that felt like ordinary Python numerical code but carried the two ingredients deep learning needs: hardware acceleration on GPUs and automatic differentiation.

The defining design choice was define-by-run, also called eager execution. Where the first generation of TensorFlow asked you to build a static graph and then execute it separately, PyTorch executes each operation as the Python interpreter reaches it and builds the differentiation graph dynamically as the program runs. This is the central argument of the NeurIPS 2019 paper “PyTorch: An Imperative Style, High-Performance Deep Learning Library” (Paszke et al.): that an imperative, Pythonic style which makes models easy to write and debug does not have to cost performance, and that careful engineering of the underlying subsystems can deliver both.
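
As a minimal sketch of what define-by-run means in practice, the snippet below runs a few tensor operations and inspects the results right away. The variable names are illustrative only; the point is that each line produces concrete values as soon as it executes, and that the differentiation graph is recorded as a side effect.

```python
import torch

# A small input that autograd should track.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Each operation executes immediately; there is no separate
# "build the graph, then run it" step.
y = x * 2
print(y)  # concrete values are available right away

z = y.sum()

# The differentiation graph was recorded while the operations ran.
# Each result remembers the operation that produced it.
print(y.grad_fn)  # backward node for the multiplication
print(z.grad_fn)  # backward node for the sum
```

Because the values exist as soon as each line runs, ordinary Python tools such as print or a debugger work on them directly.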

Two abstractions carry most of the framework. The Tensor is a multidimensional array that behaves much like a NumPy array but can live on a GPU and participate in gradient tracking. Autograd is the automatic differentiation engine: as operations run on tensors that require gradients, it records them, and a single call to backward walks that record in reverse to compute derivatives. Because the graph is rebuilt on every forward pass, control flow such as Python loops and conditionals can vary per input, which made dynamic models and rapid experimentation natural.
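
The interaction between tensors, autograd, and per-input control flow can be sketched as follows. The function double_until_large is a hypothetical example, not part of PyTorch: it runs a Python while loop whose iteration count depends on the input values, and backward then differentiates through whatever path was actually taken.

```python
import torch

# Tensors can live on a GPU when one is available; fall back to the CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"


def double_until_large(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical function: double x until its norm reaches 10, then sum."""
    y = x
    # The number of iterations depends on the data, so the recorded
    # graph differs from one input to the next.
    while y.norm() < 10:
        y = y * 2
    return y.sum()


a = torch.tensor([1.0, 1.0], device=device, requires_grad=True)
b = torch.tensor([4.0, 4.0], device=device, requires_grad=True)

# A single call to backward walks the recorded operations in reverse.
double_until_large(a).backward()
double_until_large(b).backward()

# a was doubled three times, b only once, so the gradients differ.
print(a.grad)  # each entry is 8
print(b.grad)  # each entry is 2
```

The two inputs take different numbers of loop iterations, and the gradients reflect that without any graph-level loop construct.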

This dynamic, debuggable style is what won research adoption. A researcher could drop a print statement or a debugger breakpoint into the middle of a model and inspect real tensor values, because the model was just running Python. By the end of the 2010s PyTorch had overtaken the earlier static-graph frameworks as the most common tool in academic papers and machine-learning research labs.

Production was initially PyTorch’s weaker side, since a model that is just a running Python program is harder to export and optimize. The project answered this first with TorchScript, whose tracing and scripting tools (torch.jit.trace and torch.jit.script) capture a model into a serializable, deployable form, and later with compiler work, exposed as torch.compile in PyTorch 2.0, that fuses and optimizes the eager program without forcing the author back into a static-graph world. In 2022 the project moved under the independent PyTorch Foundation, hosted by the Linux Foundation, cementing its place as one of the two frameworks, alongside TensorFlow, that define modern deep-learning software.
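
The tracing and scripting route can be sketched with TorchScript's torch.jit.trace and torch.jit.script. The TinyNet module below is a hypothetical stand-in for a real model, and the in-memory buffer stands in for a file on disk. This is an illustration under those assumptions, not a recommended deployment recipe.

```python
import io

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical two-layer-free toy model used only for illustration."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.linear(x))


model = TinyNet().eval()
example = torch.randn(3, 4)

# Tracing runs the model once on an example input and records the operations.
traced = torch.jit.trace(model, example)

# Scripting compiles the Python source of forward, so control flow is preserved.
scripted = torch.jit.script(model)

# The captured model can be serialized and loaded without the Python class.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
restored = torch.jit.load(buffer)

with torch.no_grad():
    eager_out = model(example)
    traced_out = traced(example)
    restored_out = restored(example)

print(torch.allclose(eager_out, traced_out))
print(torch.allclose(eager_out, restored_out))
```

Tracing only records the operations executed for the example input, so data-dependent branches are frozen to the path taken during the trace. Scripting is the option that preserves such branches.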
