“Neural Ordinary Differential Equations,” submitted to arXiv on June 19, 2018 by Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud, proposed a new way to think about deep networks. Instead of stacking a fixed number of discrete layers, the authors parameterize the derivative of the hidden state with a neural network and compute the output by numerically integrating those dynamics with an off-the-shelf ODE solver. The paper won a best paper award at NeurIPS 2018.
The idea grows out of an observation about residual networks: each residual layer adds a small update to the hidden state, which looks like one Euler step in the numerical solution of a differential equation. Taking that to its limit, with more layers and smaller steps, gives a continuous-depth model. Rather than choosing how many layers to use, you let an adaptive solver decide how finely to evaluate the dynamics for each input, trading precision against speed.
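To make the residual-network analogy concrete, here is a minimal sketch in plain Python. It assumes a tiny two-unit dynamics function with fixed, made-up weights (`W`, `B`) and uses fixed-step Euler updates rather than the adaptive solver the paper relies on; the function names are illustrative and do not come from the paper.

```python
import math

# Hypothetical fixed weights for a tiny 2-unit "layer"; in a real model these are learned.
W = [[0.0, -1.0], [1.0, 0.0]]
B = [0.1, -0.2]

def dynamics(h, t):
    """f(h, t): the network that parameterizes dh/dt (t is unused in this toy)."""
    return [math.tanh(sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(W, B)]

def residual_forward(h0, num_layers, t0=0.0, t1=1.0):
    """Apply num_layers residual updates h <- h + dt * f(h, t), i.e. forward Euler."""
    dt = (t1 - t0) / num_layers
    h, t = list(h0), t0
    for _ in range(num_layers):
        dh = dynamics(h, t)
        h = [x + dt * d for x, d in zip(h, dh)]
        t += dt
    return h

def max_abs_diff(a, b):
    """Largest componentwise gap between two hidden states."""
    return max(abs(x - y) for x, y in zip(a, b))

h0 = [1.0, 0.5]
# A very deep stack stands in for the continuous-depth limit.
reference = residual_forward(h0, 20000)
# Gap between shallower stacks and that limit, keyed by layer count.
errors = {n: max_abs_diff(residual_forward(h0, n), reference)
          for n in (2, 8, 32, 128)}
```

As the number of layers grows and the step shrinks, the output of the residual stack approaches a single limiting value, which is the solution of the ODE at the final time. An adaptive solver would choose the step sizes itself to meet a requested error tolerance instead of taking a fixed layer count.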
The authors demonstrated three main benefits: constant memory cost during training, adaptive computation per input, and a new class of generative models called continuous normalizing flows. The constant memory cost comes from the adjoint method, which computes gradients by solving a second ODE backward in time instead of storing the solver's intermediate states.
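The adjoint idea can be illustrated on a toy scalar problem. The sketch below assumes the dynamics dh/dt = theta * h and the loss L = h(T), both chosen here only because the exact gradient, T * h0 * exp(theta * T), is known in closed form. It uses fixed-step Euler integration for simplicity, not the adaptive solver used in the paper, and all names are hypothetical.

```python
import math

def f(h, theta):
    """Toy dynamics dh/dt = theta * h (an assumption for illustration)."""
    return theta * h

def forward(h0, theta, T, steps):
    """Integrate h from 0 to T with Euler steps; only the final state is kept."""
    dt = T / steps
    h = h0
    for _ in range(steps):
        h += dt * f(h, theta)
    return h

def adjoint_gradient(h_T, theta, T, steps):
    """Estimate dL/dtheta for L = h(T) by integrating backward from T to 0.

    Three quantities are stepped together: the state h (recomputed, not stored),
    the adjoint a = dL/dh, and the running gradient integral.
    """
    dt = T / steps
    h, a, grad = h_T, 1.0, 0.0      # a(T) = dL/dh(T) = 1 for L = h(T)
    for _ in range(steps):
        # For this toy system: df/dh = theta and df/dtheta = h.
        grad += dt * a * h          # accumulate the integral of a * df/dtheta
        a_prev = a + dt * a * theta # da/dt = -a * df/dh, stepped backward in time
        h = h - dt * f(h, theta)    # recover the earlier state by reversing the dynamics
        a = a_prev
    return grad

h0, theta, T = 2.0, 0.5, 1.0
h_T = forward(h0, theta, T, steps=10000)
grad = adjoint_gradient(h_T, theta, T, steps=10000)
exact = T * h0 * math.exp(theta * T)
```

Memory use in `adjoint_gradient` does not depend on the number of steps, because the backward pass carries only the current values of `h`, `a`, and `grad`. The price is extra computation to recover the state on the way back, and the recovered state is only as accurate as the backward integration.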
Neural ODEs matter because they bridge deep learning with the mature mathematics of differential equations, which is central to physics, biology, and finance. The framework is especially natural for irregularly sampled time series, such as clinical records, where events do not arrive on a fixed grid.