Gaussian Processes for Machine Learning

Carl Edward Rasmussen and Christopher Williams published this book through MIT Press in 2006, and it became the definitive treatment of Gaussian processes as a machine learning method. The authors made the full text freely available online, which helped the approach spread widely among practitioners.

A Gaussian process is a way of putting a probability distribution directly over functions rather than over a fixed set of parameters. Instead of assuming the data follow, say, a straight line with unknown slope, you use a covariance function (also called a kernel) to specify how smooth or how wiggly plausible functions should be, and the model then represents your uncertainty about the underlying function everywhere. The book lays out how to do regression and classification this way and, crucially, how the model returns not just predictions but calibrated uncertainty about those predictions.
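To make the idea concrete, here is a minimal sketch of Gaussian process regression in NumPy. It assumes a zero-mean prior, a squared-exponential covariance function, and Gaussian observation noise, and it uses a Cholesky factorization to evaluate the standard posterior equations. The function names, toy data, and hyperparameter values (`length_scale`, `signal_var`, `noise_var`) are illustrative assumptions, not code taken from the book.

```python
import numpy as np

def sq_exp_kernel(xa, xb, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test,
                 length_scale=1.0, signal_var=1.0, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    K = sq_exp_kernel(x_train, x_train, length_scale, signal_var)
    K_s = sq_exp_kernel(x_train, x_test, length_scale, signal_var)

    # Factorize the noisy training covariance once: K + noise_var * I = L L^T.
    L = np.linalg.cholesky(K + noise_var * np.eye(len(x_train)))

    # alpha = (K + noise_var * I)^{-1} y, via two triangular systems.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha

    # Prior variance minus the reduction explained by the training data.
    v = np.linalg.solve(L, K_s)
    var = signal_var - np.sum(v**2, axis=0)
    return mean, var

# Hypothetical toy data: five noise-free samples of a sine curve.
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = np.sin(x_train)

# One test point inside the data range, one far outside it.
x_test = np.array([0.5, 6.0])
mean, var = gp_posterior(x_train, y_train, x_test)

for x, m, s in zip(x_test, mean, np.sqrt(var)):
    print(f"x = {x:4.1f}   mean = {m:+.3f}   std = {s:.3f}")
```

The test point at 6.0 lies well outside the training inputs, so its predictive variance falls back toward the prior variance `signal_var`, while the point at 0.5 sits between observations and gets a much narrower error bar. The variance returned here describes the underlying function; adding `noise_var` would give the variance of a new noisy observation.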

Beyond the core method, the book treats theoretical issues such as learning curves and the connection to other kernel machines, and it gives practical recipes for approximation when datasets are large. These matter because exact inference requires factorizing an n × n covariance matrix for n data points, which costs on the order of n³ time and n² memory.
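As an illustration of what such an approximation can look like, the sketch below implements the predictive mean of one common sparse scheme, the subset of regressors, in which a small set of m inducing inputs stands in for the full training set. Only an m × m system is solved, so the cost grows roughly as n·m² rather than n³. The choice of this particular scheme, the function names, the evenly spaced inducing inputs, the jitter term, and the toy data are all assumptions made for the example.

```python
import numpy as np

def sq_exp_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale**2)

def sr_predict_mean(x_train, y_train, x_inducing, x_test,
                    noise_var=1e-2, jitter=1e-6):
    """Subset-of-regressors approximation to the GP predictive mean."""
    K_mm = sq_exp_kernel(x_inducing, x_inducing)   # m x m
    K_mn = sq_exp_kernel(x_inducing, x_train)      # m x n
    K_sm = sq_exp_kernel(x_test, x_inducing)       # n_test x m

    # Only an m x m matrix is formed and solved; forming it costs O(n m^2).
    A = K_mn @ K_mn.T + noise_var * K_mm
    A += jitter * np.eye(len(x_inducing))          # small term for numerical stability
    weights = np.linalg.solve(A, K_mn @ y_train)
    return K_sm @ weights

# Hypothetical toy data: 2000 noisy samples of a sine curve.
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=2000)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(2000)

# Ten evenly spaced inducing inputs summarize the 2000 training points.
x_inducing = np.linspace(-3.0, 3.0, 10)
x_test = np.array([-1.0, 0.5, 2.0])

approx_mean = sr_predict_mean(x_train, y_train, x_inducing, x_test)
print(np.column_stack([x_test, approx_mean, np.sin(x_test)]))
```

One caveat with this scheme is that its predictive variance can shrink toward zero far from the inducing inputs, which is the opposite of the behavior wanted from a Gaussian process, so the sketch reports only the mean.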

For a general reader, the appeal of Gaussian processes is honest uncertainty. A Gaussian process knows when it is extrapolating into unfamiliar territory and says so by widening its error bars, which makes the approach valuable in settings like experimental design and Bayesian optimization where knowing what you do not know is as important as the prediction itself.
