A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting

“A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting” by Yoav Freund and Robert Schapire appeared in the Journal of Computer and System Sciences, volume 55, in 1997. It introduced AdaBoost (adaptive boosting), which won its authors the 2003 Gödel Prize and became one of the most influential machine learning algorithms.

Boosting answers a surprising theoretical question: if you only have access to “weak” learners that classify slightly better than random guessing, can you combine them into a single strong classifier? AdaBoost shows how. It keeps a weight on every training example and trains a sequence of weak classifiers. After each round it increases the weight on the examples the newest weak classifier got wrong and decreases the weight on the ones it got right, forcing the next weak learner to focus on the hard cases. The final prediction is a weighted vote of all the weak classifiers, in which classifiers with lower weighted error get a larger say.
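The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's own pseudocode: it assumes labels of +1 and -1, one-dimensional inputs, and threshold "stumps" as the weak learners, and it uses the common textbook form of the vote weight, alpha = 0.5 * ln((1 - err) / err), which may differ in notation from the original paper. The names train_stump, stump_predict, adaboost, and predict are hypothetical and chosen only for this example.

```python
import math


def stump_predict(x, threshold, polarity):
    """A stump says `polarity` when x >= threshold and the opposite otherwise."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Weak learner (assumed for this sketch): try every threshold and both
    polarities, and keep the stump with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w
                for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) weighted stumps."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform weights on the examples
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples (y * h(x) == -1) are scaled up by exp(alpha),
        # correctly classified ones are scaled down by exp(-alpha).
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]  # renormalize to sum to 1
    return ensemble


def predict(ensemble, x):
    """Final prediction: the sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Toy data that no single threshold can separate.
    xs = list(range(10))
    ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
    ensemble = adaboost(xs, ys, rounds=3)
    print([predict(ensemble, x) for x in xs])
```

On this toy data the best single stump misclassifies three of the ten points, while the three-round weighted vote fits all ten, which is the weak-to-strong effect in miniature. The exhaustive threshold search is only practical for tiny examples; a real weak learner would be something like a shallow decision tree.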

In practice AdaBoost, often run with shallow decision trees as the weak learners, proved remarkably accurate and often resistant to overfitting, although it can be sensitive to noisy labels and outliers. It launched the broader family of boosting methods, including the gradient boosting and XGBoost approaches that are widely used in competitions and on tabular-data problems today.

Why business readers should care: boosting is the lineage behind the models that often perform best on the structured, spreadsheet-style data most companies actually have, where they frequently outperform both simpler methods and deep neural networks.
