Regression Shrinkage and Selection via the Lasso

“Regression Shrinkage and Selection via the Lasso” by Robert Tibshirani appeared in the Journal of the Royal Statistical Society, Series B, volume 58, issue 1, in 1996 (pages 267-288). It introduced the lasso, short for “least absolute shrinkage and selection operator.”

The method minimizes the usual least-squares criterion subject to a constraint: the sum of the absolute values of the coefficients must stay below a fixed budget. This constraint, equivalent to an L1 penalty on the coefficients, does something ridge regression’s squared (L2) penalty does not: it drives some coefficients to exactly zero. The lasso therefore performs variable selection and shrinkage at the same time, producing models that are often more accurate on new data than unpenalized least squares and easier to interpret because they use fewer predictors.
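The contrast between the two penalties can be seen in a small sketch. It is an illustration, not the paper's procedure: it uses coordinate descent, a later and widely used algorithm, on the penalized form of the problem, (1/(2n)) * RSS + lam * sum(|b_j|) for the lasso and (1/(2n)) * RSS + (lam/2) * sum(b_j^2) for ridge. The function names, the toy data, and the value lam = 0.5 are all assumptions chosen for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def coordinate_descent(X, y, lam, penalty, n_sweeps=100):
    """Minimize (1/(2n)) * RSS plus a penalty, updating one coefficient at a time.

    penalty="l1" adds lam * sum(|b_j|)        (lasso)
    penalty="l2" adds (lam / 2) * sum(b_j^2)  (ridge)
    Assumes no column of X is all zeros.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Average product of feature j with the residual that leaves feature j out.
            rho = 0.0
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
            rho /= n
            # Mean squared value of feature j.
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            if penalty == "l1":
                beta[j] = soft_threshold(rho, lam) / z  # can land exactly on zero
            else:
                beta[j] = rho / (z + lam)  # shrinks, but stays nonzero if rho is nonzero
    return beta


# Hypothetical toy data: three mutually orthogonal features, one strong
# effect (3.0) and two weak ones (0.2 and -0.1), with no noise.
X = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y = [3.1, -2.7, 2.9, -3.3]

lasso_coefs = coordinate_descent(X, y, lam=0.5, penalty="l1")
ridge_coefs = coordinate_descent(X, y, lam=0.5, penalty="l2")

print("lasso:", [round(c, 3) for c in lasso_coefs])
print("ridge:", [round(c, 3) for c in ridge_coefs])
```

On this toy data the lasso keeps only the strong feature (a coefficient of about 2.5) and sets the two weak ones to exactly zero, while ridge shrinks all three (to roughly 2.0, 0.13, and -0.07) without eliminating any. With correlated features or noisy data the pattern is less tidy, and the penalty strength would normally be chosen by cross-validation rather than fixed by hand.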

The paper became one of the most cited in modern statistics, and the lasso became a workhorse of high-dimensional data analysis, where there are often more candidate variables than observations. It is a standard tool in genomics, finance, and other settings where automatic feature selection matters, and it is built into common libraries including scikit-learn.

Why business readers should care: when a dataset has hundreds or thousands of possible inputs, the lasso automatically narrows them to a small subset that carries most of the predictive signal, giving simpler models that are cheaper to deploy and easier to explain to regulators and stakeholders.

Last verified June 7, 2026