Amazon's item-to-item collaborative filtering

In January 2003, Greg Linden, Brent Smith, and Jeremy York published “Amazon.com Recommendations: Item-to-Item Collaborative Filtering” in IEEE Internet Computing (volume 7, number 1, pages 76 to 80). The paper described the algorithm behind Amazon’s “Customers who bought this item also bought” feature, which by then had been running for about six years.

The key move was to flip the usual approach. Earlier collaborative filtering matched a shopper to other shoppers with similar histories, which became slow and unreliable as the numbers of customers and products grew. Amazon instead built a table of relationships between items: product B is related to product A if customers who bought A are unusually likely to also buy B. Because that item-to-item table could be computed offline, generating recommendations for an individual customer at page-load time became fast, and the method worked even for shoppers with very little history. According to Amazon, the team later corrected a subtle error in the math, properly weighting the increased likelihood of a purchase rather than comparing raw probabilities, which produced a large jump in recommendation quality.
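To make the offline/online split concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Amazon's production code: the function names, the input shape, and the toy data are all hypothetical, and pairs are scored with cosine similarity over purchase counts, which is one common choice rather than the corrected formula mentioned above.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_item_table(purchases):
    """Offline step: map each item to a list of (related_item, score), best first.

    purchases: dict of customer id -> set of item ids (hypothetical input shape).
    """
    buyers = defaultdict(int)    # item -> number of customers who bought it
    together = defaultdict(int)  # (item_a, item_b) -> customers who bought both

    for items in purchases.values():
        for item in items:
            buyers[item] += 1
        for a, b in combinations(sorted(items), 2):
            together[(a, b)] += 1

    table = defaultdict(list)
    for (a, b), n_both in together.items():
        # Cosine similarity between the two items' binary purchase vectors
        # (an assumed scoring choice for this sketch).
        score = n_both / sqrt(buyers[a] * buyers[b])
        table[a].append((b, score))
        table[b].append((a, score))

    for related in table.values():
        related.sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(table)


def recommend(table, history, k=3):
    """Online step: sum the scores of items related to what the customer owns."""
    totals = defaultdict(float)
    for owned in history:
        for item, score in table.get(owned, []):
            if item not in history:
                totals[item] += score
    ranked = sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


# Toy purchase histories, invented for illustration.
purchases = {
    "ann": {"book", "lamp"},
    "bob": {"book", "lamp", "mug"},
    "cat": {"book", "mug"},
    "dan": {"lamp", "pen"},
}
table = build_item_table(purchases)
print(recommend(table, {"book"}))  # ['mug', 'lamp']
```

The expensive work, counting co-purchases across all customers, happens once in build_item_table. The per-customer step in recommend only reads the rows for items that customer already owns, so its cost depends on the length of that customer's history and not on the total number of customers.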

The paper became one of the most cited in the field. In 2017, to mark the journal's twentieth anniversary, IEEE Internet Computing named it the single article from the journal's history that had best withstood the test of time.

Why business readers should care: this is the engineering pattern that made personalized recommendation work at internet scale, and it remains a textbook example of how a smart reframing of a problem can beat a brute-force approach on both cost and quality.