Collaborative Filtering for Implicit Feedback Datasets

Most early recommender research assumed explicit feedback: users handing the system star ratings. In practice that data is rare. What companies actually have is implicit feedback, the trail of behavior left by watching a show, clicking a link, or buying a product. In 2008 Yifan Hu, Yehuda Koren, and Chris Volinsky published a paper at the IEEE International Conference on Data Mining (ICDM 2008, pages 263-272) showing how to turn that messy behavioral data into a working recommender.

The key insight is that implicit signals are fundamentally different from ratings. A purchase says someone probably liked an item, but never buying something does not mean they dislike it; they may simply not know it exists. The authors model this by separating two ideas: a binary preference (whether the user interacted with the item at all) and a confidence level (how strongly we believe that preference, which grows with the number of interactions). They then fit a matrix factorization model over the full user-item matrix, including all the unobserved entries, using an efficient alternating least squares procedure whose cost per iteration grows linearly with the number of observed interactions.
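The preference and confidence split described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names implicit_als and objective are hypothetical, the interaction matrix is held as a small dense array for readability (a practical system would use sparse storage), confidence takes the linear form c = 1 + alpha * r, and the values of alpha, reg, and factors are illustrative defaults rather than tuned settings.

```python
import numpy as np


def objective(R, X, Y, alpha=40.0, reg=0.1):
    """Confidence-weighted squared error over ALL entries, plus L2 penalty."""
    P = (R > 0).astype(float)   # binary preference: 1 if any interaction
    C = 1.0 + alpha * R         # confidence grows with interaction count
    err = P - X @ Y.T
    return float((C * err**2).sum() + reg * ((X**2).sum() + (Y**2).sum()))


def _solve_side(fixed, R_rows, alpha, reg):
    """Solve for one side's factors while the other side is held fixed."""
    n_rows, factors = R_rows.shape[0], fixed.shape[1]
    gram = fixed.T @ fixed      # Y^T Y, computed once and shared by all rows
    eye = np.eye(factors)
    out = np.empty((n_rows, factors))
    for u in range(n_rows):
        nz = np.flatnonzero(R_rows[u])      # items this row interacted with
        Y_nz = fixed[nz]
        extra = alpha * R_rows[u, nz]       # (c - 1) for the observed entries
        # Y^T C_u Y = Y^T Y + Y_nz^T diag(c - 1) Y_nz, so the per-row work
        # depends only on the number of observed interactions.
        A = gram + (Y_nz.T * extra) @ Y_nz + reg * eye
        b = Y_nz.T @ (1.0 + extra)          # Y^T C_u p_u, with p = 1 on nz
        out[u] = np.linalg.solve(A, b)
    return out


def implicit_als(R, factors=2, alpha=40.0, reg=0.1, iterations=10, seed=0):
    """R: dense (users x items) array of non-negative interaction counts."""
    R = np.asarray(R, dtype=float)
    rng = np.random.default_rng(seed)
    X = rng.normal(scale=0.1, size=(R.shape[0], factors))  # user factors
    Y = rng.normal(scale=0.1, size=(R.shape[1], factors))  # item factors
    for _ in range(iterations):
        X = _solve_side(Y, R, alpha, reg)     # update users, items fixed
        Y = _solve_side(X, R.T, alpha, reg)   # update items, users fixed
    return X, Y


# Toy data (made up for illustration): rows are users, columns are items,
# values are interaction counts such as plays or purchases.
R_demo = np.array([
    [5, 3, 0, 0, 0],
    [4, 0, 0, 0, 1],
    [0, 0, 2, 6, 0],
    [0, 0, 1, 0, 0],
])
X_demo, Y_demo = implicit_als(R_demo)
scores = X_demo @ Y_demo.T   # higher score = stronger predicted preference
print(np.round(scores, 2))
```

To recommend for a user, one would rank the items that user has not yet interacted with by their row of scores. Reusing the shared Gram matrix and correcting it only for observed entries is what keeps each sweep proportional to the number of interactions rather than to the size of the full matrix.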

The method became one of the most influential recommendation algorithms in industry and won the 2017 IEEE ICDM 10-Year Highest-Impact Paper Award. Its confidence-weighted formulation is implemented in widely used libraries and powers recommendations in settings ranging from streaming catalogs to e-commerce.

For a general reader, this paper is why services can recommend well without ever asking you to rate anything: the system learns from what you do, not from what you say.

