Collaborative Filtering

Collaborative filtering is the family of techniques that recommends items to a person based not on the content of the items but on the recorded behavior of many other people. The central assumption, stated plainly in the 1994 GroupLens paper, is that people who agreed in the past will tend to agree in the future. If many users whose ratings matched yours on the movies you have both seen also loved a film you have not seen, the system predicts you will like it too.
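The prediction step described above can be sketched in a few lines. This is a minimal illustration, not the GroupLens implementation: the ratings are made-up toy data, the names `ratings`, `cosine`, and `predict` are hypothetical, and cosine similarity is assumed here as the measure of agreement simply because it is short to write.

```python
from math import sqrt

# Toy ratings (hypothetical data): user -> {movie: rating on a 1-5 scale}
ratings = {
    "you": {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "ana": {"Alien": 5, "Brazil": 5, "Casablanca": 1, "Dune": 5},
    "ben": {"Alien": 1, "Brazil": 2, "Casablanca": 5, "Dune": 1},
}

def cosine(a, b):
    """Cosine similarity over the items both users rated (assumed measure)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = cosine(ratings[user], theirs)
        num += weight * theirs[item]
        den += weight
    # None means nobody comparable has rated the item
    return num / den if den else None

print(predict("you", "Dune"))
```

Because "ana" agreed with "you" on every shared movie, her high rating for the unseen film counts for more than "ben's" low one, and the prediction lands above the midpoint of the scale.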

There are two main flavors. User-based collaborative filtering finds people similar to you and recommends what they liked; this is the approach GroupLens used. Item-based (or item-to-item) collaborative filtering, popularized by Amazon in the early 2000s, instead computes which items tend to be bought or rated together, then recommends items related to the ones you already chose. Amazon’s engineers found that the item-based version scaled far better, because the table of item-to-item relationships could be computed offline, leaving only a fast lookup to do for each user at recommendation time.
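The offline/online split that makes the item-based flavor scale can be sketched as follows. This is a simplified illustration under stated assumptions, not Amazon's algorithm: the baskets are invented, the function names `build_item_table` and `recommend` are hypothetical, and raw co-occurrence counts stand in for whatever similarity measure a production system would use.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical purchase baskets: user -> set of items bought
baskets = {
    "u1": {"tent", "stove", "lantern"},
    "u2": {"tent", "stove"},
    "u3": {"tent", "lantern"},
    "u4": {"novel", "bookmark"},
}

def build_item_table(baskets):
    """Offline step: count how often each pair of items shares a buyer."""
    table = defaultdict(lambda: defaultdict(int))
    for items in baskets.values():
        for a, b in combinations(sorted(items), 2):
            table[a][b] += 1
            table[b][a] += 1
    return table

def recommend(table, owned, k=3):
    """Online step: look up items related to what the user already has."""
    scores = defaultdict(int)
    for item in owned:
        for other, count in table.get(item, {}).items():
            if other not in owned:
                scores[other] += count
    # Highest score first; ties broken alphabetically for a stable order
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

item_table = build_item_table(baskets)      # expensive, done ahead of time
print(recommend(item_table, {"tent"}))      # cheap, done per request
```

The expensive pass over all baskets happens once in `build_item_table`; each call to `recommend` touches only the rows for items the user already owns, which is why the per-user cost stays small as the customer base grows.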

Collaborative filtering needs only an interaction matrix, a record of who interacted with what, and no description of the items themselves. That is both its strength and its weakness. It works across any domain where enough people leave behavior traces, but it struggles with the cold-start problem: a brand-new item with no ratings has no relationships to other items and so is never recommended, and a brand-new user with no history has nothing to match against and so gets no personalized recommendations. Later systems combined collaborative filtering with content features and, eventually, deep learning to address this.
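A small interaction matrix makes the cold-start problem concrete. The sketch below uses invented data and hypothetical helper names (`column`, `cosine`, `similar_items`), and assumes cosine similarity between item columns; the point is only that an item with no interactions has an all-zero column and therefore no measurable similarity to anything.

```python
from math import sqrt

# Rows are users, columns are items; 1 means the user interacted with the item.
# The last column is a hypothetical brand-new item nobody has touched yet.
items = ["A", "B", "C", "new"]
matrix = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]

def column(j):
    """Interaction vector for item j across all users."""
    return [row[j] for row in matrix]

def cosine(x, y):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def similar_items(name):
    """Similarity of one item to every other item in the matrix."""
    j = items.index(name)
    return {other: cosine(column(j), column(k))
            for k, other in enumerate(items) if k != j}

print(similar_items("A"))    # established item: some nonzero similarities
print(similar_items("new"))  # cold-start item: every similarity is zero
```

No item descriptions appear anywhere in the code, which is the strength noted above; the all-zero scores for the new item are the corresponding weakness.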

Why business readers should care: collaborative filtering is the quiet workhorse behind most “you might also like” features in retail, streaming, and social platforms. It turns the accumulated clicks and purchases of a customer base into an asset that improves with scale, which is part of why large platforms are hard to dislodge.
