node2vec: Scalable Feature Learning for Networks

node2vec, introduced by Aditya Grover and Jure Leskovec in a paper submitted to arXiv on July 3, 2016, learns continuous vector representations of the nodes in a network. The goal is a mapping into a low-dimensional space that preserves network neighborhoods, so that nodes with similar neighborhoods end up close together and downstream tasks like classification and link prediction become ordinary machine learning problems on those vectors.

The key idea is biased random walks. node2vec runs many random walks across the graph, each steered by two parameters, the return parameter p and the in-out parameter q, that interpolate between breadth-first behavior (staying close to the start node, which the paper associates with structural roles) and depth-first behavior (wandering farther, which the paper associates with community structure). The sequences of nodes produced by these walks are then fed into a word2vec-style skip-gram model, treating walks like sentences and nodes like words.
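The walk-then-skip-gram pipeline described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes an unweighted, undirected graph stored as a dictionary of neighbor lists, it uses the paper's names p (return parameter) and q (in-out parameter), and the function names biased_walk, generate_walks, and skipgram_pairs are hypothetical. It stops at producing (center, context) pairs and leaves out the training of the skip-gram model itself.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """One node2vec-style second-order walk (illustrative sketch).

    adj maps each node to a list of neighbors; the graph is assumed
    to be unweighted and undirected.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = adj[cur]
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # The first step has no previous node, so choose uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        prev_neighbors = set(adj[prev])
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in prev_neighbors:
                weights.append(1.0)      # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)  # move outward, distance 2 from prev
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def generate_walks(adj, num_walks, length, p=1.0, q=1.0, seed=0):
    """Start num_walks walks from every node, in shuffled order."""
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(biased_walk(adj, node, length, p, q, rng))
    return walks


def skipgram_pairs(walks, window=2):
    """Treat each walk as a sentence and emit (center, context) pairs."""
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, walk[j]))
    return pairs


if __name__ == "__main__":
    # Toy graph: a triangle (0, 1, 2) with a tail node 3 attached to 2.
    toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    walks = generate_walks(toy, num_walks=2, length=5, p=0.5, q=2.0)
    print(walks[0], len(skipgram_pairs(walks)))
```

In this sketch a larger q makes outward steps less likely, so the walk lingers near the previous node in a breadth-first-like way, while a smaller q pushes it outward in a depth-first-like way. A small p makes immediate backtracking more likely. The resulting pairs are what a skip-gram trainer would consume to learn one vector per node.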

This flexibility was the paper’s main advance over earlier walk-based methods such as DeepWalk, which used a single fixed exploration strategy (uniform random walks). By tuning the two parameters, node2vec could capture different notions of node similarity for different tasks, and the authors reported improved performance on multi-label classification and link prediction across several real-world networks.

For organizations, node2vec offered a simple, scalable recipe to turn any network, such as a social graph or a co-purchase graph, into features usable by standard models, without needing the labeled data or heavier machinery that later deep graph networks required.
