An O(ND) Difference Algorithm and Its Variations (1986)

In 1986 Eugene W. Myers published “An O(ND) Difference Algorithm and Its Variations” in the journal Algorithmica. The paper gives a precise, efficient way to answer the question that tools like diff ask: what is the smallest set of insertions and deletions that turns one file into another?

Myers’s key move was to frame the problem as a search through an “edit graph,” where finding the shortest sequence of edits is the same as finding a shortest path. The resulting algorithm runs in time proportional to N times D, where N is the combined length of the two inputs and D is the number of differences between them. Because D is usually small when two versions of a file are similar, the method is fast in exactly the everyday cases that matter, comparing two close revisions of the same source code.

The paper also describes refinements, including a variation that uses far less memory by working from both ends of the inputs toward the middle. These details made the algorithm practical to drop into real tools rather than just describe on paper.

Myers’s algorithm became the default engine for computing differences in much of modern software, including widely used implementations behind diff and the version-control system Git. When a programmer sees a clean, minimal list of added and removed lines in a code review or a commit, this 1986 result is usually the reason it looks that way.