Roberts' Machine Perception of Three-Dimensional Solids

In 1963 Lawrence “Larry” Roberts completed a PhD thesis at MIT titled “Machine Perception of Three-Dimensional Solids,” widely regarded as the first dissertation in computer vision. Rather than tackle the full complexity of natural scenes, Roberts restricted the problem to simple polyhedral objects that could be reduced to line drawings (the “blocks world” approach that would shape early AI) and asked how a computer could recover their three-dimensional structure from a two-dimensional image.

His program took a photograph of block-like solids, reduced it to a line drawing by finding edges, and then fit known geometric models to those lines to reconstruct the objects’ shapes and positions in space. The thesis also introduced an algorithm for removing hidden lines and surfaces from a perspective projection, a foundational idea that still underlies computer graphics, CAD software, and 3D rendering. By constraining the problem to polyhedra with clean, straight edges, Roberts could solve fundamental geometric problems that would have been intractable on messy natural images.
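To make the hidden-line idea concrete, here is a minimal sketch in Python. It is not the method from the thesis: it uses back-face culling on a single convex solid, a much simpler technique that only illustrates why some edges of a polyhedron disappear from a given viewpoint. The cube data, the eye position, and all function names are assumptions introduced for illustration.

```python
# Illustrative sketch only: back-face culling on one convex solid.
# This is NOT Roberts' algorithm; names and data are hypothetical.

# Cube centered at the origin (example data).
VERTICES = [
    (-1, -1, -1), (1, -1, -1), (1, 1, -1), (-1, 1, -1),
    (-1, -1, 1), (1, -1, 1), (1, 1, 1), (-1, 1, 1),
]

# Each face lists vertex indices counter-clockwise as seen from outside,
# so the cross product of its first two edges points outward.
FACES = [
    (0, 3, 2, 1),  # z = -1
    (4, 5, 6, 7),  # z = +1
    (0, 1, 5, 4),  # y = -1
    (2, 3, 7, 6),  # y = +1
    (0, 4, 7, 3),  # x = -1
    (1, 2, 6, 5),  # x = +1
]


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def visible_faces(vertices, faces, eye):
    """Return indices of faces whose outward normal points toward the eye."""
    visible = []
    for index, face in enumerate(faces):
        v0, v1, v2 = (vertices[i] for i in face[:3])
        normal = cross(sub(v1, v0), sub(v2, v0))
        if dot(normal, sub(eye, v0)) > 0:
            visible.append(index)
    return visible


def visible_edges(faces, visible):
    """Collect edges that border at least one visible face."""
    edges = set()
    for index in visible:
        face = faces[index]
        for a, b in zip(face, face[1:] + face[:1]):
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)


def project(point, eye, focal_length=1.0):
    """Perspective-project a point for a camera at `eye` looking along -z."""
    cx, cy, cz = sub(point, eye)
    depth = -cz  # distance in front of the camera; assumed positive
    return (focal_length * cx / depth, focal_length * cy / depth)


if __name__ == "__main__":
    eye = (3, 2, 6)  # hypothetical viewpoint
    shown = visible_faces(VERTICES, FACES, eye)
    for a, b in visible_edges(FACES, shown):
        print(project(VERTICES[a], eye), "->", project(VERTICES[b], eye))
```

From this viewpoint three faces of the cube face the camera, so the three edges that meet at the far corner are dropped and the remaining nine are drawn. Back-face culling suffices only for one convex object; scenes with several solids, or concave ones, need a true hidden-line test in which edges are checked against the volumes that may occlude them.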

The work helped establish machine perception as a research discipline and set an agenda that ran through the field for decades: build internal 3D models from 2D input. Roberts later left vision research and became a principal architect of the ARPANET, the network that grew into the internet. His thesis is preserved and openly available through MIT’s DSpace repository.
