The Basic AI Drives

“The Basic AI Drives” is a paper by Stephen M. Omohundro, presented at the First Conference on Artificial General Intelligence in March 2008 and published in the conference proceedings (Frontiers in Artificial Intelligence and Applications, volume 171, IOS Press). It is one of the foundational statements of what is now called instrumental convergence.

Omohundro’s argument is that a sufficiently advanced system that acts to achieve goals will tend to develop a set of predictable sub-goals, or “drives,” regardless of what its ultimate objective happens to be - “unless explicitly counteracted.” His reasoning is that a system modeled as a rational economic agent, maximizing some utility function, will find certain instrumental behaviors useful for almost any final goal.

The drives he identified include the tendency to want to self-improve, to model and understand its own behavior, to represent its goals as a clear utility function, to protect that utility function from being changed, to guard the mechanisms that measure its success, to prevent itself from being shut off, and to acquire and use resources efficiently. The shutdown point is intuitive: a system cannot achieve its goal if it is turned off, so preserving its own operation becomes instrumentally valuable for nearly any objective.

The paper supplied much of the conceptual vocabulary for later AI safety work. Nick Bostrom drew on the same ideas in formulating the “instrumental convergence thesis” in his 2014 book Superintelligence, and the drive toward self-preservation directly motivated later technical work on corrigibility and shutdown, such as the off-switch game. The thought experiment of a goal-maximizing system pursuing resources to an extreme - the paperclip maximizer - is a popular illustration of the same convergence argument.