Infrastructure drift is what happens when the actual state of running infrastructure no longer matches the state described in its configuration files. In a world of infrastructure as code, the declared configuration is supposed to be the single source of truth for how servers, networks, and cloud resources are arranged. But reality intrudes: an operator makes a manual change in a cloud console to fix an urgent incident, a second tool modifies the same resource, an API call is made out of band, or a previous change only partially applied. Each of these opens a gap between what the code says should exist and what actually exists.
Drift matters because it quietly undermines the guarantees that declarative tooling is supposed to provide. If the configuration no longer describes reality, then re-running automation can produce surprising results, environments stop being reproducible, and the audit trail in version control no longer reflects the true running state. Worse, drift often hides the very manual interventions that were made under pressure, so the knowledge of why a change was needed is lost.
Tools in this space detect drift by comparing declared state against observed reality. The Terraform CLI documentation describes how the plan command does this: by default it “reads the current state of any already-existing remote objects to make sure that the Terraform state is up-to-date,” then “compares the current configuration to the prior state and notes any differences.” That refresh-and-compare step is precisely drift detection: it surfaces modifications “performed through manual adjustments, other tools, or direct API calls.” Disabling the refresh “causes Terraform to ignore external changes, which could result in an incomplete or incorrect plan.”
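The refresh-and-compare idea can be sketched in a few lines of Python. This is a toy model, not Terraform's implementation: the dictionaries, the resource names, and the `detect_drift` helper are all hypothetical, and a real tool would obtain the observed side by querying provider APIs.

```python
from typing import Any, Dict, Optional, Tuple

Resource = Dict[str, Any]
State = Dict[str, Resource]


def detect_drift(
    declared: State, observed: State
) -> Dict[str, Tuple[Optional[Resource], Optional[Resource]]]:
    """Return every resource whose declared and observed attributes differ.

    Each entry maps a resource name to (declared, observed); None on either
    side means the resource is missing from that side entirely.
    """
    drift = {}
    for name in sorted(set(declared) | set(observed)):
        want = declared.get(name)
        have = observed.get(name)
        if want != have:
            drift[name] = (want, have)
    return drift


# Hypothetical declared configuration.
declared = {
    "web": {"instance_type": "t3.small", "port": 443},
    "db": {"engine": "postgres", "storage_gb": 100},
}

# Hypothetical observed reality after some out-of-band changes.
observed = {
    "web": {"instance_type": "t3.large", "port": 443},  # resized by hand
    "db": {"engine": "postgres", "storage_gb": 100},
    "debug-box": {"instance_type": "t3.micro"},  # created outside the code
}

report = detect_drift(declared, observed)
for name, (want, have) in report.items():
    print(f"{name}: declared={want} observed={have}")
```

The sketch reports both a modified resource and one that exists only in reality, which are two of the ways drift shows up in practice. Skipping the observation step would amount to comparing the declared state against a stale copy, so neither change would be noticed.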
Reconciliation is the other half of the story: once drift is detected, the system must decide whether to overwrite reality to match the declared state, or to update the declared state to match reality. Terraform offers a refresh-only planning mode for the latter case, producing “a plan whose goal is only to update the Terraform state and any root module output values to match changes made to remote objects outside of Terraform.” The documentation notes this is “useful if you’ve intentionally changed one or more remote objects outside of the usual workflow (e.g. while responding to an incident).”
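The two directions of reconciliation can be made concrete with a small Python sketch. It is illustrative only: the `Direction` enum and `reconcile` function are hypothetical names, and the model collapses Terraform's distinction between configuration and recorded state into a single "declared" dictionary.

```python
import copy
from enum import Enum
from typing import Any, Dict, Tuple

State = Dict[str, Dict[str, Any]]


class Direction(Enum):
    ENFORCE_DECLARED = "enforce_declared"  # change reality to match the declaration
    ACCEPT_OBSERVED = "accept_observed"    # change the record to match reality


def reconcile(
    declared: State, observed: State, direction: Direction
) -> Tuple[State, State]:
    """Return the (declared, observed) pair after reconciling in one direction."""
    if direction is Direction.ENFORCE_DECLARED:
        # Reality is overwritten; the declaration is left as it was.
        return copy.deepcopy(declared), copy.deepcopy(declared)
    # The record is updated; reality is left as it was.
    return copy.deepcopy(observed), copy.deepcopy(observed)


declared = {"web": {"instance_type": "t3.small"}}
observed = {"web": {"instance_type": "t3.large"}}  # scaled up during an incident

enforced = reconcile(declared, observed, Direction.ENFORCE_DECLARED)
accepted = reconcile(declared, observed, Direction.ACCEPT_OBSERVED)

print(enforced[1]["web"]["instance_type"])  # reality after enforcing the declaration
print(accepted[0]["web"]["instance_type"])  # record after accepting the manual change
```

Either choice closes the gap; what differs is which side is treated as correct. Accepting the observed state suits a deliberate incident fix, while enforcing the declaration suits an accidental or unauthorized change.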
Drift is the practical reason that declarative, convergent tooling exists rather than one-shot scripts. A convergent system can be run repeatedly and will, on each run, detect the difference between desired and actual state and correct it. This is the same idempotent reconciliation loop that configuration-management tools and Kubernetes controllers use. The discipline of regularly detecting and resolving drift, rather than letting it accumulate, is central to keeping infrastructure-as-code honest and immutable-infrastructure practices viable.
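A minimal sketch of such a convergent, idempotent pass is shown below. The `converge` function and the resource names are hypothetical, and in-memory dictionaries stand in for real infrastructure; actual controllers and configuration-management tools handle ordering, failures, and partial application, which this sketch ignores.

```python
from typing import Any, Dict, List

State = Dict[str, Dict[str, Any]]


def converge(desired: State, actual: State) -> List[str]:
    """Mutate `actual` in place until it matches `desired`; return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actual[name] = dict(spec)
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actual[name] = dict(spec)
            actions.append(f"update {name}")
    # Anything present in reality but absent from the desired state is removed.
    for name in [n for n in actual if n not in desired]:
        del actual[name]
        actions.append(f"delete {name}")
    return actions


desired = {"web": {"replicas": 3}, "cache": {"replicas": 1}}
actual = {"web": {"replicas": 5}, "orphan": {"replicas": 1}}  # drifted state

first_run = converge(desired, actual)
second_run = converge(desired, actual)

print(first_run)   # actions needed to correct the drift
print(second_run)  # empty: nothing left to correct
```

The second run taking no action is what makes the pass idempotent: it is safe to run on a schedule, and any non-empty result on a later run is itself a signal that new drift has appeared.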