**Configuration drift** is unintended variation that accumulates over time across systems that are supposed to be identical (e.g., staging vs production, two app server instances).

## Causes

- Manual changes applied to some instances but not others
- Ad-hoc automation runs targeting specific instances
- Separate code branches or copies of infrastructure code for different instances
- Team members making independent decisions on similar tasks (e.g., one uses ext3 80GB, another uses XFS 100GB)

## Why It Matters

Configuration drift erodes automation confidence. When systems diverge silently, you can no longer trust that code which works on one instance will work on another. Testing becomes unreliable. Incidents in production can't be reproduced in staging.

## Detection and Prevention

- Continuous, unattended automation runs (Ansible, Chef, Puppet on schedule) keep systems synchronized with code; any manual deviation gets overwritten
- Treat code as the system of record; never make changes outside of code
- When drift is detected between environments, treat it as a defect, not a footnote

See also: [[Automation Fear Spiral]], the pattern that causes teams to avoid continuous automation runs, making drift worse.
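
As a rough illustration of treating drift as a defect, the sketch below compares configuration snapshots from two instances that are supposed to be identical and reports every setting that differs. It is a minimal example under stated assumptions, not how Ansible, Chef, or Puppet detect drift: the `find_drift` function, the `MISSING` marker, and the snapshot keys (`filesystem`, `disk_gb`, `ntp_enabled`) are all hypothetical names chosen for the example, and the snapshots are assumed to have already been collected as flat dictionaries.

```python
from typing import Any

# Hypothetical marker for a setting that exists on one instance but not the other.
MISSING = "<missing>"


def find_drift(
    reference: dict[str, Any], actual: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {setting: (reference_value, actual_value)} for every setting that differs."""
    drift: dict[str, tuple[Any, Any]] = {}
    # Walk the union of keys so settings present on only one side are also reported.
    for key in sorted(reference.keys() | actual.keys()):
        ref_value = reference.get(key, MISSING)
        act_value = actual.get(key, MISSING)
        if ref_value != act_value:
            drift[key] = (ref_value, act_value)
    return drift


# Illustrative snapshots: two servers provisioned by different team members.
server_a = {"filesystem": "ext3", "disk_gb": 80, "ntp_enabled": True}
server_b = {"filesystem": "xfs", "disk_gb": 100, "ntp_enabled": True}

for setting, (expected, found) in find_drift(server_a, server_b).items():
    print(f"DRIFT {setting}: expected {expected!r}, found {found!r}")
```

A check like this could run on a schedule and fail loudly whenever the result is non-empty, so that divergence is surfaced as a defect instead of being discovered during an incident.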