Drift detection is CloudFormation's mechanism for catching out-of-band changes (modifications made to stack resources via the console, AWS CLI, or service SDKs without going through CFN). It's a comparison operation, not a continuous monitor: you run it, it compares, it reports.

## What "drift" means

A **resource** has drifted when one or more of its properties:

- Has a value that differs from the template definition
- Has been deleted

A **stack** has drifted when one or more of its resources have drifted.

CFN compares **expected** values (template properties + parameter values) with **actual** values (live config from underlying service APIs). Stack-level tags are also tracked for drift.

## Status codes

**Drift detection operation** (the act of running detection):

- `DETECTION_IN_PROGRESS` / `DETECTION_COMPLETE` / `DETECTION_FAILED`

**Drift status** (stack / stack instance / stack set):

- `DRIFTED`: at least one resource differs
- `IN_SYNC`: all supported resources match
- `NOT_CHECKED`: never run (or no drift-supporting resources)

**Resource drift status**:

- `MODIFIED`: properties differ
- `DELETED`: resource gone (template still expects it)
- `IN_SYNC`: matches template
- `NOT_CHECKED`: resource type doesn't support drift detection

**Property difference types** (per drifted property):

- `ADD`: value added to an array/list property
- `REMOVE`: property removed from current config
- `NOT_EQUAL`: value differs

## What drift detection MISSES

This is where the gotchas hide:

1. **Default values**: CFN only checks **explicitly set** properties. If you rely on a resource property's default and someone changes it via the console, drift detection won't catch it. **Workaround**: explicitly set every property you care about, even to its default value.
2. **Nested stacks**: drift detection on a parent stack does **not** recurse into nested stacks. You must run detection on each nested stack separately.
3. **Specific properties never checked**:
   - `KMSKeyId` on any resource (KMS aliases create comparison ambiguity)
   - `AWS::Lambda::Function` `Code` (source code can't be compared back)
   - `AWS::IAM::User` `LoginProfile.Password` and other write-only properties (services don't return secret values)
4. **Cross-stack attachments**: when a resource in stack A attaches to a resource in stack B (e.g., `AWS::IAM::Policy` attaching to a role from another stack), CFN can't analyze the attachment relationship across stacks. May produce false drift signals.

## False positives

Two common sources of phantom drift:

1. **Equivalent-but-not-identical values**: template says `1024MB`, actual returned as `1GB`. Both equal, drift detection flags as `NOT_EQUAL`. **Fix**: normalize the template value to whatever string the service returns.
2. **Service defaults injected into array properties**: the underlying service may auto-populate array entries (default rules, default mappings) that CFN reads as drift even though no human changed anything.

## Required permissions

Beyond the CFN actions, the caller needs **read permission for every resource type** in the stack that supports drift detection:

```
cloudformation:DetectStackDrift
cloudformation:DetectStackResourceDrift
cloudformation:BatchDescribeTypeConfigurations
+ ec2:DescribeInstances      (for any AWS::EC2::Instance)
+ rds:DescribeDBInstances    (for any AWS::RDS::DBInstance)
+ s3:GetBucketTagging        (for tag drift on AWS::S3::Bucket)
... etc per resource type
```

Missing one read permission causes that resource to land in `NOT_CHECKED`.

## Allowed stack states

You can only run drift detection on stacks in:

- `CREATE_COMPLETE`
- `UPDATE_COMPLETE`
- `UPDATE_ROLLBACK_COMPLETE`
- `UPDATE_ROLLBACK_FAILED`

Other states (in-progress, failed without rollback) are blocked.

## Resolving drift

Three paths:

1. **Update the resource to match the template**: usually via the underlying service (revert the out-of-band change)
2. **Update the template to match the resource**: accept the live state as canonical, run a no-op stack update
3. **Import the resource**: if the resource was deleted and recreated outside CFN, use the import operation to bring the new resource ID under stack management

For routine reconciliation, see [[CFN Drift-Aware Change Sets Three-Way Comparison]]; those are the modern automated path.

## Operational pattern

Run drift detection on a schedule (CloudWatch Events / EventBridge → Lambda → DetectStackDrift) for production stacks. Alert on `DRIFTED`. The detection is async and may take minutes for large stacks; poll status until `DETECTION_COMPLETE`.

False-positive rate is the main reason teams disable drift alerts. Mitigate by explicitly setting every property in templates and normalizing units.

## Related

- [[CFN Drift-Aware Change Sets Three-Way Comparison]]
- [[CFN Failure Rollback Behavior]]
- [[CFN Update Behaviors and the Replacement Trap]]
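
## Sketch: scheduled drift check

The start/poll/alert flow from the Operational pattern section could look roughly like the following. This is a minimal sketch, not a reference implementation. It assumes `cfn` is an object that behaves like `boto3.client("cloudformation")`; the function name `check_stack_drift`, its parameters, and the shape of the returned summary are all hypothetical. The client is passed in rather than created inside the function so the polling logic can be exercised without AWS access.

```python
import time
from typing import Any, Callable, Dict, List


def check_stack_drift(
    cfn: Any,  # assumed to behave like boto3.client("cloudformation")
    stack_name: str,
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start drift detection, poll until it finishes, summarize drifted resources."""
    detection_id = cfn.detect_stack_drift(StackName=stack_name)["StackDriftDetectionId"]

    # Detection is asynchronous: poll until it leaves DETECTION_IN_PROGRESS.
    for _ in range(max_polls):
        status = cfn.describe_stack_drift_detection_status(
            StackDriftDetectionId=detection_id
        )
        if status["DetectionStatus"] != "DETECTION_IN_PROGRESS":
            break
        sleep(poll_seconds)
    else:
        raise TimeoutError(f"drift detection for {stack_name} did not finish")

    # This sketch treats any failure as fatal; a real job might inspect partial results.
    if status["DetectionStatus"] == "DETECTION_FAILED":
        raise RuntimeError(status.get("DetectionStatusReason", "drift detection failed"))

    drifted: List[Dict[str, Any]] = []
    if status.get("StackDriftStatus") == "DRIFTED":
        kwargs: Dict[str, Any] = {
            "StackName": stack_name,
            "StackResourceDriftStatusFilters": ["MODIFIED", "DELETED"],
        }
        while True:  # follow NextToken across result pages
            page = cfn.describe_stack_resource_drifts(**kwargs)
            for r in page["StackResourceDrifts"]:
                drifted.append({
                    "logical_id": r["LogicalResourceId"],
                    "type": r["ResourceType"],
                    "status": r["StackResourceDriftStatus"],
                    # DELETED resources may carry no property differences.
                    "changes": [
                        (d["PropertyPath"], d["DifferenceType"])
                        for d in r.get("PropertyDifferences", [])
                    ],
                })
            token = page.get("NextToken")
            if not token:
                break
            kwargs["NextToken"] = token

    return {"stack": stack_name, "status": status.get("StackDriftStatus"), "drifted": drifted}
```

In a scheduled Lambda, the handler would pass in a real CloudFormation client and raise an alert whenever the returned `status` is `DRIFTED`. Including the per-property `changes` list in the alert makes it easier to spot the false-positive patterns described above, such as unit mismatches.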