What's Actually Happening

You run terraform plan expecting "No changes" but instead see resources being modified, recreated, or destroyed. This phantom drift occurs when Terraform detects differences between your state and real infrastructure, even though you haven't changed your configuration.

The Error You'll See

``` Terraform will perform the following actions:

# aws_instance.example will be updated in-place ~ resource "aws_instance" "example" { ~ ami = "ami-0c55b159cbfafe1f0" -> "ami-1a2b3c4d5e6f" ~ instance_state = "running" -> "stopped" id = "i-0123456789abcdef0" # (10 unchanged attributes hidden) }

# aws_lb.target will be replaced # (forces replacement) ~ resource "aws_lb" "target" { ~ access_logs { ~ bucket = "my-logs-bucket" -> "my-logs-bucket-new" # forces replacement } }

Plan: 1 to add, 1 to change, 1 to destroy. ```

Or sometimes just puzzling changes:

bash
# aws_s3_bucket.data has changed
  ~ resource "aws_s3_bucket" "data" {
        id                          = "my-unique-bucket-name"
      ~ versioning {
          ~ enabled = false -> true
        }
    }

Why This Happens

Understanding the root causes helps you fix the right problem:

  1. 1.Provider version updates - New provider versions may handle attributes differently
  2. 2.API changes - Cloud provider updated their API responses
  3. 3.Default value changes - Provider defaults changed between versions
  4. 4.State refresh issues - State wasn't properly refreshed
  5. 5.External modifications - Someone changed resources outside Terraform
  6. 6.Computed attributes - Attributes calculated by provider that differ from state
  7. 7.Terraform version upgrade - Core version changed attribute handling
  8. 8.Module updates - Upstream module changed default values

Step 1: Investigate the Drift

First, understand what's actually different:

```bash # Get detailed plan with full attributes terraform plan -out=tfplan terraform show -json tfplan | jq '.resource_changes[]'

# Or view the specific resource differences terraform plan -target=aws_instance.example

# Show what Terraform thinks the current state is terraform state show aws_instance.example

# Compare with actual resource aws ec2 describe-instances --instance-ids i-0123456789abcdef0 --output json ```

For the specific attribute causing the change:

bash
# Get the exact value difference
terraform plan -json | jq '.resource_changes[] | select(.address == "aws_instance.example") | .change.after'

Step 2: Check for External Modifications

See if resources were modified outside Terraform:

```bash # AWS CloudTrail for recent changes aws cloudtrail lookup-events \ --lookup-attributes AttributeKey=ResourceName,AttributeValue=i-0123456789abcdef0 \ --max-items 20

# Azure Activity Log az monitor activity-log list \ --resource-group my-rg \ --caller myuser@example.com \ --output table

# GCP Audit Logs gcloud logging read "resource.type=gce_instance AND protoPayload.resourceName=projects/my-project/zones/us-central1-a/instances/example" ```

Ask your team: "Did anyone manually modify the production load balancer yesterday?"

Step 3: Refresh State Against Reality

Sometimes the state just needs to sync with actual infrastructure:

```bash # Refresh state only (no plan) terraform refresh

# Then check if drift is resolved terraform plan

# If refresh doesn't help, force a state read terraform state pull > current-state.tfstate terraform state push current-state.tfstate ```

Step 4: Handle Provider Default Changes

When providers change their default handling:

```bash # Check your current provider version terraform version

# See provider version in state terraform providers ```

If you recently upgraded providers:

```hcl terraform { required_providers { aws = { source = "hashicorp/aws" version = "5.0.0" # Pin to avoid unexpected upgrades } } }

provider "aws" { region = "us-east-1"

# Explicitly set defaults that might change default_tags { tags = { Environment = "production" } } } ```

Run the upgrade properly:

bash
terraform init -upgrade
terraform plan

Step 5: Fix Specific Drift Patterns

For read-only computed attributes:

```hcl # Some attributes are read-only and shouldn't cause drift # Add lifecycle ignore_changes for computed values resource "aws_instance" "example" { ami = var.ami_id instance_type = "t3.micro"

# Ignore changes to attributes managed by AWS lifecycle { ignore_changes = [ ami, # AMI might be updated by AWS user_data, # Base64 encoding might differ root_block_device # Volume changes outside Terraform ] } } ```

For timestamp and hash values:

```hcl resource "aws_s3_object" "upload" { bucket = "my-bucket" key = "config.json" source = "local-config.json"

# ETag changes with file content, causing drift # Use source_hash instead for stable tracking etag = filemd5("local-config.json") } ```

For nested blocks that fluctuate:

```hcl resource "aws_security_group" "main" { name = "main-sg"

ingress { from_port = 443 to_port = 443 protocol = "tcp" cidr_blocks = ["0.0.0.0/0"] }

# Ignore added/removed rules by external sources lifecycle { create_before_destroy = true } } ```

Step 6: Force State Consistency

When you need to manually align state with reality:

```bash # Mark a resource as needing recreation terraform state rm aws_instance.broken terraform import aws_instance.broken i-0123456789abcdef0

# For modules terraform state rm module.vpc terraform import module.vpc.aws_vpc.main vpc-12345678

# Taint a resource to force recreation terraform taint aws_instance.problematic terraform plan # Will show recreation ```

Step 7: Use Drift Detection Proactively

Implement regular drift detection in CI/CD:

```yaml # .github/workflows/drift-detection.yml name: Drift Detection on: schedule: - cron: '0 6 * * *' # Daily at 6 AM

jobs: drift: runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 - uses: hashicorp/setup-terraform@v2

  • name: Terraform Init
  • run: terraform init
  • name: Check for Drift
  • run: |
  • terraform plan -detailed-exitcode
  • continue-on-error: true
  • name: Alert on Drift
  • if: steps.plan.outcome == 'failure'
  • run: |
  • echo "::warning::Infrastructure drift detected!"
  • # Send Slack notification
  • `

Verify the Fix

Confirm your plan is clean:

```bash terraform plan

# Should output: # No changes. Your infrastructure matches the configuration. ```

Run multiple times to ensure consistency:

bash
terraform plan
terraform plan  # Second run should be identical

Common Specific Fixes

S3 bucket versioning drift: ``bash # Manually import correct state terraform state rm aws_s3_bucket_versioning.main terraform import aws_s3_bucket_versioning.main my-bucket-name

IAM policy drift: ```bash # Get actual policy aws iam get-policy-version --policy-arn arn:aws:iam::123456789012:policy/MyPolicy --version-id v1

# Update Terraform to match or run apply ```

Load balancer listener rules: ```hcl # Use aws_lb_listener_rule instead of inline rules for better tracking resource "aws_lb_listener_rule" "api" { listener_arn = aws_lb_listener.main.arn priority = 100

action { type = "forward" target_group_arn = aws_lb_target_group.api.arn }

condition { path_pattern { values = ["/api/*"] } } } ```