Starting up...
0%
Back to all articles
Terraform State Management: Why Remote Backends Are Critical
Terraform March 13, 2025

Terraform State Management: Why Remote Backends Are Critical

Kabir Dikko
DevOps Engineer

Terraform State Management: Why Remote Backends Are Critical

When working with Terraform in production environments, one of the most critical decisions you’ll make is how to manage your state files. While Terraform makes it easy to get started with local state files, this approach quickly becomes problematic in team environments and can lead to serious infrastructure issues.

Understanding Terraform State

Terraform state is a JSON file that maps the resources in your configuration to real-world infrastructure. It serves several critical purposes:

  • Resource Tracking: Maps configuration to real-world resources
  • Metadata Storage: Tracks resource dependencies
  • Performance Optimization: Caches resource attributes to improve plan operations
  • Collaboration: Enables team members to work on the same infrastructure

By default, Terraform stores state locally in a file named terraform.tfstate. This works fine for personal projects or learning, but introduces significant risks in production environments.

The Dangers of Local State

Using local state files in production can lead to several serious problems:

1. State File Loss

Local state files can be accidentally deleted, corrupted, or lost when stored on a developer’s machine. Without proper state, Terraform loses track of what resources it’s managing, potentially leading to:

  • Duplicate resource creation
  • Inability to modify existing resources
  • Manual cleanup requirements

2. Concurrent State Modifications

When multiple team members run Terraform against the same infrastructure with local state files, race conditions occur:

Developer A                     Developer B
-----------                     -----------
terraform plan                  
                               terraform plan
terraform apply                 
                               terraform apply (overwrites A's changes)

This leads to state file conflicts, lost updates, and infrastructure drift.

3. Sensitive Information Exposure

Terraform state contains all resource attributes, including sensitive ones like database passwords and private keys. Local state files are typically unencrypted and may be accidentally committed to version control.

Remote Backends: The Solution

Remote backends solve these problems by storing state in a shared, remote location with proper locking mechanisms. The most common options include:

  • AWS S3 with DynamoDB: S3 for state storage, DynamoDB for state locking
  • Azure Blob Storage: Native Azure solution with locking support
  • Google Cloud Storage: GCP’s equivalent option
  • Terraform Cloud/Enterprise: HashiCorp’s managed solution with additional features
  • Consul: Self-hosted distributed key-value store

Implementing a Remote Backend with S3 and DynamoDB

Here’s how to configure an S3 backend with DynamoDB locking:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

This configuration:

  • Stores state in an S3 bucket with encryption
  • Uses a DynamoDB table for state locking
  • Organizes states by environment using different keys

Real-World Example: VPC Cleanup Utility

In my VPC Cleanup Utility project, I encountered a perfect example of why remote state is critical. This utility helps clean up orphaned AWS VPC resources, which often result from improperly managed Terraform state.

The Problem: Duplicate VPCs

One of the most common issues I’ve faced was the proliferation of duplicate VPCs across AWS accounts. In one particular project, our team ended up with 7 identical VPCs in the same region, all with similar CIDR blocks and naming patterns. How did this happen?

The root cause was improper state management:

  1. A developer would create infrastructure using local state
  2. The state file would be lost when they switched machines or cleaned up their workspace
  3. When they needed to make changes, Terraform couldn’t find the state file
  4. Instead of recovering the state, they would run terraform apply again
  5. Result: A completely new VPC with all associated resources was created

Each duplicate VPC incurred costs for NAT gateways, unused Elastic IPs, and other resources. More concerning was the security risk of having untracked network infrastructure.

The Cleanup Challenge

Cleaning up these resources manually is tedious and error-prone. AWS has dependencies between resources that must be deleted in the correct order. For example, you can’t delete a VPC until you’ve removed all:

  • EC2 instances
  • Load balancers
  • NAT gateways
  • Network interfaces
  • Security groups (except the default)
  • Subnets
  • Route tables (except the main)
  • Internet gateways

My VPC Cleanup Utility includes a script similar to this one to handle the complex deletion process:

#!/usr/bin/env bash

# First, identify and delete network interfaces
aws ec2 describe-network-interfaces \
  --filters "Name=vpc-id,Values=${VPC_ID}" \
  --query "NetworkInterfaces[?Status=='available'].NetworkInterfaceId" \
  --output text | \
  xargs -r -n1 aws ec2 delete-network-interface --network-interface-id

# Then handle security groups (excluding default)
aws ec2 describe-security-groups \
  --query "SecurityGroups[?GroupName != 'default' && VpcId=='${VPC_ID}'].GroupId" \
  --output text | \
  xargs -r -n1 aws ec2 delete-security-group --group-id

This script is just a small part of the cleanup process. The full utility handles all dependent resources in the correct order.

The Prevention: Remote State

The project includes a Terraform module that:

  1. Sets up proper remote state configuration
  2. Implements state locking to prevent concurrent modifications
  3. Includes state backup mechanisms
module "terraform_state" {
  source = "./modules/terraform_state"
  
  bucket_name      = "company-terraform-state"
  dynamodb_table   = "terraform-locks"
  enable_versioning = true
  enable_encryption = true
}

By implementing proper state management from the beginning, you can avoid the need for cleanup utilities altogether.

Lessons Learned

This experience taught me several valuable lessons:

  1. Document your state location: Ensure all team members know where state is stored
  2. Implement state recovery procedures: Have a process for recovering or importing resources into state
  3. Use resource naming conventions: Include environment or team identifiers to spot duplicate resources
  4. Regular infrastructure audits: Periodically review your cloud resources against your Terraform state

Best Practices for Terraform State Management

Based on my experience, here are key recommendations for state management:

1. Use Remote Backends from Day One

Even for small projects, set up a remote backend immediately. The effort is minimal compared to the potential problems it prevents.

2. Implement State Locking

Always use a backend that supports locking to prevent concurrent modifications.

3. Enable Encryption

Ensure your state is encrypted both in transit and at rest to protect sensitive information.

4. Implement Access Controls

Restrict access to your state files using appropriate IAM policies or access controls.

5. Use Workspaces or Separate State Files

For multi-environment deployments, use either:

  • Terraform workspaces for smaller projects
  • Separate state files (different keys/paths) for larger projects

6. Back Up Your State

Even with remote backends, implement regular backups of your state files.

Conclusion

Proper state management is not optional for production Terraform deployments—it’s essential. Remote backends provide the collaboration, security, and reliability features needed for professional infrastructure management.

By implementing remote backends early in your Terraform journey, you’ll avoid many common pitfalls and build a more robust infrastructure management process.

Remember: The small upfront investment in proper state management will save you from potentially catastrophic infrastructure problems down the line.

Share this article