Have you ever wondered how Terraform keeps track of the resources you create, and how it applies your changes to just those specific resources?
If you have used Terraform, you will have noticed that every time you ran terraform plan, terraform apply or terraform destroy, Terraform was able to find the resources it created previously and update them accordingly. How? How did Terraform know which resources it was supposed to manage? Your AWS account could contain infrastructure deployed in a variety of ways (e.g. some manually, some via Terraform, some via the CLI, some via AWS SAM), so how does Terraform know which particular infrastructure it created and is therefore responsible for?
In two words: “Terraform State”.
📌 What is Terraform State
Terraform keeps track of everything it creates in your cloud environments, so that it knows what it created and can go back and make modifications for you when you need to update or remove something later. This is what is meant when people say that Terraform is a stateful application.
Terraform records information about the infrastructure it created in a Terraform state file. By default, when you run terraform init in the folder /hello/world, Terraform creates the file /hello/world/terraform.tfstate. This file uses a custom JSON format that records a mapping between the Terraform resources in your configuration files and their real-world representations.
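For illustration only, a heavily trimmed state file might look roughly like this (the exact fields vary across Terraform versions, and real files contain many more attributes; the resource shown is the S3 bucket we create later in this article):
{
  "version": 4,
  "terraform_version": "1.5.0",
  "resources": [
    {
      "mode": "managed",
      "type": "aws_s3_bucket",
      "name": "tf-state",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "attributes": {
            "id": "tf-code-state",
            "arn": "arn:aws:s3:::tf-code-state"
          }
        }
      ]
    }
  ]
}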
Every time you run Terraform, it can fetch the latest status of your resources from your cloud account and compare that to what’s in your Terraform configurations to determine what changes need to be applied. In other words, the output of the plan command is a diff between the code on your computer and the infrastructure deployed in the real world, as discovered via IDs in the state file.
⚠️⚠️ Note
The state file format is a private API that is meant only for internal use within Terraform. You should never edit the Terraform state files by hand or write code that reads them directly. If for some reason you need to manipulate the state file—which should be a relatively rare occurrence—use the terraform import or terraform state commands.
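For example, rather than editing the file by hand, you can inspect and manipulate state safely with commands like these (the resource address and bucket name are placeholders matching the examples later in this article):
terraform state list                                    # list every resource tracked in the state
terraform state show aws_s3_bucket.tf-state             # show the attributes recorded for one resource
terraform import aws_s3_bucket.tf-state tf-code-state   # adopt an existing bucket into the state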
📌 Benefits of the Terraform State File
✨ Idempotence
Whenever a Terraform configuration is applied, Terraform checks whether an actual change has been made. Only the resources that changed will be updated.
✨ Deducing dependencies
Terraform maintains a list of dependencies in the state file so that it can properly deal with dependencies that no longer exist in the current configuration.
✨ Performance
Terraform can be told to skip the refresh even when a configuration change is made, and a particular resource can be refreshed without triggering a full refresh of the state, improving performance (see the command sketch after this list).
✨ Collaboration
State keeps track of the version of an applied configuration, and it can be stored in a remote, shared location, so collaboration happens without overwriting each other's state.
✨ Auditing
Invalid access can be identified by enabling logging.
✨ Safer storage
Storing state on a remote server helps prevent sensitive information from being exposed.
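As a rough illustration of the performance point, the Terraform CLI has flags for both behaviours (the resource address is a placeholder):
terraform plan -refresh=false                     # skip refreshing the real-world status of every resource
terraform apply -target=aws_s3_bucket.tf-state    # plan and apply changes for a single resource only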
📌 Limitations Teams Face with Terraform State Files
All of the above is great if you are working on a project alone: you can get away with storing state in a single terraform.tfstate file that lives locally on your computer, and it will work just fine without any conflicts. However, when working on a large project alongside your team members, storing the terraform.tfstate file locally presents the following problems:
- ✨ Shared storage for state files
To be able to use Terraform to update your infrastructure, each of your team members needs access to the same Terraform state files. That means you need to store those files in a shared location. If you're thinking of using version control such as Git, think again. Although you should definitely store your Terraform code in version control, storing Terraform state in version control is a bad idea, as it leads to a new set of problems: manual error, lack of locking, and secrets exposure.
Manual error refers to the ease with which one can forget to pull down the latest changes from version control before running Terraform, or to push the latest changes to version control after running Terraform. It's just a matter of time before someone on your team runs Terraform with out-of-date state files and, as a result, accidentally rolls back or duplicates previous deployments. Secrets exposure arises because all data in Terraform state files is stored in plain text, which is a problem since certain Terraform resources need to store sensitive data.
- ✨ Locking state files
As soon as data is shared, you run into a new problem: locking. Without locking, if two team members are running Terraform at the same time, you can run into race conditions as multiple Terraform processes make concurrent updates to the state files, leading to conflicts, data loss, and state file corruption. Locking using version control such as Git is not possible as most version control systems do not provide any form of locking that would prevent two team members from running terraform apply on the same state file at the same time.
- ✨ Isolating state files
When making changes to your infrastructure, it’s a best practice to isolate different environments. For example, when making a change in a testing or staging environment, you want to be sure that there is no way you can accidentally break production. But how can you isolate your changes if all of your infrastructure is defined in the same Terraform state file?
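One common answer, shown here as a sketch only (jumping ahead slightly to the S3 backend covered below; bucket name and keys are hypothetical), is to give each environment its own state file by pointing each environment's configuration at a different backend key:
# stage/main.tf — staging state lives at its own key
terraform {
  backend "s3" {
    bucket = "tf-code-state"
    key    = "stage/terraform.tfstate"
    region = "us-east-1"
  }
}

# prod/main.tf — production state is fully isolated from staging
terraform {
  backend "s3" {
    bucket = "tf-code-state"
    key    = "prod/terraform.tfstate"
    region = "us-east-1"
  }
}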
📌 Using Terraform's Remote Backends
The ideal way to handle shared storage for state files is to use Terraform's built-in support for remote backends rather than version control. A backend determines how Terraform loads and saves state. The local backend, which saves the state file on your local disk, is the default you use when you work on a project alone. Remote backends let you keep the state file in a remote, shared store. There are several remote backends available, including:
✨ HashiCorp's Terraform Cloud and Terraform Enterprise,
✨ Amazon S3,
✨ Azure Storage,
✨ Google Cloud Storage, and others.
Remote backends solve the manual error, locking, and secrets exposure problems we mentioned could arise with the use of version control.
There is no danger of manual error because, once you have configured a remote backend, Terraform will automatically load the state file from that backend every time you run plan or apply, and will automatically store the state file in that backend after each apply.
As for locking, note that it is supported natively by the majority of remote backends. Terraform will automatically acquire a lock before executing terraform apply; if someone else is already running apply, they will hold the lock, and you will need to wait. You can run apply with the -lock-timeout=<TIME> argument to tell Terraform to wait up to TIME for a lock to be released (for example, -lock-timeout=10m will wait for 10 minutes).
Most of the remote backends natively support encryption in transit and encryption at rest of the state file. Although the best solution would be for Terraform to natively support encrypting secrets within the state file, these remote backends reduce most of the security concerns, given that at least the state file isn’t stored in plain text on disk anywhere.
We will be using Amazon S3, AWS's managed file store. It is the best bet as a remote backend when working within an AWS environment for several reasons.
⚠️ NOTE
The easiest, quickest, and arguably most effective way to do this is to manually create your S3 bucket and DynamoDB table via the Management Console or the AWS CLI and then configure the backend in your Terraform script. However, that method introduces a level of manual configuration that could lead to manual error.
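For reference, that manual route might look roughly like this with the AWS CLI (using the bucket and table names from later in this article; adjust the names and region to your own):
aws s3api create-bucket --bucket tf-code-state --region us-east-1

aws dynamodb create-table \
  --table-name tf-state-lock \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST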
Even though the method I am about to show you makes the whole process automated and repeatable, its limitation is that the creation of our resources must then happen in a two-step process:
- First we write Terraform code to create the S3 bucket and DynamoDB table and deploy that code with a local backend.
- Then we go back to the Terraform code, add a remote backend configuration to it to use the newly created S3 bucket and DynamoDB table, and run terraform init to copy your local state to S3.
If you ever want to delete the S3 bucket and DynamoDB table, you’d have to do this two-step process in reverse:
- First go to the Terraform code, remove the backend configuration, and rerun terraform init to copy the Terraform state back to your local disk.
- Then run terraform destroy to delete the S3 bucket and DynamoDB table.
✨ I personally always prefer to create my bucket and table manually.
Also, you can use the same bucket and table for several different configurations, but be careful if you do: deleting those resources would mean losing the state files of every environment that uses them.
Follow along to see how you can use terraform to create the bucket and table where you want to store your Terraform state.
📌 Creating an S3 Remote Backend
To enable remote state storage with Amazon S3, the first step is to create an S3 bucket.
- ✨ Create a main.tf file in a new folder, using any code editor of your choice (e.g., VS Code). Specify AWS as the provider at the top of the file:
provider "aws" {
region = "us-east-1"
}
✨ Next, create an S3 bucket by using the aws_s3_bucket resource:
resource "aws_s3_bucket" "tf-state" {
bucket = "tf-code-state"
#The below line of code will prevent accidental deletion of this bucket
lifecycle {
prevent_destroy = true
}
}
⚠️⚠️ Note that S3 bucket names must be globally unique among all AWS customers. Therefore, you will need to change the bucket parameter from "tf-code-state" (which I already created) to your own name. Make sure to remember this name and take note of what AWS region you’re using because you’ll need both pieces of information again a little later on.
📌 Adding Security Elements to our Remote Backend
Let’s now add several extra layers of protection to this S3 bucket.
✨ First, use the aws_s3_bucket_versioning resource to enable versioning on the S3 bucket, so that every update to a file in the bucket actually creates a new version of that file. This allows you to see older versions of the file and revert to them at any time, which can be a useful fallback mechanism if something goes wrong:
# Enable versioning so that we can see previous versions of our state file
resource "aws_s3_bucket_versioning" "enabled" {
  bucket = aws_s3_bucket.tf-state.id

  versioning_configuration {
    status = "Enabled"
  }
}
✨ Next, we will use the aws_s3_bucket_server_side_encryption_configuration resource to turn on server-side encryption by default for all data written to this S3 bucket. This ensures that your state files, and any secrets they might contain, are always encrypted on disk when stored in S3:
# Enable server-side encryption by default
resource "aws_s3_bucket_server_side_encryption_configuration" "default" {
  bucket = aws_s3_bucket.tf-state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}
✨ Next, block all public access to the S3 bucket using the aws_s3_bucket_public_access_block resource. S3 buckets are private by default, but as they are often used to serve static content (e.g., images, fonts, CSS, JS, HTML), it is possible, even easy, to make them public. Since your Terraform state files may contain sensitive data and secrets, it's worth adding this extra layer of protection to ensure no one on your team can ever accidentally make this S3 bucket public:
# Explicitly block all public access to the S3 bucket
resource "aws_s3_bucket_public_access_block" "public_access" {
  bucket                  = aws_s3_bucket.tf-state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
📌 Create a DynamoDB Table for Locking
✨ The next thing we have to do is create the DynamoDB table to use for locking. DynamoDB is Amazon's distributed key-value store. It supports strongly consistent reads and conditional writes, which are all the ingredients you need for a distributed lock system. Moreover, it's completely managed, so you don't have any infrastructure to run yourself, and it's inexpensive, with most Terraform usage easily fitting into the free tier.
To use DynamoDB for locking with Terraform, you must create a DynamoDB table that has a primary key called LockID (with this exact spelling and capitalization). You can create such a table using the aws_dynamodb_table resource:
# Create DynamoDB table for Terraform state locking
resource "aws_dynamodb_table" "tf_state_lock" {
  name         = "tf-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
📌 Create Resources
Now we will create the table and the bucket.
We have not configured our remote backend yet (remember the two-step process we spoke about earlier?), so our state file will be stored locally for now.
Run the commands below to initialize terraform, show the plan and create the resources:
terraform init
terraform plan
terraform apply
⚠️ Note that when you run terraform apply without the -auto-approve flag, you will be prompted to either approve or decline the creation with a yes or no. Type yes to continue with the process.
📌 Configure the remote backend and move the state file
Now our bucket and DynamoDB table are both created, but our state file is still saved locally, so we need to move it into the resources we just created.
To do this, we first update our configuration by adding a backend configuration to our Terraform code. This configuration goes in the terraform block, as shown below.
At the top of your main.tf, add the below block of code:
terraform {
  backend "s3" {
    # Replace this with your bucket name!
    bucket         = "tf-code-state"
    key            = "global/s3/terraform.tfstate"
    region         = "us-east-1"
    # Replace this with your DynamoDB table name!
    dynamodb_table = "tf-state-lock"
    encrypt        = true
  }
}
Now simply rerun the terraform init command.
This instructs Terraform to store your state file in the S3 bucket. When you run the command, Terraform will automatically detect that you already have a state file locally and prompt you to copy it to the new S3 backend.
After running the command, your Terraform state will be stored in the S3 bucket. You can confirm this by heading to the AWS Management Console, opening S3, and clicking into your bucket.
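Alternatively, assuming the bucket name and key used in the backend configuration above, you can check from the command line:
aws s3 ls s3://tf-code-state/global/s3/
# terraform.tfstate should appear in the listing if the migration succeeded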
📌 Cleanup
To destroy your infrastructure and clean up your environment, you need to remove the backend configuration you just added, run terraform init again, and then run terraform destroy.
This way the state file is copied back into your local environment, and you can then delete the S3 bucket. This is only necessary because this configuration is the same one you used to create the S3 bucket and DynamoDB table; if you hadn't created the backend S3 bucket and DynamoDB table with this configuration, you wouldn't need to remove the backend configuration from your setup.
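A rough sketch of that reverse sequence (assuming Terraform 1.x; note that you may also need to remove the prevent_destroy = true lifecycle setting, and empty the versioned bucket, before the destroy will succeed):
# Step 1: delete the terraform { backend "s3" { ... } } block from main.tf, then:
terraform init -migrate-state   # copies the state from S3 back to your local disk

# Step 2: destroy everything, including the bucket and table:
terraform destroy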
Hope you found this helpful, if you did please share with your community and connect with me on LinkedIn and follow me on GitHub.
Follow for more DevOps content broken down into simple understandable structure.