What is a Terraform module? The simplest description is that a module is a collection of .tf
files located in the same directory. In fact, as we will see, we have been using modules all along.
In this lesson I will go through modules. This lesson covers section 5 of the Certified Terraform Associate exam curriculum. This part of the curriculum is outlined below:
Part | Content |
---|---|
5 | Interact with Terraform modules |
(a) | Contrast and use different module source options including the public Terraform Module Registry |
(b) | Interact with module inputs and outputs |
(c) | Describe variable scope within modules/child modules |
(d) | Set module version |
Motivation for modules
When you start working on a Terraform configuration you usually start with a single file named main.tf [1], and if your configuration is small enough this is the only file you will ever need.
Once your infrastructure grows you might start adding additional files to avoid having a single Terraform file of several hundred lines. A possible file structure for a moderately large Azure project might look like this:
$ tree .
.
├── compute.tf
├── data.tf
├── main.tf
├── network.tf
├── outputs.tf
└── variables.tf
0 directories, 6 files
Apart from main.tf we see a few additional files. Two common ones are variables.tf and outputs.tf. These files are used to gather variables (your configuration inputs) and outputs (your configuration outputs), respectively. Then we have three additional files that group related resources together:
- compute.tf holds your compute resources, such as virtual machines or Azure Functions.
- data.tf holds your data storage resources, such as storage accounts or Cosmos DBs.
- network.tf holds your network resources, such as virtual networks and security groups.
Imagine now that you want to create two virtual networks, each containing a number of virtual machines and databases. The two networks should be more or less identical. Do you add the new network resources to network.tf, the new virtual machines to compute.tf, and the new databases to data.tf? It would work, but it would be tedious to maintain, especially if you eventually want to add a third copy of everything.
Enter modules! A module is a Terraform configuration that you can create multiple copies of. In the example above we could create a module that contains the virtual network together with the compute and data resources that we want multiple instances of.
Reorganizing the file structure from before to use modules we might do something like this:
$ tree .
.
├── main.tf
├── modules
│   └── environment
│       ├── compute.tf
│       ├── data.tf
│       ├── main.tf
│       ├── network.tf
│       ├── outputs.tf
│       └── variables.tf
├── outputs.tf
└── variables.tf
2 directories, 9 files
In the modules directory I have created a module named environment. In this module the file structure looks similar to what I had before, including main.tf, variables.tf, outputs.tf, etc. A good practice when designing modules is to hide the underlying details and create an abstraction that describes the purpose of the module rather than its technical contents. In this case I call it my environment module because it represents an environment for an application; I do not call it my compute-network-data module.
What is left outside of the modules directory is known as the root module. We always have at least one module, and that is the root module. The root module is where you run terraform init, terraform plan, and terraform apply. From the root module you can include other modules. As I did above, there is a convention to put your modules in a modules directory, although this is not mandatory.
Now that we have divided our configuration into a root module and an environment module, will the configuration magically work? Not quite: the root module still has to reference the environment module. In the next section we will look at how we tie the root module and an additional module together, and how we can create multiple instances of a module.
Using a module
What does a module block in HCL look like? The general format is:
module "local_name" {
  source    = "./path/to/module"
  argument1 = <expression>
  argument2 = <expression>
  ...
}
The module block has a single label defining the local name of the module. This name is used to refer to the module from other parts of your Terraform configuration.
The only required argument is the source argument. This argument either points to a local directory where the module is located, or it points to a remote module that Terraform should download. More about remote modules in the Terraform registry later in this lesson.
If the module has variable blocks defined, then these can be populated with values through correspondingly named arguments in the module block. If the module has output blocks defined, then the values of the outputs can be accessed by referring to the module as module.<local name>.<output name>. We'll see examples of both input arguments and output values later.
For simplicity let us assume we have a single file in our module, and we'll call it main.tf. In main.tf we define a single variable and a single output, which allows us to interact with the module through inputs and outputs. Here is the full definition of a module named favorite_number_module:
// modules/favorite_number_module/main.tf
variable "my_favorite_number" {
  type = number
}

output "double_my_favorite_number" {
  value = var.my_favorite_number * 2
}
In our root module we can now create an instance of this module using the module block:
// main.tf
module "number_module" {
  source             = "./modules/favorite_number_module"
  my_favorite_number = 8
}
If we want to create multiple copies of the module we can either create several module blocks:
// main.tf
module "first_number_module" {
  source             = "./modules/favorite_number_module"
  my_favorite_number = 8
}

module "second_number_module" {
  source             = "./modules/favorite_number_module"
  my_favorite_number = 9
}
Or, we could use the meta-arguments count or for_each that we saw for resource blocks in a previous lesson:
// main.tf
module "number_module" {
  count              = 3
  source             = "./modules/favorite_number_module"
  my_favorite_number = count.index
}
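The count variant identifies each instance by its position. As a rough sketch of the for_each alternative mentioned above, each instance is identified by a string key instead. The keys small and large and their values are made up for illustration.

```hcl
// main.tf
module "number_module" {
  // One module instance per map entry, addressed as
  // module.number_module["small"] and module.number_module["large"].
  for_each = {
    small = 3
    large = 42
  }

  source             = "./modules/favorite_number_module"
  my_favorite_number = each.value
}
```

A practical difference is that removing a key from the map only affects that one instance, whereas lowering count or reordering a list can shift the indexes of the remaining instances.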
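To access the output value from a single instance of our module (declared without count) we can write module.number_module.double_my_favorite_number. When the module is declared with count, each instance must be addressed by its index, for example module.number_module[0].double_my_favorite_number.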
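To make these references concrete, here is a minimal sketch of root module outputs that pass module values on. The output names first_doubled and all_doubled are hypothetical. The sketch assumes the root module contains both the first_number_module block and the count = 3 variant of number_module shown earlier.

```hcl
// outputs.tf (root module)

// Single instance: reference the module output directly.
output "first_doubled" {
  value = module.first_number_module.double_my_favorite_number
}

// With count, module.number_module is a list of instances, so either
// index one instance or use a splat expression to collect every value.
output "all_doubled" {
  value = module.number_module[*].double_my_favorite_number
}
```

With the inputs used above, first_doubled should evaluate to 16 and all_doubled to the list [0, 2, 4].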
How to structure modules?
The difficult thing about modules is not understanding how they work. After all, we have been using modules all along (the root module). The difficult part is usually knowing which modules to create and how to split a large configuration into modules. There is no single correct answer to this question except for "it depends".
If you have a number of resources that are closely related and could make up a good abstraction, then they are good candidates for a module. An example could be a website consisting of a web server, a database, a DNS name, and a few other bits and pieces. That would make up a good website abstraction that fits very well in a module.
Another obvious example is if you have a certain part of your configuration that is repeated several times, then that is a good candidate to become a module. This could include a situation where you want to deploy your infrastructure to multiple cloud regions for instance, allowing you to do something like this:
module "europe" {
  source = "./modules/environment"
  region = "eu-north-1"
}

module "america" {
  source = "./modules/environment"
  region = "us-east-1"
}
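For these module blocks to be valid, the environment module has to declare a matching input variable, otherwise Terraform rejects region as an unsupported argument. A minimal sketch of what that declaration could look like follows. The description text and the environment_region output are assumptions made for illustration.

```hcl
// modules/environment/variables.tf
variable "region" {
  type        = string
  description = "Cloud region to deploy this environment to"
}

// modules/environment/outputs.tf
// Hypothetical output that hands the region back to the caller.
output "environment_region" {
  value = var.region
}
```

The root module could then read the value as module.europe.environment_region or module.america.environment_region.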
One last idea I have seen in the wild is to have modules that correspond to different environments. So you might have a module for your stage environment, and another module for your production environment. This is a valid approach, and could be useful if you want to have a single configuration for all your environments. However, this approach only makes sense if there are significant differences between your environments, otherwise a single module with a number of cleverly selected input arguments should be a better approach.
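As a sketch of the "single module with well-chosen inputs" alternative, the stage and production environments below share the same module and differ only in their arguments. The vm_count and vm_size inputs are hypothetical. The environment module would need to declare them as variables.

```hcl
// main.tf
module "stage" {
  source = "./modules/environment"
  region = "eu-north-1"

  // Hypothetical inputs: stage is smaller and cheaper than production.
  vm_count = 1
  vm_size  = "small"
}

module "production" {
  source = "./modules/environment"
  region = "eu-north-1"

  vm_count = 3
  vm_size  = "large"
}
```

If the two blocks start to need many inputs that only one of them uses, that may be a sign the environments differ enough to justify separate modules.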
Using the public registry
What I did above was to use a local module. We have seen the public Terraform registry when we looked at providers, but the registry also has many modules that we can use.
Let us use an example from AWS to see how we can use a publicly available module. When setting up a virtual network (or Virtual Private Cloud, VPC) in AWS there are a lot of resources you must create. One popular module is the AWS VPC module. The documentation for this module is available at registry.terraform.io/modules/terraform-aws-modules/vpc/aws. The simplest example of using this module looks like this:
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "3.19.0"
}
Just like providers, modules have version numbers. Check the documentation of the module you want to use to find the current version. Apart from the version we must also specify the source argument, which tells Terraform which module to use. When we used local modules above, the source argument started with ./, which is how Terraform knows it is a local module (it would also be valid to start with ../). When using a module from the public registry the format of the source is <NAMESPACE>/<NAME>/<PROVIDER>. In this example the NAMESPACE is terraform-aws-modules, the NAME is vpc, and the PROVIDER is aws.
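The version argument accepts a version constraint, not only an exact version. As a sketch, the block below allows patch releases in the 3.19 series and passes a few inputs. The input names are the ones I believe the 3.x series of this module documents, and all values are placeholders, so check them against the registry page before relying on them.

```hcl
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  // "~> 3.19.0" allows 3.19.x patch releases but not 3.20.0 or later.
  version = "~> 3.19.0"

  // Placeholder values for illustration only.
  name = "example-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["eu-north-1a", "eu-north-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]
}
```

Note that the version argument is only supported for modules installed from a registry. Local modules always use whatever is in the directory that source points to.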
Apart from local modules and modules from the public registry you could also store your own modules in various locations, e.g. GitHub, Bitbucket, S3 buckets in AWS, or GCS buckets in Google Cloud. I do not include any example of how this is done, it is very similar to what we have seen, but in my journey with Terraform I have not used any other module source (and neither did I get any questions on that during the certification exam).
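For reference, here is a rough sketch of what a few of these other source types can look like. The organization, repository, and bucket names are placeholders, not real modules.

```hcl
// A GitHub repository. The version argument is not supported here,
// so the ref query parameter selects a tag or branch instead.
module "from_github" {
  source = "github.com/example-org/terraform-example-module?ref=v1.2.0"
}

// A generic Git repository over HTTPS. The double slash selects
// a subdirectory inside the repository.
module "from_git" {
  source = "git::https://example.com/infra/modules.git//environment?ref=v1.2.0"
}

// A module archive stored in an S3 bucket.
module "from_s3" {
  source = "s3::https://s3-eu-west-1.amazonaws.com/example-bucket/environment.zip"
}
```

In all three cases terraform init downloads the module into the .terraform/modules directory, just as it does for registry modules.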
Let's run terraform init for the sample configuration above with the AWS VPC module. Before I run terraform init I have the following file structure:
$ tree .
.
└── main.tf
0 directories, 1 file
Then I run terraform init:
$ terraform init
Initializing modules...
Downloading registry.terraform.io/terraform-aws-modules/vpc/aws 3.19.0 for vpc...
- vpc in .terraform/modules/vpc
... (output truncated) ...
Terraform has been successfully initialized!
My directory structure after running terraform init looks like this:
$ tree -a -L 4 .
.
├── .terraform
│   ├── modules
│   │   ├── modules.json
│   │   └── vpc
│   │       ├── .editorconfig
│   │       ├── .git
│   │       ├── .github
│   │       ├── .gitignore
│   │       ├── .pre-commit-config.yaml
│   │       ├── .releaserc.json
│   │       ├── CHANGELOG.md
│   │       ├── LICENSE
│   │       ├── README.md
│   │       ├── UPGRADE-3.0.md
│   │       ├── examples
│   │       ├── main.tf
│   │       ├── modules
│   │       ├── outputs.tf
│   │       ├── variables.tf
│   │       ├── versions.tf
│   │       └── vpc-flow-logs.tf
│   └── providers
│       └── registry.terraform.io
│           └── hashicorp
├── .terraform.lock.hcl
└── main.tf
10 directories, 16 files
I limited the depth of the tree command because there was a lot of content. We see that the module has been downloaded from the public registry to the local directory .terraform/modules/vpc.
Summary
In this lesson we did a deep-dive into Terraform modules!
In summary we learned:
- We always have at least one module in all our Terraform configurations, and that is the root module.
- Modules are a way to group related resources together in order to be able to create multiple copies of the resources in a module.
- We saw how to declare a module block in HCL.
- We saw how we provide values for variable blocks in a module through arguments in the module block.
- We saw how we can access values from output blocks in modules through module.<local name>.<output name>.
- We learned how we can use publicly available modules from the Terraform registry.
There are a few minor details I skipped, for instance how to upgrade a module from one version to another, and what the file .terraform/modules/modules.json is, but these details are not necessary for the certification exam.
[1] The name main.tf is a convention, but it could be named almost anything. However, it would be wise to stick to the convention!