The Challenge
When demoing the Terraform Provider for SAP BTP and talking about the advantages of Infrastructure as Code, one important aspect is that Terraform not only provisions the infrastructure but also covers the management of the resources, like updating configurations. While this is technically straightforward (namely, you update and apply the configuration), there are some challenges when doing this in a CI/CD setup.
Let us assume that we have a perfect setup in place: the configurations are stored in a source code management system, and we leverage a pull-based workflow in the code repository. After a successful review and merge, we have the new configuration in the main branch of the repository and will apply it to the target environment. From a technical perspective, applying it via a CI/CD pipeline is easy. However, what if we missed a glitch in the new configuration and some central resources unfortunately get deleted and created again? Let's say, for example, an SAP HANA Cloud instance. Oopsie! This would not be received well, I guess. Or to put it in more graphical terms:
How can we handle this? Are there any mechanisms to prevent, or at least to some extent safeguard against, this kind of issue without falling back to a manual workflow? There are. One huge advantage of sticking to (de-facto) standards like Terraform is that, first, we are probably not the first ones to come up with this question and, second, there is a huge ecosystem around Terraform that might help us with such challenges.
And for this specific scenario the solution is the Open Policy Agent. Let us take a closer look at what the solution could look like.
Note: This topic is not specific to SAP BTP but applies to Terraform in general. While this blog post focuses on SAP BTP, you can easily exchange the examples for other Terraform providers.
The Solution
First, some words about the Open Policy Agent (OPA). OPA is a general-purpose policy engine that enables policy enforcement agnostic of the stack you are using. It provides a query language called Rego. Citing the official documentation: "[...] Rego queries are assertions on data stored in OPA. These queries can be used to define policies that enumerate instances of data that violate the expected state of the system. [...]"
That sounds like something that can help us. Thinking about our Terraform scenario, it would be great if we could evaluate the terraform plan result of the new configuration and check if something unexpected is happening. If limits that we define in the policy are exceeded (like the number of deleted resources of type XYZ), we would stop the automatic processing via the CI/CD pipeline and let a human look at the setup.
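In a nutshell, the check we are after boils down to three steps that we will implement in the pipeline later on. A minimal sketch (the file and bundle names are illustrative and will be defined further down):

# 1. compute the planned changes and store them
terraform plan -out tfplan.binary
# 2. convert the binary plan into JSON so that OPA can read it
terraform show -json tfplan.binary > tfplan.json
# 3. ask OPA for a decision based on the policy bundle
opa exec --decision terraform/analysis/autoexec --bundle policy/ tfplan.json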
Luckily this scenario is also covered by a tutorial on the OPA website that you can find here. The tutorial uses AWS as an example, but making the adjustments for SAP BTP is straightforward. In the following sections we will walk through the different components of the solution and see how it works in action.
The Components
You find the complete code for this example in the GitHub repository https://github.com/btp-automation-scenarios/btp-terraform-opa. In the following sections we take a closer look at the code of the different solution components.
The Repository
For this showcase we will use a repository on GitHub to store our code (Terraform as well as the OPA Policy). The layout for the repository is as shown in the screenshot:
We have an infra folder that contains our Terraform configuration. The policy folder contains the OPA policy. And the .github folder, or more precisely the workflows folder therein, contains the GitHub Actions configuration for the OPA check run.
In addition, we will use GitHub Actions to run the CI/CD pipeline. We will look at the configuration in the corresponding section.
The Terraform configuration
The basis for the setup is a Terraform configuration in the infra folder (you find the complete code here). We do a very simple setup defined in the main.tf file:
###
# Setup of names in accordance with the company's naming conventions
###
locals {
  project_subaccount_name   = "${var.org_name} | ${var.project_name}: CF - ${var.stage}"
  project_subaccount_domain = lower(replace("${var.org_name}-${var.project_name}-${var.stage}", " ", "-"))
  project_subaccount_cf_org = replace("${var.org_name}_${lower(var.project_name)}-${lower(var.stage)}", " ", "_")
}

###
# Creation of subaccount
###
resource "btp_subaccount" "project" {
  name      = local.project_subaccount_name
  subdomain = local.project_subaccount_domain
  region    = lower(var.region)
  labels = {
    "stage"      = ["${var.stage}"],
    "costcenter" = ["${var.costcenter}"]
  }
  usage = "NOT_USED_FOR_PRODUCTION"
}

###
# Assignment of entitlements
###
resource "btp_subaccount_entitlement" "entitlements" {
  for_each = {
    for index, entitlement in var.entitlements :
    index => entitlement
  }
  subaccount_id = btp_subaccount.project.id
  service_name  = each.value.name
  plan_name     = each.value.plan
}
One subaccount and a list of entitlements are created. The entitlements are defined in the variables.tf file:
variable "entitlements" {
type = list(object({
name = string
plan = string
amount = number
}))
description = "List of entitlements for the subaccount."
default = [
{
name = "alert-notification"
plan = "standard"
amount = null
},
{
name = "SAPLaunchpad"
plan = "standard"
amount = null
},
{
name = "hana-cloud"
plan = "hana"
amount = null
},
{
name = "hana"
plan = "hdi-shared"
amount = null
},
{
name = "sapappstudio"
plan = "standard-edition"
amount = null
}
]
}
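For completeness: the configuration also needs a provider setup so that Terraform knows how to reach SAP BTP. A possible sketch of a provider.tf, assuming the credentials are passed via the BTP_USERNAME and BTP_PASSWORD environment variables (the exact version constraint and file layout in the repository may differ):

terraform {
  required_providers {
    btp = {
      source  = "SAP/btp"
      version = "~> 1.0"
    }
  }
}

# The global account subdomain is handed in as a variable; the user credentials
# are read from the BTP_USERNAME and BTP_PASSWORD environment variables.
provider "btp" {
  globalaccount = var.globalaccount
}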
Not a very complex setup, but enough to show the concept. For the sake of this example, we do not even apply any resources but execute the check against the plan of the initial setup. Let's move on to the heart of the solution, the OPA policy.
Definition of the Policy
We define the policy in the policy folder as terraform.rego. Following the Rego language conventions, we define a package and add some basic imports:
package terraform.analysis
import rego.v1
import input as tfplan
The rego.v1 import is a specific opt-in described here. In addition, we define the import of the Terraform plan (the result of the terraform plan command) as input for the further processing.
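To give you an idea what the policy operates on: the JSON representation of a Terraform plan contains (among many other attributes) a resource_changes array, and the policy only needs the type and the planned actions of each entry. A trimmed illustration of the relevant part (all other attributes omitted):

{
  "resource_changes": [
    {
      "address": "btp_subaccount.project",
      "type": "btp_subaccount",
      "name": "project",
      "change": {
        "actions": ["create"]
      }
    }
  ]
}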
Next, we define some parameters for the policy:
# acceptable score for automated authorization
blast_radius := 30
# weights assigned for each operation on each resource-type
weights := {
    "btp_subaccount": {"delete": 100, "create": 10, "modify": 1},
    "btp_subaccount_entitlement": {"delete": 10, "create": 1, "modify": 5},
}
# Consider exactly these resource types in calculations
resource_types := {"btp_subaccount", "btp_subaccount_entitlement"}
We define a blast_radius that represents an acceptable score for automated execution of the Terraform configuration. To be able to calculate the score, we define weights for each Terraform operation and resource type.
Now we must calculate the score for the Terraform plan that we provide as input. For that we define some helper rules that calculate the number of creations, deletions, and modifications for each resource type:
# list of all resources of a given type
resources[resource_type] := all if {
    some resource_type
    resource_types[resource_type]
    all := [name |
        name := tfplan.resource_changes[_]
        name.type == resource_type
    ]
}

# number of creations of resources of a given type
num_creates[resource_type] := num if {
    some resource_type
    resource_types[resource_type]
    all := resources[resource_type]
    creates := [res | res := all[_]; res.change.actions[_] == "create"]
    num := count(creates)
}

# number of deletions of resources of a given type
num_deletes[resource_type] := num if {
    some resource_type
    resource_types[resource_type]
    all := resources[resource_type]
    deletions := [res | res := all[_]; res.change.actions[_] == "delete"]
    num := count(deletions)
}

# number of modifications to resources of a given type
num_modifies[resource_type] := num if {
    some resource_type
    resource_types[resource_type]
    all := resources[resource_type]
    modifies := [res | res := all[_]; res.change.actions[_] == "update"]
    num := count(modifies)
}
There is quite some Rego magic going on here that is explained in the official documentation. As you see, we are using the tfplan input to extract the resources and operations from the Terraform plan and identify the number of create, update, and delete actions.
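If you want to convince yourself that these rules behave as expected, you can put a small unit test next to the policy and run it via opa test policy/. A possible sketch (the test file is not part of the repository, purely illustrative):

package terraform.analysis

import rego.v1

# a plan that creates exactly one subaccount should be counted as one "create"
test_single_subaccount_create if {
    num_creates["btp_subaccount"] == 1 with input as {
        "resource_changes": [{
            "type": "btp_subaccount",
            "change": {"actions": ["create"]}
        }]
    }
}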
Finally, we define the policy itself:
# Authorization holds if score for the plan is acceptable
default autoexec := false

autoexec if {
    score < blast_radius
}

# Compute the score for a Terraform plan as the weighted sum of deletions, creations, modifications
score := s if {
    all := [x |
        some resource_type
        crud := weights[resource_type]
        del := crud.delete * num_deletes[resource_type]
        new := crud.create * num_creates[resource_type]
        mod := crud.modify * num_modifies[resource_type]
        x := (del + new) + mod
    ]
    s := sum(all)
}
This snippet calculates the overall score and checks if the score is below the defined blast_radius. If the score is below the threshold, the autoexec "decision" is set to true and the Terraform plan could be executed automatically. We also have the score available as a "decision" when evaluating the policy.
You find the code of the policy here.
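By the way, you do not have to wait for the pipeline to try the policy out. Assuming you already have a tfplan.json at hand (see the next section for how to create it), you can evaluate both decisions locally via opa eval:

opa eval --data policy/ --input tfplan.json "data.terraform.analysis.score"
opa eval --data policy/ --input tfplan.json "data.terraform.analysis.autoexec"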
Integration with the CI/CD workflow
We must bring the bits and pieces together. We use a GitHub Actions workflow to run the OPA check. The configuration is as follows:
name: Evaluate Open Policy Agent for Terraform

on:
  workflow_dispatch:
    inputs:
      PROJECT_NAME:
        description: "Name of the project"
        required: true
        default: "sample-proj-opa"
      REGION:
        description: "Region for the sub account"
        required: true
        default: "eu10"
      COST_CENTER:
        description: "Cost center for the project"
        required: true
        default: "1234567890"
      STAGE:
        description: "Stage for the project"
        required: true
        default: "DEV"
      ORGANIZATION:
        description: "Organization for the project"
        required: true
        default: "B2B"

env:
  PATH_TO_TFSCRIPT: 'infra'

jobs:
  execute_base_setup:
    name: BTP Subaccount Setup
    runs-on: ubuntu-latest
    steps:
      - name: Check out Git repository
        id: checkout_repo
        uses: actions/checkout@v4

      - name: Setup Terraform
        id: setup_terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false
          terraform_version: latest

      - name: Setup Open Policy Agent
        id: setup_opa
        uses: open-policy-agent/setup-opa@v2
        with:
          version: latest

      - name: Terraform Init
        id: terraform_init
        shell: bash
        run: |
          terraform -chdir=${{ env.PATH_TO_TFSCRIPT }} init -no-color

      - name: Terraform plan
        id: terraform_plan
        shell: bash
        run: |
          export BTP_USERNAME=${{ secrets.BTP_USERNAME }}
          export BTP_PASSWORD=${{ secrets.BTP_PASSWORD }}
          terraform -chdir=${{ env.PATH_TO_TFSCRIPT }} plan -var globalaccount=${{ secrets.GLOBALACCOUNT }} -var region=${{ github.event.inputs.REGION }} -var project_name=${{ github.event.inputs.PROJECT_NAME }} -var stage=${{ github.event.inputs.STAGE }} -var costcenter=${{ github.event.inputs.COST_CENTER }} -var org_name=${{ github.event.inputs.ORGANIZATION }} -no-color --out tfplan.binary
          terraform -chdir=${{ env.PATH_TO_TFSCRIPT }} show -json tfplan.binary > tfplan.json

      - name: Execute OPA policy
        id: execute_opa
        shell: bash
        run: |
          autoexec=$(opa exec --decision terraform/analysis/autoexec --bundle policy/ tfplan.json | jq '.result[].result')
          score=$(opa exec --decision terraform/analysis/score --bundle policy/ tfplan.json | jq '.result[].result')
          echo "Automatic execution possible (true/false): ${autoexec}"
          echo "Score of change: ${score}"
For the sake of the demo, we use the workflow_dispatch event to trigger the workflow. The user can provide input parameters for the Terraform configuration. We use the predefined Actions hashicorp/setup-terraform@v3 and open-policy-agent/setup-opa@v2 to set up Terraform as well as OPA. Make sure to set terraform_wrapper to false in the Terraform setup to be able to use the terraform command directly.
After initializing the Terraform configuration via terraform init, we run the terraform plan command and store the plan via the -out parameter as tfplan.binary. OPA would not be able to work with the plan in this format, so we must convert it to JSON. We do this via the terraform show -json command and store the plan as tfplan.json.
Now we set the stage for the OPA policy execution. We use the opa exec command to run the policy and provide the Terraform plan as input. To be specific, we evaluate two decisions, namely autoexec and score. The opa exec command returns a JSON object. For the sake of readability, we parse the result via jq and print it to the console.
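In a real-world pipeline you would of course not stop at printing the values, but use the autoexec decision as a gate for a subsequent terraform apply step. A possible extension of the last run block (illustrative, not part of the workflow above):

# stop the pipeline if the policy does not allow automatic execution
if [ "${autoexec}" != "true" ]; then
  echo "Score ${score} exceeds the blast radius - manual review required."
  exit 1
fi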
That's it. What will the result be? Let's see it in action.
Let's see it in action
The execution of the GitHub Actions workflow looks like this:
So, the initial application of the configuration would be executed automatically, as the score is 15 and thus below the defined threshold of 30. Let us validate that this is correct:
We have the creation of the subaccount, which counts as 10 points. In addition, we create 5 entitlements, which count as 1 point each, i.e. 5 points. The total score is 15. The code is working as expected. Great!
The Conclusion
Terraform is a great tool for provisioning and managing your infrastructure. Automating the corresponding processes is technically quite easy, but the management side brings some additional aspects that need to be considered. The Open Policy Agent comes in quite handy here as a general-purpose policy engine that can help with overcoming challenges in the automation of the setup. The integration of the tools is quite straightforward, as the example in this blog post shows. I am not an expert in the Rego language; it is obviously quite powerful, but another topic to learn, and to be honest, I was quite happy to have a running example that I could take from the tutorials area of OPA and adjust with only minor changes.
While the sample I presented here is very simple, it clearly shows how to deal with the challenge of safeguarding your automated infrastructure management.
With that ... happy Terraforming!