Khang Tran

Posted on Jul 11, 2024

Creating an Auto-Scaling Web Server Architecture

Since completing the AWS Cloud Resume Challenge, I've been more curious about Terraform. Today, I'll be using Terraform to create an AWS architecture containing public subnets, private subnets, an Application Load Balancer (ALB), and an Auto Scaling Group (ASG) for EC2 instances. The ASG scales the number of instances up or down based on specific CPU usage thresholds.

This kind of automatic scaling helps a business cut costs, because extra instances only run (and get billed) while the load actually calls for them.

To start the project, I created another repository on GitHub and cloned it to my local computer.

I created a main.tf file:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
provider "aws" {
  region     = "us-east-1"
}

I made sure to define my environment variables in the .bashrc file.

Run:

nano ~/.bashrc

and define your variables (bash does not allow spaces around the = sign):

export AWS_ACCESS_KEY_ID="<your aws user access key>"
export AWS_SECRET_ACCESS_KEY="<your aws user secret key>"

After saving the file, reload it so the variables are available in the current session.

To reload, run:

source ~/.bashrc

The variables have to be defined in every new bash session, which is why they go in .bashrc: bash reads that file each time it starts an interactive session.

Defining the variables in .bashrc means we no longer need these lines in the provider block:

access_key = "AWS_ACCESS_KEY_ID"
secret_key = "AWS_SECRET_ACCESS_KEY"

because the AWS provider reads the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables on its own.

To create a VPC, add this to main.tf:

# Create a VPC
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}

After running the commands:

terraform init
terraform apply

I see that Terraform has finished creating my VPC.

I checked the AWS console to make sure it was created.

The IDs match up, so Terraform is configured correctly. One thing to note: the name "example" is only the local name Terraform uses to identify the resource. If we want to name the VPC itself, we have to add a Name tag to the resource.

resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name = "example-vpc"
  }
}

We can see here that we don't have any subnets yet. We want to make 3 public and 3 private subnets.

Here is how to implement them:

# Subnets
resource "aws_subnet" "public_1" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-1a"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "public_2" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "us-east-1b"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "public_3" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.3.0/24"
  availability_zone = "us-east-1c"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private_1" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.4.0/24"
  availability_zone = "us-east-1a"
}

resource "aws_subnet" "private_2" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.5.0/24"
  availability_zone = "us-east-1b"
}

resource "aws_subnet" "private_3" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.6.0/24"
  availability_zone = "us-east-1c"
}

Spreading the subnets across different Availability Zones provides high availability: if the EC2 instances in one zone are shut down for any reason, instances in the other zones can keep serving traffic.
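The six subnet blocks above differ only in their index and Availability Zone, so they could also be generated. The following is a minimal sketch of that idea, not what this project does: the local value azs and the resource names public and private are hypothetical, and switching to it would change the resource addresses (for example aws_subnet.public[0] instead of aws_subnet.public_1).

```hcl
# Hypothetical alternative to the six hand-written subnet blocks.
locals {
  azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

resource "aws_subnet" "public" {
  count = length(local.azs)

  vpc_id = aws_vpc.example.id
  # cidrsubnet("10.0.0.0/16", 8, n) yields "10.0.n.0/24"; n is 1, 2, 3 here.
  cidr_block              = cidrsubnet(aws_vpc.example.cidr_block, 8, count.index + 1)
  availability_zone       = local.azs[count.index]
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private" {
  count = length(local.azs)

  vpc_id = aws_vpc.example.id
  # n is 4, 5, 6 here, matching 10.0.4.0/24 through 10.0.6.0/24.
  cidr_block        = cidrsubnet(aws_vpc.example.cidr_block, 8, count.index + 4)
  availability_zone = local.azs[count.index]
}
```

With this layout, other resources would refer to the subnets as aws_subnet.public[*].id and aws_subnet.private[*].id.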

Note that this line makes sure the subnets are created in the correct VPC:

vpc_id            = aws_vpc.example.id

Here, "example" is the local name we gave our VPC resource earlier.

Next, I created an internet gateway:

# Internet Gateway
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.example.id
}

Next, I created a route table and configured outbound traffic to be directed to the internet gateway that was just created.

# Route Table for Public Subnets
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.example.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

# Route Table Associations for Public Subnets
resource "aws_route_table_association" "public_1" {
  subnet_id      = aws_subnet.public_1.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "public_2" {
  subnet_id      = aws_subnet.public_2.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "public_3" {
  subnet_id      = aws_subnet.public_3.id
  route_table_id = aws_route_table.public.id
}

The route table association resources associate the route table with the 3 public subnets.

So to summarize:

An internet gateway was created to connect the VPC to the internet.
The route table was created to direct all outbound traffic to the internet gateway.
The aws_route_table_association resources link the public subnets to the route table. This ensures that outbound traffic from instances within those subnets is directed to the internet gateway.
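The three association blocks follow the same pattern, so they could be collapsed with for_each. This is only a sketch of that option: the local value public_subnet_ids and the resource name public are names I made up, and it would replace the public_1, public_2, and public_3 associations rather than sit next to them.

```hcl
# Hypothetical map from a stable key to each public subnet ID.
locals {
  public_subnet_ids = {
    public_1 = aws_subnet.public_1.id
    public_2 = aws_subnet.public_2.id
    public_3 = aws_subnet.public_3.id
  }
}

# One association per map entry, all pointing at the public route table.
resource "aws_route_table_association" "public" {
  for_each = local.public_subnet_ids

  subnet_id      = each.value
  route_table_id = aws_route_table.public.id
}
```

The map keys are static strings, so Terraform can plan the associations even before the subnet IDs are known.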

Now, we have to create a security group:

# Security Group
resource "aws_security_group" "web" {
  vpc_id = aws_vpc.example.id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

The security group is given the local name "web" and attached to the "example" VPC.

The ingress rule allows incoming TCP traffic on port 80. The CIDR block is set to "0.0.0.0/0", so it allows incoming HTTP traffic from anywhere.

The egress rule allows all outbound traffic from the instances associated with this security group. This is a common default setting that permits instances to initiate connections to any destination.
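This project uses the single "web" group for both the load balancer and the instances. A stricter variant, sketched here under the assumption that only the ALB should be reachable from the internet, splits it in two. The names alb and instances are hypothetical and do not appear in the project.

```hcl
# Hypothetical group for the ALB: HTTP from anywhere.
resource "aws_security_group" "alb" {
  vpc_id = aws_vpc.example.id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Hypothetical group for the instances: HTTP only from the ALB's group.
resource "aws_security_group" "instances" {
  vpc_id = aws_vpc.example.id

  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

The load balancer would then use aws_security_group.alb.id and the launch configuration aws_security_group.instances.id.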

Next, we specify a user data script:

# EC2 User Data Script
data "template_file" "userdata" {
  template = <<-EOF
              #!/bin/bash
              yum update -y
              yum install -y httpd
              systemctl start httpd
              systemctl enable httpd
              echo "Hello World from $(hostname -f)" > /var/www/html/index.html
            EOF
}

The user data script is used to bootstrap the EC2 instance with necessary configurations and software installations when it first starts. In this case, it installs and configures an Apache web server and sets up a simple "Hello World" web page.
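The template_file data source comes from the separate hashicorp/template provider, which HashiCorp has deprecated. Since this script has no template variables, a plain local value should do the same job. This is a sketch of that alternative, with userdata as a hypothetical local name.

```hcl
# Hypothetical replacement for data "template_file" "userdata".
locals {
  userdata = <<-EOF
    #!/bin/bash
    yum update -y
    yum install -y httpd
    systemctl start httpd
    systemctl enable httpd
    echo "Hello World from $(hostname -f)" > /var/www/html/index.html
  EOF
}
```

The launch configuration would then set user_data = local.userdata. Note that $(hostname -f) is shell command substitution and is passed through untouched; Terraform only interpolates the ${...} form.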

# Launch Configuration
resource "aws_launch_configuration" "web" {
  name          = "web-launch-configuration"
  image_id      = "ami-0b72821e2f351e396" # Amazon Linux 2 AMI
  instance_type = "t2.micro"
  security_groups = [aws_security_group.web.id]

  user_data = data.template_file.userdata.rendered

  lifecycle {
    create_before_destroy = true
  }
}

This Terraform configuration defines an AWS launch configuration named "web-launch-configuration" for creating EC2 instances with specific settings. It specifies the use of the Amazon Linux 2 AMI (identified by the image_id "ami-0b72821e2f351e396") and sets the instance type to "t2.micro". The EC2 instances launched with this configuration will use the security group referenced by aws_security_group.web.id. Additionally, a user data script, defined in the template_file data source, will be executed upon instance launch to install and start a web server. The lifecycle block tells Terraform to create a replacement launch configuration before destroying the old one whenever the configuration changes, so the ASG always has a valid launch configuration to point to.
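One caveat with create_before_destroy: launch configurations cannot be edited in place, so any change makes Terraform create a new one while the old one still exists, and two launch configurations cannot share a name. A common way around this, sketched here as an assumption rather than what this project uses, is name_prefix instead of a fixed name.

```hcl
resource "aws_launch_configuration" "web" {
  # Terraform appends a unique suffix, so a replacement never collides
  # with the launch configuration it is about to replace.
  name_prefix     = "web-launch-configuration-"
  image_id        = "ami-0b72821e2f351e396"
  instance_type   = "t2.micro"
  security_groups = [aws_security_group.web.id]

  user_data = data.template_file.userdata.rendered

  lifecycle {
    create_before_destroy = true
  }
}
```

This matters later in the post, where the user data script is edited and the launch configuration has to be replaced.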

# Auto Scaling Group
resource "aws_autoscaling_group" "web" {
  vpc_zone_identifier = [aws_subnet.private_1.id, aws_subnet.private_2.id, aws_subnet.private_3.id]
  launch_configuration = aws_launch_configuration.web.id
  min_size             = 1
  max_size             = 3
  desired_capacity     = 1

  tag {
    key                 = "Name"
    value               = "web"
    propagate_at_launch = true
  }
}

This Auto Scaling Group specifies that EC2 instances should be launched in the three private subnets listed in vpc_zone_identifier. It maintains a minimum of 1 instance, scales up to a maximum of 3 instances based on scaling policies, and starts with a desired capacity of 1 instance. The instances are launched using the specified launch configuration.

# Application Load Balancer
resource "aws_lb" "web" {
  name               = "web-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.web.id]
  subnets            = [aws_subnet.public_1.id, aws_subnet.public_2.id, aws_subnet.public_3.id]
}

resource "aws_lb_target_group" "web" {
  name        = "web-tg"
  port        = 80
  protocol    = "HTTP"
  vpc_id      = aws_vpc.example.id
  target_type = "instance"
}

resource "aws_lb_listener" "web" {
  load_balancer_arn = aws_lb.web.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

This Terraform configuration sets up an Application Load Balancer (ALB) named "web-alb" that is publicly accessible (internal = false) and uses the specified security group and public subnets. It also creates a target group named "web-tg" to route HTTP traffic on port 80 to instances within the specified VPC, and an ALB listener that listens for HTTP traffic on port 80, forwarding it to the target group. This configuration ensures that incoming HTTP traffic is balanced across the EC2 instances registered in the target group.
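To reach the site once everything is deployed, you need the load balancer's DNS name. A small output block, which is my addition and not part of the original main.tf, would print it after each apply. The output name alb_dns_name is hypothetical.

```hcl
# Hypothetical output exposing the ALB's public DNS name.
output "alb_dns_name" {
  description = "Public DNS name of the web ALB"
  value       = aws_lb.web.dns_name
}
```

Running terraform output alb_dns_name then shows the hostname, and opening it over HTTP should return the "Hello World" page once an instance is registered and healthy.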

resource "aws_autoscaling_attachment" "asg_attachment" {
  autoscaling_group_name = aws_autoscaling_group.web.name
  lb_target_group_arn   = aws_lb_target_group.web.arn
}

The above resource attaches the ASG to the ALB's target group. This makes sure that the instances managed by the ASG are automatically registered with the ALB.
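The same link can also be declared inline on the ASG through its target_group_arns argument. This sketch shows that variant for comparison. It is an alternative to the aws_autoscaling_attachment resource, not an addition to it, because attaching the same target group both ways causes conflicts.

```hcl
resource "aws_autoscaling_group" "web" {
  vpc_zone_identifier  = [aws_subnet.private_1.id, aws_subnet.private_2.id, aws_subnet.private_3.id]
  launch_configuration = aws_launch_configuration.web.id
  min_size             = 1
  max_size             = 3
  desired_capacity     = 1

  # Inline alternative to aws_autoscaling_attachment: instances launched
  # by the ASG are registered with this target group.
  target_group_arns = [aws_lb_target_group.web.arn]

  tag {
    key                 = "Name"
    value               = "web"
    propagate_at_launch = true
  }
}
```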


Next are two CloudWatch alarms. They trigger when average CPU usage stays above 75% or below 20% for 2 consecutive 30-second evaluation periods, so for about a minute.


# CloudWatch Alarms
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name                = "high-cpu-utilization"
  comparison_operator       = "GreaterThanThreshold"
  evaluation_periods        = "2"
  metric_name               = "CPUUtilization"
  namespace                 = "AWS/EC2"
  period                    = "30"
  statistic                 = "Average"
  threshold                 = "75"
  alarm_actions             = [aws_autoscaling_policy.scale_out.arn]
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }
}

resource "aws_cloudwatch_metric_alarm" "low_cpu" {
  alarm_name                = "low-cpu-utilization"
  comparison_operator       = "LessThanThreshold"
  evaluation_periods        = "2"
  metric_name               = "CPUUtilization"
  namespace                 = "AWS/EC2"
  period                    = "30"
  statistic                 = "Average"
  threshold                 = "20"
  alarm_actions             = [aws_autoscaling_policy.scale_in.arn]
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }
}

In this last block:

dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }

we are telling the alarm to monitor the instances in this specific ASG.
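How long an alarm waits before firing is evaluation_periods multiplied by period, so 2 x 30 = 60 seconds in the blocks above. EC2 reports CPUUtilization at one-minute intervals at best (with detailed monitoring), so a 30-second period can leave some evaluation windows without data. The variant below is a sketch that assumes one-minute data points and lines the period up with them. The values are illustrative, not tuned.

```hcl
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name          = "high-cpu-utilization"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2 # consecutive periods that must breach the threshold
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 60 # seconds per period; 2 x 60 = 120 seconds in total
  statistic           = "Average"
  threshold           = 75
  alarm_actions       = [aws_autoscaling_policy.scale_out.arn]

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }
}
```

The trade-off is a slower reaction: the alarm now needs about two minutes of sustained load instead of one.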

Notice that alarm_actions points each alarm at a specific Auto Scaling policy, here:

alarm_actions             = [aws_autoscaling_policy.scale_in.arn]

and here:

alarm_actions             = [aws_autoscaling_policy.scale_out.arn]

These policies are created below, and each one runs when its associated CloudWatch alarm is triggered.

# Auto Scaling Policies
resource "aws_autoscaling_policy" "scale_out" {
  name                   = "scale_out"
  scaling_adjustment     = 1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 30
  autoscaling_group_name = aws_autoscaling_group.web.name
}

resource "aws_autoscaling_policy" "scale_in" {
  name                   = "scale_in"
  scaling_adjustment     = -1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 30
  autoscaling_group_name = aws_autoscaling_group.web.name
}
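The two alarms plus two simple scaling policies above can also be expressed as a single target tracking policy, where Auto Scaling creates and manages the CloudWatch alarms itself. This is a different technique from the one used in this project, shown only as a sketch. The policy name and the 50% target are assumptions.

```hcl
# Hypothetical target tracking policy: keep the ASG's average CPU near 50%.
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "cpu-target-tracking"
  policy_type            = "TargetTrackingScaling"
  autoscaling_group_name = aws_autoscaling_group.web.name

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 50.0
  }
}
```

The explicit alarms used in this post give finer control over thresholds and timing, which makes the scale-out and scale-in steps easier to observe during testing.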

Launching

To launch, we run:

terraform init
terraform plan
terraform apply

Now checking the VPC, we see that it has the public and private subnets with the route tables.

Navigating to EC2, we see that the ASG is correctly configured.

And an EC2 instance is live.

Testing

I edited the EC2 user data script to install "stress", so that once an instance launches, it tests the ASG automatically by driving up the CPU usage for a minute and then stopping.

# EC2 User Data Script
data "template_file" "userdata" {
  template = <<-EOF
              #!/bin/bash
              yum update -y
              yum install -y epel-release
              yum install -y stress
              yum install -y httpd
              systemctl start httpd
              systemctl enable httpd
              systemctl enable amazon-ssm-agent
              systemctl start amazon-ssm-agent
              echo "Hello World from $(hostname -f)" > /var/www/html/index.html
              # Run stress for 1 minute to simulate high CPU usage
              stress --cpu 1 --timeout 60
            EOF
}

Another way to do this is to connect to the EC2 instance directly and run stress by hand. Either way, we have to make sure the instances in the private subnets have outbound access to the internet, for example so that yum can download packages.

# Elastic IP for NAT Gateway
resource "aws_eip" "nat_eip" {
  domain = "vpc"
}

# NAT Gateway in Public Subnet
resource "aws_nat_gateway" "nat_gw" {
  allocation_id = aws_eip.nat_eip.id
  subnet_id     = aws_subnet.public_1.id
}

# Route Table for Private Subnets
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.example.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat_gw.id
  }
}

# Route Table Associations for Private Subnets
resource "aws_route_table_association" "private_1" {
  subnet_id      = aws_subnet.private_1.id
  route_table_id = aws_route_table.private.id
}

resource "aws_route_table_association" "private_2" {
  subnet_id      = aws_subnet.private_2.id
  route_table_id = aws_route_table.private.id
}

resource "aws_route_table_association" "private_3" {
  subnet_id      = aws_subnet.private_3.id
  route_table_id = aws_route_table.private.id
}

By adding a NAT Gateway and updating the route table for the private subnets, we enable instances in the private subnets to access the internet for outbound traffic while remaining protected from inbound internet traffic.

Now running terraform apply will update our resources.

Monitoring the CloudWatch alarms, we see that the CPU usage shoots up right away, triggering the "high-cpu-utilization" alarm because of the stress command in the user data script we assigned to the EC2 instances.

And here we see that a second EC2 instance is created by the ASG.

Once the stress command times out after 60 seconds, the CPU usage drops below 20% and triggers the "low-cpu-utilization" alarm.

And then the ASG terminates the us-east-1c EC2 instance, leaving only the instance in us-east-1a.

And that's it for this project! We were able to successfully use Terraform to create an entire AWS Auto-Scaling Web Server architecture and test it ourselves.

Here is the GitHub repo if you want to try it out for yourself.

Note: One thing I wasn't able to do yet was SSH into the EC2 instances to test them manually; the connection kept timing out. This is why I scripted the instances to run "stress" automatically on creation.
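The SSH timeouts are consistent with the configuration: the instances sit in private subnets without public IPs, and the security group only allows inbound traffic on port 80. One route I have not tested here is AWS Systems Manager Session Manager, which needs no inbound port. The user data already starts amazon-ssm-agent, so the likely missing piece is an instance profile. The sketch below assumes that, and the names web-ssm-role and web-ssm-profile are hypothetical.

```hcl
# Hypothetical IAM role that EC2 instances can assume.
resource "aws_iam_role" "ssm" {
  name = "web-ssm-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# AWS managed policy that lets the SSM agent register the instance.
resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.ssm.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# Instance profile wrapping the role so instances can use it.
resource "aws_iam_instance_profile" "ssm" {
  name = "web-ssm-profile"
  role = aws_iam_role.ssm.name
}
```

The launch configuration would then set iam_instance_profile = aws_iam_instance_profile.ssm.name, and the instances would still need the NAT gateway route to reach the Systems Manager endpoints.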