AWS Auto Scaling divides into two categories: Fleet Management and Dynamic Scaling.
Fleet management
Used for:
- Replacing unhealthy instances;
- Distributing instances among Availability Zones to maximize resilience. E.g.: you're running instances in us-east-1, so Auto Scaling can provision instances across the following AZs: us-east-1a, us-east-1b, us-east-1c, us-east-1d, and us-east-1e;
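As a rough mental model of the balancing idea, here is a minimal Python sketch that spreads a desired instance count across a list of Availability Zones as evenly as possible. The helper name `spread_across_azs` is made up for this example, and the real service also takes into account things like which zones your group is configured for, so treat it as an illustration only.

```python
def spread_across_azs(desired_capacity, zones):
    """Return {zone: instance_count}, as even as possible (hypothetical helper)."""
    base, extra = divmod(desired_capacity, len(zones))
    # The first `extra` zones each receive one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


zones = ["us-east-1a", "us-east-1b", "us-east-1c", "us-east-1d", "us-east-1e"]
print(spread_across_azs(7, zones))
```

With 7 instances and 5 zones, no zone ends up with more than one instance above any other, so losing a single zone takes out at most 2 of the 7.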
Dynamic scaling
Used for:
- Scaling based on a CloudWatch alarm or a metric type (more on that later): an action is taken when a threshold is breached, and different actions can be taken depending on how far the CloudWatch alarm threshold was breached.
Types of Dynamic scaling
- Simple scaling: scales based on a single CloudWatch alarm and applies the action you define;
- Step scaling: scales based on how far a CloudWatch alarm threshold is breached and applies the action you define for each level;
- Target tracking scaling: scales based on a metric type, but delegates the action to be taken to AWS;
Which one to use?
That's not the right question to ask. You'll be using Fleet Management out of the box, with the option of configuring Dynamic Scaling to take custom actions.
Let's go through some examples:
To get to the Auto Scaling configuration, go to the EC2 dashboard and find Auto Scaling Groups in the sidebar. Select an Auto Scaling group and, in the tabs below the group listing, find the one called Scaling Policies.
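Fleet management: An application running on an EC2 instance stops responding to health checks. Auto Scaling marks the instance as unhealthy, traffic stops being routed to it, and Auto Scaling spins up another instance to replace it. By default the unhealthy instance is terminated; keeping it around to be analyzed requires extra setup, such as a lifecycle hook.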
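The replace-on-failure behavior can be pictured with a tiny simulation. This is only a sketch in plain Python: the instance IDs, the `healthy` flag, and both helper functions are invented for illustration and do not call AWS.

```python
import itertools

_ids = itertools.count(1)


def new_instance():
    """Create a fake instance record with a made-up ID."""
    return {"id": f"i-{next(_ids):04d}", "healthy": True}


def replace_unhealthy(fleet):
    """Return a fleet of the same size with unhealthy instances swapped for fresh ones."""
    return [inst if inst["healthy"] else new_instance() for inst in fleet]


fleet = [new_instance() for _ in range(3)]
fleet[1]["healthy"] = False  # simulate a failed health check
fleet = replace_unhealthy(fleet)
print([inst["id"] for inst in fleet])
```

The fleet size stays at 3 throughout, which is the whole point of fleet management: capacity is held constant while bad instances are cycled out.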
Dynamic scaling - Simple scaling: You have a CloudWatch alarm that monitors the CPU utilization of your EC2 instances and fires whenever it stays above 80% for 300 seconds (5 min). Your simple scaling policy defines that the action to be taken is to spin up 1 more instance.
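If you prefer code over the console, the same setup can be sketched with boto3. The snippet below only builds the keyword arguments, so it runs without AWS credentials. The group name `my-asg`, the policy and alarm names, and the placeholder ARN are assumptions made up for this example.

```python
def simple_scaling_policy(group_name):
    """Keyword arguments for the Auto Scaling put_scaling_policy call."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "cpu-above-80-add-one",  # hypothetical name
        "PolicyType": "SimpleScaling",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": 1,  # spin up 1 more instance
    }


def cpu_alarm(group_name, policy_arn):
    """Keyword arguments for the CloudWatch put_metric_alarm call."""
    return {
        "AlarmName": "cpu-above-80",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Statistic": "Average",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group_name}],
        "Period": 300,  # seconds (5 min)
        "EvaluationPeriods": 1,
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [policy_arn],  # the alarm triggers the scaling policy
    }


policy = simple_scaling_policy("my-asg")
alarm = cpu_alarm("my-asg", "arn:placeholder")
# With credentials configured, these would be sent roughly as:
#   arn = boto3.client("autoscaling").put_scaling_policy(**policy)["PolicyARN"]
#   boto3.client("cloudwatch").put_metric_alarm(**cpu_alarm("my-asg", arn))
print(policy["PolicyType"], alarm["Threshold"])
```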
Dynamic scaling - Step scaling:
You have a CloudWatch alarm that monitors the CPU utilization of your EC2 instances and fires whenever it stays at or above 50% for 300 seconds (5 min). Your step scaling policy defines that the action to be taken is to:
- Spin up 1 more instance when CPU utilization is >= 50% and < 60%;
- Spin up 3 more instances when CPU utilization is >= 60% and < 70%;
- Spin up 5 more instances when CPU utilization is >= 70%;
Keep in mind that these adjustments add up, so if your CPU utilization climbs progressively past 70% you can end up with up to 9 additional EC2 instances (1 + 3 + 5);
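To make the step logic concrete, here is a minimal Python sketch that reads the steps as ranges with an inclusive lower bound (50 to 60, 60 to 70, 70 and up) and picks the adjustment for a CPU reading. The function names are made up for this example. `step_adjustments` shows how I'd expect the same steps to look in the `StepAdjustments` parameter of boto3's `put_scaling_policy`, where bounds are expressed relative to the alarm threshold.

```python
import math

# (lower bound inclusive, upper bound exclusive, instances to add)
STEPS = [(50, 60, 1), (60, 70, 3), (70, math.inf, 5)]


def instances_to_add(cpu_percent):
    """Pick the adjustment of the step that contains the CPU reading."""
    for lower, upper, adjustment in STEPS:
        if lower <= cpu_percent < upper:
            return adjustment
    return 0  # below the alarm threshold: no scale-out


def step_adjustments(threshold):
    """Express STEPS relative to the alarm threshold, StepAdjustments style."""
    result = []
    for lower, upper, adjustment in STEPS:
        step = {"MetricIntervalLowerBound": lower - threshold}
        if upper != math.inf:  # an open-ended step has no upper bound key
            step["MetricIntervalUpperBound"] = upper - threshold
        step["ScalingAdjustment"] = adjustment
        result.append(step)
    return result


readings = [55, 65, 75]  # CPU climbing progressively
added = [instances_to_add(cpu) for cpu in readings]
print(added, sum(added))
```

The total of 9 assumes each breach is handled as a separate scaling activity. If a later breach happens while earlier instances are still warming up, the total can be lower.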
Dynamic scaling - Target tracking scaling:
You want to keep the CPU utilization of your fleet at 50%, but let AWS decide how many instances should be launched or terminated to hold that target.
Good to know: AWS runs its own algorithms to decide how best to scale your EC2 instances out or in based on incoming demand.
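AWS does not spell out the exact algorithm, so the sketch below is NOT what AWS runs. It is a simple proportional rule I'm assuming purely for intuition: if the fleet is running hotter than the target, grow it in proportion, and shrink it when it runs cooler. The function name and the `min_size`/`max_size` defaults are hypothetical.

```python
import math


def desired_capacity(current_instances, current_metric, target_metric,
                     min_size=1, max_size=10):
    """Proportional estimate of fleet size (illustrative assumption only)."""
    estimate = math.ceil(current_instances * current_metric / target_metric)
    # Never go outside the group's configured size limits.
    return max(min_size, min(max_size, estimate))


print(desired_capacity(4, 75, 50))  # fleet running above the 50% target
print(desired_capacity(4, 25, 50))  # fleet running below the 50% target
```

With 4 instances at 75% CPU and a 50% target, the estimate grows the fleet to 6; at 25% CPU it shrinks to 2.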
Metric types:
- Application Load Balancer Request Count Per Target;
- Average CPU Utilization;
- Average Network In (Bytes);
- Average Network Out (Bytes);
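In the API, these metric types correspond to predefined metric identifiers. The sketch below maps the console labels to those identifiers and builds the keyword arguments for a target tracking policy as I'd expect to pass them to boto3's `put_scaling_policy`. The group name `my-asg` and the policy naming scheme are assumptions for this example, and nothing is sent to AWS.

```python
PREDEFINED_METRICS = {
    "Application Load Balancer Request Count Per Target": "ALBRequestCountPerTarget",
    "Average CPU Utilization": "ASGAverageCPUUtilization",
    "Average Network In (Bytes)": "ASGAverageNetworkIn",
    "Average Network Out (Bytes)": "ASGAverageNetworkOut",
}


def target_tracking_policy(group_name, metric_label, target_value):
    """Keyword arguments for a TargetTrackingScaling policy."""
    metric_type = PREDEFINED_METRICS[metric_label]
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"track-{metric_type}",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            # Note: ALBRequestCountPerTarget also needs a ResourceLabel that
            # identifies the load balancer and target group (omitted here).
            "PredefinedMetricSpecification": {"PredefinedMetricType": metric_type},
            "TargetValue": float(target_value),
        },
    }


policy = target_tracking_policy("my-asg", "Average CPU Utilization", 50)
print(policy["TargetTrackingConfiguration"])
```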
Let me know if it was all clear in the comments below or with a reaction.
References:
- https://aws.amazon.com/solutions/server-fleet-management-at-scale/
- https://aws.amazon.com/blogs/compute/fleet-management-made-easy-with-auto-scaling
- https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scale-based-on-demand.html
- https://www.youtube.com/watch?v=WUUbOQyrnJU
- https://www.youtube.com/watch?v=srofVz6xvkE