How to minimize ecs autoscaling reaction time from terraform?
When you create an ECS autoscaling policy, two alarms tag along with it: one for scaling up ("out"), one for scaling down ("in").
The scale-out ones I see created appear to sample CPU utilization (or the metric of interest) every minute, and only trigger automatic scaling when three consecutive data points have breached the threshold.
This means that if I see a traffic spike, three minutes will pass before scale-out happens. (In fact, on average the threshold breach will happen in the middle of a sampling interval, so the delay is three and a half minutes.)
I can adjust the sampling rate and the number of data points required through the AWS console web interface.
However, I would like to manage my infrastructure through Terraform.
How can I use Terraform, with no manual clicking in the console, to shorten the time between (a) the first breach of the threshold and (b) the point at which scale-out actually begins? (Also: is this a dumb thing to attempt? Am I going about it backwards?)
As far as I can tell, it looks like ice skating uphill: creating autoscaling policies (which I can do through Terraform) automatically creates two alarms and returns handles to them (see https://docs.aws.amazon.com/autoscaling/application/APIReference/API_PutScalingPolicy.html) but Terraform doesn't expose those handles (see https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/appautoscaling_policy#attributes-reference). Is it still possible in Terraform? Does it require heroic efforts?
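For concreteness, here is roughly the shape of what I am hoping to write: a step-scaling policy plus a CloudWatch alarm that I define myself, with the alarm tuned to fire after a single one-minute datapoint. This is an untested sketch; the cluster name, service name, and thresholds are placeholders, not something I have working.
resource "aws_appautoscaling_target" "ecs" {
  min_capacity       = 1
  max_capacity       = 10
  resource_id        = "service/my-cluster/my-service" // placeholder cluster/service
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "scale_out_fast" {
  name               = "scale-out-fast"
  policy_type        = "StepScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = 60
    metric_aggregation_type = "Average"

    step_adjustment {
      metric_interval_lower_bound = 0
      scaling_adjustment          = 1
    }
  }
}

// An alarm I own, so I control the period and the number of datapoints.
resource "aws_cloudwatch_metric_alarm" "ecs_cpu_high_fast" {
  alarm_name          = "ecs-cpu-high-fast"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = "1" // fire after a single breached datapoint
  period              = "60"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/ECS"
  statistic           = "Average"
  threshold           = "70"
  alarm_actions       = [aws_appautoscaling_policy.scale_out_fast.arn]

  dimensions = {
    ClusterName = "my-cluster" // placeholder
    ServiceName = "my-service" // placeholder
  }
}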
Solution 1:
You can definitely achieve this with Terraform. There are a few ways to do it, but I will focus on the one that gives you the most flexibility.
Suppose you already have your aws_autoscaling_group resource defined. After that, you need to define scaling policies for your ASG and the CloudWatch alarms that will trigger them.
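For context, a minimal version of that ASG could look something like the sketch below; the launch template and subnet variable are placeholders of mine, not part of the original setup.
resource "aws_autoscaling_group" "my-ecs-asg" {
  name                = "my-ecs-asg"
  min_size            = 1
  max_size            = 4
  desired_capacity    = 1
  vpc_zone_identifier = var.private_subnet_ids // assumed variable

  launch_template {
    id      = aws_launch_template.ecs-node.id // assumed resource
    version = "$Latest"
  }
}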
I usually track 3 different metrics for autoscaling: MemoryReservation, CPUReservation and CPUUtilization.
Here is an example of how to set up autoscaling based on CPUUtilization.
Scaling policies for our ASG:
resource "aws_autoscaling_policy" "my-cpu-scale-up" {
name = "my-cpu-scale-up"
scaling_adjustment = 1
adjustment_type = "ChangeInCapacity"
cooldown = 60
autoscaling_group_name = aws_autoscaling_group.[your-asg-resource].name
}
resource "aws_autoscaling_policy" "my-cpu-scale-down" {
name = "my-cpu-scale-down"
scaling_adjustment = -1
adjustment_type = "ChangeInCapacity"
cooldown = 300
autoscaling_group_name = aws_autoscaling_group.[your-asg-resource].name
}
CloudWatch alarms that will trigger one of these policies:
resource "aws_cloudwatch_metric_alarm" "my-cpu-usage-high" {
alarm_name = "my-cpu-usage-high"
comparison_operator = "GreaterThanOrEqualToThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = "60" // in seconds
statistic = "Average"
threshold = "70" // in %
alarm_description = "This metric monitors the cluster for high CPU usage"
alarm_actions = [
aws_autoscaling_policy.my-cpu-scale-up.arn
]
dimensions ={
AutoScalingGroupName= aws_autoscaling_group.[your-asg-resource].name
}
}
resource "aws_cloudwatch_metric_alarm" "my-cpu-usage-low" {
alarm_name = "my-cpu-usage-low"
comparison_operator = "LessThanOrEqualToThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = "60"
statistic = "Average"
threshold = "20"
alarm_description = "This metric monitors my cluster for low CPU usage"
alarm_actions = [
aws_autoscaling_policy.my-cpu-scale-down.arn
]
dimensions ={
AutoScalingGroupName= aws_autoscaling_group.[your-asg-resource].name
}
}
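If you also want to scale on MemoryReservation (one of the metrics mentioned above), a similar alarm against the ECS cluster could look like the sketch below; treat the cluster reference and threshold as placeholders rather than part of the configuration above.
resource "aws_cloudwatch_metric_alarm" "my-memory-reservation-high" {
  alarm_name          = "my-memory-reservation-high"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = "2"
  metric_name         = "MemoryReservation"
  namespace           = "AWS/ECS"
  period              = "60"
  statistic           = "Average"
  threshold           = "70"
  alarm_description   = "This metric monitors the ECS cluster for high memory reservation"
  alarm_actions = [
    aws_autoscaling_policy.my-cpu-scale-up.arn
  ]

  dimensions = {
    ClusterName = aws_ecs_cluster.[your-cluster-resource].name
  }
}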
As you can see from this example, we can play around with the alarm configuration (period, evaluation_periods, thresholds) until we achieve the desired result.
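To specifically shorten the reaction time the question asks about, you can drop evaluation_periods (and optionally set datapoints_to_alarm) on the scale-out alarm so it fires after a single one-minute breach. The variant below is a sketch of that tweak, not part of the configuration above:
resource "aws_cloudwatch_metric_alarm" "my-cpu-usage-high-fast" {
  alarm_name          = "my-cpu-usage-high-fast"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = "1" // react after one breached datapoint
  datapoints_to_alarm = "1" // optional; explicit "1 out of 1"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = "60"
  statistic           = "Average"
  threshold           = "70"
  alarm_description   = "Scale out after a single minute of high CPU"
  alarm_actions = [
    aws_autoscaling_policy.my-cpu-scale-up.arn
  ]

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.[your-asg-resource].name
  }
}
With one-minute samples and a single datapoint, scale-out starts roughly one to one and a half minutes after the breach instead of three and a half.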
Hope that helps!