Auto Scaling Groups

Date posted: 15-Jul-2015
Auto Scaling Groups, Advanced AWS meetup. Peter Sankauskas (@pas256), Founder of CloudNative, https://cloudnative.io/


Daily life

• More users

• Higher costs

• More logs

• More data

• New engineers

• More instances

• Increased deployment frequency

• Boss: reduce costs, eliminate deployment risks


Your Goal

• Sleep

• Reliable

• Social life

• Uptime

• Time with family


“Don’t hate the Pager, hate the game”

– PagerDuty

Old world



[Chart: used capacity; 70% wasted]

Auto Scaling Group

• Your assistant in the cloud

• First level support

• Automation







[Chart: used capacity]

Auto Scaling Group

• Capacity: minimum, maximum, desired

• Access: ELB

• Policies

• Where:

• Availability Zones

• VPC Subnets

[Diagram: an ASG with its Launch Config, Scaling Policies, and Scheduled Actions]

{
  "Type" : "AWS::AutoScaling::AutoScalingGroup",
  "Properties" : {
    "AvailabilityZones": [ String, ... ],
    "Cooldown": String,
    "DesiredCapacity": String,
    "HealthCheckGracePeriod": Integer,
    "HealthCheckType": String,
    "LaunchConfigurationName": String,
    "LoadBalancerNames": [ String, ... ],
    "MaxSize": String,
    "MetricsCollection": [ MetricsCollection, ... ],
    "MinSize": String,
    "NotificationConfiguration": NotificationConfiguration,
    "PlacementGroup": String,
    "Tags": [ Auto Scaling Tag, ... ],
    "TerminationPolicies": [ String, ... ],
    "VPCZoneIdentifier": [ String, ... ]
  }
}

Launch Configuration

• Every ASG needs a Launch Configuration

• Describes what an individual EC2 instance looks like


• Instance type

• Security groups

{
  "Type" : "AWS::AutoScaling::LaunchConfiguration",
  "Properties" : {
    "AssociatePublicIpAddress": Boolean,
    "BlockDeviceMappings": [ BlockDeviceMapping, ... ],
    "EbsOptimized": Boolean,
    "IamInstanceProfile": String,
    "ImageId": String,
    "InstanceMonitoring": Boolean,
    "InstanceType": String,
    "KernelId": String,
    "KeyName": String,
    "RamDiskId": String,
    "SecurityGroups": [ SecurityGroup, ... ],
    "SpotPrice": String,
    "UserData": String
  }
}

Scaling Plans

1. Fixed

2. Manual

3. Scheduled

4. Dynamic

Fixed

• Ensure a fixed number of instances is always running

• Set MinSize = MaxSize

• Examples

• Any “master” service

• Zookeeper - 3 nodes across 3 AZs

• Cassandra




[Chart: used capacity]

# One Asgard instance - troposphere example
from troposphere import FindInMap, Ref
import troposphere.autoscaling as asg

# t (the Template), asgardInstanceProfile and asgardInstanceSecurityGroup
# are defined elsewhere in the same template script.
asgardLaunchConfig = t.add_resource(asg.LaunchConfiguration("launchConf",
    AssociatePublicIpAddress=True,
    IamInstanceProfile=Ref(asgardInstanceProfile),
    ImageId=FindInMap("AWSRegion2AMI", Ref("AWS::Region"), "AMI"),
    InstanceType="m3.medium",
    KeyName="admin",
    SecurityGroups=[Ref(asgardInstanceSecurityGroup)],
))

# MinSize == MaxSize keeps exactly one instance running at all times
asgardASG = t.add_resource(asg.AutoScalingGroup("asgardASG",
    Tags=[asg.Tag("Name", "Asgard", True)],
    Cooldown="120",
    MinSize="1",
    MaxSize="1",
    AvailabilityZones=["us-west-2a","us-west-2b"],
    VPCZoneIdentifier=["subnet-c46c6982","subnet-8133f6e4"],
    LaunchConfigurationName=Ref(asgardLaunchConfig),
))

Manual Scaling

• Use API to change capacity on demand


• AutoScalingGroupName = my-asg

• DesiredCapacity = 20
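The two parameters above are the inputs to the SetDesiredCapacity API call. A minimal sketch, assuming the call is made through the boto3 SDK; the helper name `manual_scale_request` is hypothetical, and it only builds the request so it can be inspected without AWS credentials.

```python
def manual_scale_request(group_name, desired_capacity, honor_cooldown=False):
    """Build the keyword arguments for a SetDesiredCapacity call."""
    if desired_capacity < 0:
        raise ValueError("desired capacity cannot be negative")
    return {
        "AutoScalingGroupName": group_name,
        "DesiredCapacity": desired_capacity,
        "HonorCooldown": honor_cooldown,
    }


request = manual_scale_request("my-asg", 20)

# With credentials configured, the request could be sent like this
# (assumes boto3 is installed):
#   import boto3
#   boto3.client("autoscaling").set_desired_capacity(**request)
```

Keeping the request as plain data makes it easy to log or review before anything changes in the account.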



[Chart: used capacity]

Scheduled Scaling

• At this time, set capacity to X

• Each ScheduledAction must have a unique start time

• Guaranteed order of execution within same ASG







[Chart: used capacity]

Specific date and time

PutScheduledUpdateGroupAction

• ScheduledActionName = ScaleOut

• AutoScalingGroupName = my-asg

• DesiredCapacity = 3

• StartTime = “2013-05-12T08:00:00Z”

Recurring schedule

PutScheduledUpdateGroupAction

• ScheduledActionName = Scaleout-schedule-year

• AutoScalingGroupName = my-asg

• DesiredCapacity = 3

• Recurrence = “30 0 1 1,6,12 0”
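Both PutScheduledUpdateGroupAction variants above can be sketched as request builders. This is an illustration that assumes boto3 is the client; the helper names `one_time_action` and `recurring_action` are hypothetical, and the five-field check is only a basic sanity test, not a full cron parser.

```python
from datetime import datetime, timezone


def one_time_action(name, group_name, desired_capacity, start_time):
    """Request for a scheduled action that fires once at start_time."""
    return {
        "ScheduledActionName": name,
        "AutoScalingGroupName": group_name,
        "DesiredCapacity": desired_capacity,
        "StartTime": start_time,
    }


def recurring_action(name, group_name, desired_capacity, recurrence):
    """Request for a scheduled action that repeats on a cron-style schedule."""
    if len(recurrence.split()) != 5:
        raise ValueError("recurrence must have five cron fields")
    return {
        "ScheduledActionName": name,
        "AutoScalingGroupName": group_name,
        "DesiredCapacity": desired_capacity,
        "Recurrence": recurrence,
    }


scale_out = one_time_action(
    "ScaleOut", "my-asg", 3,
    datetime(2013, 5, 12, 8, 0, tzinfo=timezone.utc),
)
yearly = recurring_action(
    "Scaleout-schedule-year", "my-asg", 3, "30 0 1 1,6,12 0"
)

# Sending either request (assumes boto3 and credentials):
#   import boto3
#   boto3.client("autoscaling").put_scheduled_update_group_action(**scale_out)
```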

Dynamic Scaling

• Best Utilization

• Lowest Cost







[Chart: used capacity]

Trigger: CloudWatch Alarm

• Metrics

• CPU Utilization

• Network in/out

• Size of queue (SQS)

• Anything you put into CloudWatch

• Set the Alarm Action to the ARN of the ScalingPolicy
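Wiring an alarm to a policy comes down to putting the policy's ARN into the alarm's actions. A minimal sketch, assuming boto3 and a CPU metric; the helper name, threshold, period, and the ARN below are placeholders chosen for illustration.

```python
def cpu_alarm_request(alarm_name, group_name, policy_arn, threshold=70.0):
    """Build a PutMetricAlarm request that triggers a scaling policy."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group_name}],
        "Statistic": "Average",
        "Period": 300,             # seconds per datapoint (assumed value)
        "EvaluationPeriods": 2,    # assumed value
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # The alarm action is the ARN of the ScalingPolicy
        "AlarmActions": [policy_arn],
    }


# Placeholder ARN; in practice it comes from the "PolicyARN" field of the
# put_scaling_policy response.
EXAMPLE_POLICY_ARN = "arn:aws:autoscaling:us-west-2:123456789012:scalingPolicy:example"

alarm = cpu_alarm_request("my-asg-high-cpu", "my-asg", EXAMPLE_POLICY_ARN)

# Sending it (assumes boto3 and credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```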

Action: ScalingPolicy

• Adjustment Types

• Change by number

• E.g. Scale Out: Add 2 more instances

• E.g. Scale In: Remove 1 instance

• Exact

• E.g. Scale Out: Have exactly 8 instances

• Percentage

• E.g. Scale Out: Add 25% more instances

Cooldown

• After a ScalingPolicy has fired, wait X seconds before performing any other actions

• Manual Scaling: SetDesiredCapacity

• HonorCooldown = True/False
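The adjustment types and the cooldown can be sketched as plain arithmetic. This is an illustration, not the service's implementation: the type names match the AdjustmentType values of the API, while the rounding of fractional percentage changes and the clamping to min/max are assumptions based on the documented behaviour.

```python
import math


def apply_adjustment(current, adjustment_type, adjustment, min_size, max_size):
    """Return the new desired capacity after a scaling policy fires."""
    if adjustment_type == "ChangeInCapacity":      # change by number
        target = current + adjustment
    elif adjustment_type == "ExactCapacity":       # exact
        target = adjustment
    elif adjustment_type == "PercentChangeInCapacity":
        change = current * adjustment / 100.0
        if 0 < change < 1:
            change = 1                 # small positive changes become +1
        elif -1 < change < 0:
            change = -1                # small negative changes become -1
        else:
            change = math.trunc(change)  # otherwise round toward zero
        target = current + change
    else:
        raise ValueError("unknown adjustment type: " + adjustment_type)
    # Never leave the group's configured bounds
    return max(min_size, min(max_size, target))


def in_cooldown(seconds_since_last_action, cooldown_seconds):
    """True while further scaling actions should still wait."""
    return seconds_since_last_action < cooldown_seconds
```

For example, a 25% scale-out on 8 instances adds 2, and a 25% scale-out on 10 instances also adds 2 because the fractional 2.5 is rounded down.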

Load Balancing

• Put an ELB in front of the instances in your ASG

• Set when creating the ASG

• Zero effort in adding and removing instances

• Additional health check options

Health Checks

• By default, ASG uses EC2 Status Checks

• If you have an ELB, you can use the same ELB health checks

• HTTP:80/healthcheck

• Only an HTTP 200 response is considered healthy

• E.g. return something other than 200 while the app is still loading
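A health check endpoint of this kind can be sketched with the Python standard library. The `/healthcheck` path comes from the slide; the 503 status, the `APP_READY` flag, and the function names are assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Flipped to True by the application once start-up work has finished
APP_READY = {"value": False}


def health_response(ready):
    """Return (status code, body) for the health check endpoint."""
    return (200, b"OK") if ready else (503, b"loading")


class HealthCheckHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthcheck":
            self.send_error(404)
            return
        status, body = health_response(APP_READY["value"])
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=80):
    """Blocks forever; the ELB would poll HTTP:<port>/healthcheck."""
    HTTPServer(("", port), HealthCheckHandler).serve_forever()
```

Until the flag is set, the ELB sees a non-200 response and keeps the instance out of rotation.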

Termination Policy

• OldestInstance

• NewestInstance

• OldestLaunchConfiguration

• ClosestToNextInstanceHour
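The first two policies can be pictured as a choice by launch time. This is a deliberately simplified sketch: the real selection also takes Availability Zone balance into account, the other two policies are not covered, and the instance IDs and dates are made up.

```python
from datetime import datetime


def pick_instance_to_terminate(instances, policy):
    """instances: list of dicts with 'InstanceId' and 'LaunchTime'."""
    if not instances:
        raise ValueError("no instances to choose from")
    if policy == "OldestInstance":
        return min(instances, key=lambda i: i["LaunchTime"])["InstanceId"]
    if policy == "NewestInstance":
        return max(instances, key=lambda i: i["LaunchTime"])["InstanceId"]
    raise ValueError("policy not covered by this sketch: " + policy)


# Hypothetical fleet used only for illustration
fleet = [
    {"InstanceId": "i-aaa", "LaunchTime": datetime(2015, 7, 1)},
    {"InstanceId": "i-bbb", "LaunchTime": datetime(2015, 7, 10)},
]
```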

Requirements for Dynamic Scaling

• Stateless application

• Configuration must be 100% automated

• Tools understand dynamic environments

• Config management

• Monitoring

• Log aggregation

• Create an ASG or LaunchConfiguration from an already running instance

• Put that instance in the ASG

{
  "Type" : "AWS::AutoScaling::AutoScalingGroup",
  "Properties" : {
    "AvailabilityZones" : [ String, ... ],
    "Cooldown" : String,
    "DesiredCapacity" : String,
    "HealthCheckGracePeriod" : Integer,
    "HealthCheckType" : String,
    "InstanceId" : String,
    "LaunchConfigurationName" : String,
    "LoadBalancerNames" : [ String, ... ],
    "MaxSize" : String,
    "MetricsCollection" : [ MetricsCollection, ... ],
    "MinSize" : String,
    "NotificationConfiguration" : NotificationConfiguration,
    "PlacementGroup" : String,
    "Tags" : [ Auto Scaling Tag, ... ],
    "TerminationPolicies" : [ String, ... ],
    "VPCZoneIdentifier" : [ String, ... ]
  }
}

{
  "Type" : "AWS::AutoScaling::LaunchConfiguration",
  "Properties" : {
    "AssociatePublicIpAddress" : Boolean,
    "BlockDeviceMappings" : [ BlockDeviceMapping, ... ],
    "EbsOptimized" : Boolean,
    "IamInstanceProfile" : String,
    "ImageId" : String,
    "InstanceId" : String,
    "InstanceMonitoring" : Boolean,
    "InstanceType" : String,
    "KernelId" : String,
    "KeyName" : String,
    "RamDiskId" : String,
    "SecurityGroups" : [ SecurityGroup, ... ],
    "SpotPrice" : String,
    "UserData" : String
  }
}
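The same `InstanceId` idea is available through the API. A minimal sketch, assuming boto3; the helper names, the group name, and the instance ID are hypothetical placeholders.

```python
def asg_from_instance_request(group_name, instance_id, min_size, max_size):
    """CreateAutoScalingGroup request that copies settings from an instance."""
    return {
        "AutoScalingGroupName": group_name,
        "InstanceId": instance_id,
        "MinSize": min_size,
        "MaxSize": max_size,
    }


def attach_request(group_name, instance_id):
    """AttachInstances request that puts the running instance into the ASG."""
    return {
        "AutoScalingGroupName": group_name,
        "InstanceIds": [instance_id],
    }


create = asg_from_instance_request("my-asg", "i-0123456789abcdef0", 1, 4)
attach = attach_request("my-asg", "i-0123456789abcdef0")

# Sending both (assumes boto3 and credentials):
#   import boto3
#   client = boto3.client("autoscaling")
#   client.create_auto_scaling_group(**create)
#   client.attach_instances(**attach)
```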

# Instance Configuration - Self healing NAT - troposphere
natLaunchConfig = t.add_resource(asg.LaunchConfiguration(
    "natLaunchConfig",
    AssociatePublicIpAddress=True,
    InstanceType="t1.micro",
    ImageId="ami-f032acc0",
    SecurityGroups=[Ref(natSecurityGroup)],
    IamInstanceProfile=Ref(natInstanceProfile),
    UserData=Base64(Join("\n", [
        "#!/bin/bash",
        "yum update -y",
        "instanceId=`/opt/aws/bin/ec2-metadata -i | cut -f2 -d' '`",
        "region=`/opt/aws/bin/ec2-metadata -z | cut -f2 -d' ' | sed '$s/.$//'`",
        "vpcId=`aws ec2 describe-instances --instance-ids $instanceId --region $region --query 'Reservations[*].Instances[*].VpcId' --output text`",
        """rtbId=`aws ec2 describe-route-tables --region $region --filters "[{\\"Name\\":\\"vpc-id\\",\\"Values\\":[\\"$vpcId\\"]},{\\"Name\\":\\"association.main\\",\\"Values\\":[\\"true\\"]}]" --query RouteTables[*].RouteTableId --output text`""",
        """aws ec2 modify-instance-attribute --instance-id $instanceId --source-dest-check '{"Value": false}' --region $region --output table""",
        "aws ec2 replace-route --route-table-id $rtbId --destination-cidr-block --instance-id $instanceId --region $region --output table",
        "aws ec2 create-route --route-table-id $rtbId --destination-cidr-block --instance-id $instanceId --region $region --output table"
    ]))
))

UserData and cloud-init

• Inside LaunchConfiguration

• Set UserData script to be run by cloud-init

• If you are using Chef, this is what you will do

• More details:

• Watch Episode #4 on Answers for AWS

Baking AMIs

• Raw: Do everything on boot

• Fully Baked: Immutable infrastructure

• Half-Baked: Anything in-between



Deploy Changes

• Option 1: Change AMI or User Data in LaunchConfiguration

• NOTE: This has no immediate outcome

• Only affects newly launched instances

• Revisit Termination Policy

• You need to terminate existing instances so that new ones come up with the changes
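Finding the instances that still run the old configuration can be sketched as a filter over the group's instance list. The helper name and the sample data are hypothetical; the dictionary keys follow the shape of the `Instances` entries returned by boto3's `describe_auto_scaling_groups`.

```python
def stale_instances(instances, current_launch_config):
    """IDs of instances that were not launched from the current config."""
    return [
        inst["InstanceId"]
        for inst in instances
        if inst.get("LaunchConfigurationName") != current_launch_config
    ]


# Hypothetical group contents after switching the ASG to "app-lc-v2"
group_instances = [
    {"InstanceId": "i-old1", "LaunchConfigurationName": "app-lc-v1"},
    {"InstanceId": "i-new1", "LaunchConfigurationName": "app-lc-v2"},
    {"InstanceId": "i-old2", "LaunchConfigurationName": "app-lc-v1"},
]

to_replace = stale_instances(group_instances, "app-lc-v2")

# Replacing them one at a time (assumes boto3 and credentials):
#   import boto3
#   client = boto3.client("autoscaling")
#   for instance_id in to_replace:
#       client.terminate_instance_in_auto_scaling_group(
#           InstanceId=instance_id, ShouldDecrementDesiredCapacity=False)
```

Leaving the desired capacity unchanged means the ASG launches a replacement from the new LaunchConfiguration for every instance that is terminated.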

Deploy Changes

• Option 2: Create a completely new stack

• Use CloudFormation (or whatever) to create a new ASG, LaunchConfig, ScalingPolicies, ELB, Security Group, VPC, Subnets, etc

• Overkill

• If you have high traffic, the new ELB will not be pre-scaled and will not handle the load

• You need to contact your AWS TAM to have it pre-scaled

Blue/Green Deployment

Or is it a red/black deployment… or is it an A/B deployment?

• Option 3:

• Reuse existing infrastructure including the same ELB

• Create a new ASG and LaunchConfig

• Switch traffic at the ELB from old ASG to new ASG
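The traffic switch can be sketched as an ordered plan of two API calls. This assumes a Classic ELB and the boto3 Auto Scaling client; the helper names and the ELB and ASG names are hypothetical.

```python
def switch_traffic_plan(elb_name, from_asg, to_asg):
    """Ordered (operation, kwargs) steps that move the ELB between ASGs."""
    return [
        # Attach first so the ELB always has instances behind it
        ("attach_load_balancers",
         {"AutoScalingGroupName": to_asg, "LoadBalancerNames": [elb_name]}),
        ("detach_load_balancers",
         {"AutoScalingGroupName": from_asg, "LoadBalancerNames": [elb_name]}),
    ]


def run_plan(client, plan):
    """Execute each step against a client such as boto3.client('autoscaling')."""
    for operation, kwargs in plan:
        getattr(client, operation)(**kwargs)


deploy = switch_traffic_plan("my-elb", from_asg="app-v1", to_asg="app-v2")
rollback = switch_traffic_plan("my-elb", from_asg="app-v2", to_asg="app-v1")
```

A rollback is the same plan with the two groups swapped, which only works as long as the old ASG is kept running until the new version has proven itself.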




“It’s not about how fast you can deploy, it is about how fast you can rollback”

– Peter Sankauskas… just now

Canary Deployment

• Very similar to blue/green deployment

• New ASG and LaunchConfig

• Add traffic to only 1 instance in the new ASG

• Then 2 instances

• Up to 100%

• Both versions running side by side

• Roll off traffic from old ASG instances
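The ramp-up can be sketched as a list of sizes for the new ASG. The slides only say 1 instance, then 2, then up to 100%; doubling between steps and an even spread of traffic across instances are assumptions made for this illustration.

```python
def canary_steps(target_size):
    """Sizes for the new ASG: 1, 2, 4, ... ending at target_size."""
    if target_size < 1:
        raise ValueError("target size must be at least 1")
    steps = []
    size = 1
    while size < target_size:
        steps.append(size)
        size *= 2
    steps.append(target_size)
    return steps


def traffic_share(new_instances, old_instances):
    """Fraction of requests reaching the new version, assuming the ELB
    spreads traffic evenly across all instances."""
    return new_instances / (new_instances + old_instances)
```

With 7 old instances still in service, the first canary instance would see roughly one eighth of the traffic.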

Running multiple versions

• DB Schema changes are on a different schedule to code


• mcfunley (Etsy): “We deploy schema changes once per week. The code always works against both versions of the schema. We never take downtime for schema changes. We avoid data loss by doing soft deletes as much as we can.”

• Deploy features dark

• Use Feature Flags
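A feature flag can be as small as a lookup that guards the new code path. This is a minimal sketch; the flag name, the in-memory dictionary, and the `checkout` function are hypothetical, and a real system would load flags from configuration.

```python
# Flags default to off, so new code ships "dark" until it is switched on
FLAGS = {"new_checkout": False}


def is_enabled(flag, flags=FLAGS):
    """Unknown flags count as disabled."""
    return flags.get(flag, False)


def checkout(cart_total, flags=FLAGS):
    if is_enabled("new_checkout", flags):
        return "new checkout: %.2f" % cart_total
    return "old checkout: %.2f" % cart_total
```

Both code paths are deployed together, so turning a feature off again is a flag change and not a rollback of the release.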

Tools

• Baking AMIs

• Packer - Hashicorp

• Aminator - Netflix

• CloudNative

• Deployment

• Asgard - Netflix

• CloudNative

New World

• Automation expert

• Stateless, independently scalable apps

• Allergic to manual labor

• Embrace your laziness

• Auto Scaling Groups provide:

• Zero-effort scaling

• Fault-tolerance

• Increased reliability & uptime

• Decreased cost

