Add WaitCondition for the Elastigroup custom resource
lmineiro opened this issue
Native Auto Scaling Groups use a combination of CreationPolicy and the cfn-signal helper script to notify the CloudFormation stack once the required number of instances in the group has been provisioned successfully.
Elastigroups are a custom resource, so CreationPolicy does not apply to them (and the Senza::Elastigroup component does not currently expose such a parameter anyway). For a custom resource such as an Elastigroup, Senza should use a WaitCondition together with a WaitConditionHandle, which provides the same functionality as native Auto Scaling Groups have.
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Test cfn-signal for Elastigroup",
  "Resources": {
    "ExampleElastigroup": {
      "Properties": {
        "ServiceToken": "arn:aws:lambda:eu-central-1:178579023202:function:spotinst-cloudformation",
        "accessToken": "XXXXXX",
        "accountId": "act-XXXXXX",
        "group": {
          "capacity": {
            "maximum": 1,
            "minimum": 1,
            "target": 1
          },
          "compute": {
            "instanceTypes": {
              "ondemand": "t3.medium",
              "spot": [
                "t3.medium"
              ]
            },
            "launchSpecification": {
              "imageId": "ami-XXXXXXX",
              "monitoring": false,
              "ebsOptimized": false,
              "keyPair": "XXXXXX",
              "securityGroupIds": [
                "sg-XXXXXX"
              ],
              "tags": [
                {
                  "tagKey": "myTag",
                  "tagValue": "myKey"
                }
              ],
              "userData": {
                "Fn::Base64": {
                  "Fn::Join": [
                    "",
                    [
                      "#!/bin/bash -xe\n",
                      "yum install -y aws-cfn-bootstrap\n",
                      "/opt/aws/bin/cfn-signal '",
                      {
                        "Ref": "WaitHandle"
                      },
                      "'\n"
                    ]
                  ]
                }
              }
            },
            "product": "Linux/UNIX",
            "availabilityZones": [
              {
                "name": "eu-central-1b",
                "subnetIds": [
                  "subnet-XXXXXXX"
                ]
              }
            ]
          },
          "name": "Example-Elastigroup",
          "strategy": {
            "risk": 100,
            "availabilityVsCost": "balanced",
            "drainingTimeout": 120,
            "fallbackToOd": true,
            "lifetimePeriod": "days",
            "persistence": {},
            "revertToSpot": {
              "performAt": "always"
            }
          }
        }
      },
      "Type": "Custom::elastigroup"
    },
    "WaitHandle": {
      "Type": "AWS::CloudFormation::WaitConditionHandle"
    },
    "WaitCondition": {
      "Type": "AWS::CloudFormation::WaitCondition",
      "DependsOn": "ExampleElastigroup",
      "Properties": {
        "Handle": {
          "Ref": "WaitHandle"
        },
        "Timeout": "300",
        "Count": "1"
      }
    }
  }
}
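As a rough illustration of what Senza could do when it renders a Senza::Elastigroup component, the sketch below adds the WaitConditionHandle and WaitCondition resources from the template above to an already-built CloudFormation definition. The function name add_wait_condition, the generated resource names, and the idea of taking Count from capacity.target are assumptions made for this example; none of this is existing Senza code.

```python
import copy


def add_wait_condition(template, group_name, timeout=300):
    """Return a copy of `template` with a WaitConditionHandle/WaitCondition pair.

    Hypothetical helper, not part of Senza. It assumes the Elastigroup
    custom resource is already present under Resources[group_name].
    """
    result = copy.deepcopy(template)
    resources = result.setdefault("Resources", {})
    if group_name not in resources:
        raise KeyError("no resource named %r in template" % group_name)

    # Assumption: wait for one signal per instance of the target capacity.
    target = resources[group_name]["Properties"]["group"]["capacity"]["target"]

    handle_name = group_name + "WaitHandle"
    condition_name = group_name + "WaitCondition"

    resources[handle_name] = {
        "Type": "AWS::CloudFormation::WaitConditionHandle",
    }
    resources[condition_name] = {
        "Type": "AWS::CloudFormation::WaitCondition",
        # Start waiting only after the Elastigroup itself has been created.
        "DependsOn": group_name,
        "Properties": {
            "Handle": {"Ref": handle_name},
            "Timeout": str(timeout),
            "Count": str(target),
        },
    }
    return result
```

The instances' user data would still need to reference the generated handle (here group_name + "WaitHandle") so that cfn-signal, or whatever Taupage uses, knows where to send the signal.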
This change will also require changes in Taupage, specifically in the init.sh script, where the signaling is currently done.
The missing wait condition is believed to be the reason why stacks with a Senza::Elastigroup component reach the CREATE_COMPLETE status before they have healthy instances ready to receive traffic, which makes workarounds necessary before switching traffic.
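On the Taupage side, signaling a wait condition amounts to an HTTP PUT of a small JSON document to the pre-signed URL that the WaitConditionHandle resolves to. The sketch below only builds such a request; the function name build_signal_request and its defaults are assumptions for illustration, and the actual change would presumably live in the shell-based init.sh rather than in Python.

```python
import json
import urllib.request
import uuid


def build_signal_request(handle_url, success=True, reason="Instance ready", data=""):
    """Build (but do not send) the PUT request that signals a WaitCondition.

    Hypothetical helper for illustration; `handle_url` is the pre-signed URL
    obtained from {"Ref": "WaitHandle"} in the template.
    """
    body = json.dumps({
        "Status": "SUCCESS" if success else "FAILURE",
        "Reason": reason,
        # Each signal needs its own ID when the WaitCondition Count is > 1.
        "UniqueId": str(uuid.uuid4()),
        "Data": data,
    }).encode("utf-8")
    return urllib.request.Request(
        handle_url,
        data=body,
        method="PUT",
        # The pre-signed URL expects an empty Content-Type header.
        headers={"Content-Type": ""},
    )
```

Sending it would be a matter of passing the returned object to urllib.request.urlopen once the instance considers itself healthy.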