zalando-stups / senza

Deploy immutable application stacks and create and execute AWS CloudFormation templates in a sane way

Home Page: https://pypi.python.org/pypi/stups-senza

Add WaitCondition for the Elastigroup custom resource

lmineiro opened this issue

Native Auto Scaling Groups use a combination of CreationPolicy and the cfn-signal helper script to notify the CloudFormation stack once the required number of instances in the group has been provisioned successfully.

Elastigroups are a custom resource, so CreationPolicy doesn't apply to them (and the Senza::Elastigroup component currently has no such parameter in any case). For a custom resource such as an Elastigroup, Senza should use a WaitCondition together with a WaitConditionHandle, which provides the same functionality as for native Auto Scaling Groups.

Example:
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Test cfn-signal for Elastigroup",
  "Resources": {
    "ExampleElastigroup": {
      "Properties": {
        "ServiceToken": "arn:aws:lambda:eu-central-1:178579023202:function:spotinst-cloudformation",
        "accessToken": "XXXXXX",
        "accountId": "act-XXXXXX",
        "group": {
          "capacity": {
            "maximum": 1,
            "minimum": 1,
            "target": 1
          },
          "compute": {
            "instanceTypes": {
              "ondemand": "t3.medium",
              "spot": [
                "t3.medium"
              ]
            },
            "launchSpecification": {
              "imageId": "ami-XXXXXXX",
              "monitoring": false,
              "ebsOptimized": false,
              "keyPair": "XXXXXX",
              "securityGroupIds": [
                "sg-XXXXXX"
              ],
              "tags": [
                {
                  "tagKey": "myTag",
                  "tagValue": "myKey"
                }
              ],
              "userData": {
                "Fn::Base64": {
                  "Fn::Join": [
                    "",
                    [
                      "#!/bin/bash -xe\n",
                      "yum install -y aws-cfn-bootstrap\n",
                      "/opt/aws/bin/cfn-signal '",
                      {
                        "Ref": "WaitHandle"
                      },                     
                      "'\n"
                    ]
                  ]
                }
              }
            },
            "product": "Linux/UNIX",
            "availabilityZones": [
              {
                "name": "eu-central-1b",
                "subnetIds": [
                  "subnet-XXXXXXX"
                ]
              }
            ]
          },
          "name": "Example-Elastigroup",
          "strategy": {
            "risk": 100,
            "availabilityVsCost": "balanced",
            "drainingTimeout": 120,
            "fallbackToOd": true,
            "lifetimePeriod": "days",
            "persistence": {},
            "revertToSpot": {
              "performAt": "always"
            }
          }
        }
      },
      "Type": "Custom::elastigroup"
    },
    "WaitHandle": {
      "Type": "AWS::CloudFormation::WaitConditionHandle"
    },
    "WaitCondition": {
      "Type": "AWS::CloudFormation::WaitCondition",
      "DependsOn": "ExampleElastigroup",
      "Properties": {
        "Handle": {
          "Ref": "WaitHandle"
        },
        "Timeout": "300",
        "Count": "1"
      }
    }
  }
}
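
As a rough illustration of how the two extra resources in the example could be generated, here is a minimal Python sketch. It is not Senza's actual component code: the function name add_wait_condition and the derived resource names are hypothetical, and the template is assumed to be a plain dict shaped like the JSON above.

```python
import json


def add_wait_condition(template, group_resource, count, timeout_seconds=300):
    """Return a copy of `template` with a WaitConditionHandle and a
    WaitCondition that depends on `group_resource`.

    Hypothetical helper for illustration only, not part of Senza.
    """
    resources = dict(template.get("Resources", {}))
    if group_resource not in resources:
        raise KeyError("unknown resource: " + group_resource)

    # Resource names are derived from the group name (an assumption).
    handle_name = group_resource + "WaitHandle"
    condition_name = group_resource + "WaitCondition"

    resources[handle_name] = {
        "Type": "AWS::CloudFormation::WaitConditionHandle",
    }
    resources[condition_name] = {
        "Type": "AWS::CloudFormation::WaitCondition",
        # Start waiting only after the Elastigroup itself has been created.
        "DependsOn": group_resource,
        "Properties": {
            "Handle": {"Ref": handle_name},
            "Timeout": str(timeout_seconds),
            # One success signal is expected per instance.
            "Count": str(count),
        },
    }

    result = dict(template)
    result["Resources"] = resources
    return result


if __name__ == "__main__":
    example = {"Resources": {"ExampleElastigroup": {"Type": "Custom::elastigroup"}}}
    print(json.dumps(add_wait_condition(example, "ExampleElastigroup", count=1), indent=2))
```

The userData of the group would still have to reference the generated handle, as the example does with {"Ref": "WaitHandle"}, and the count would presumably follow the group's target capacity.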

This will require changes in Taupage, specifically in the init.sh script, where the signaling is currently done.
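
For context on what that signaling amounts to: a WaitConditionHandle resolves to a presigned URL, and a signal is a JSON document sent to it with an HTTP PUT. The sketch below shows the shape of such a signal in Python. It is only an illustration of the wire format under that assumption, not Taupage's init.sh logic, and the function names build_signal_body and send_signal are hypothetical.

```python
import json
import urllib.request


def build_signal_body(unique_id, success=True, reason="Instance is ready", data=""):
    """Build the JSON body of a wait condition signal.

    `unique_id` must differ for every signal sent to the same handle,
    for example the EC2 instance ID.
    """
    return json.dumps({
        "Status": "SUCCESS" if success else "FAILURE",
        "Reason": reason,
        "UniqueId": unique_id,
        "Data": data,
    })


def send_signal(handle_url, body):
    """PUT the signal to the presigned handle URL (hypothetical helper).

    The Content-Type header is set to an empty string because the URL is
    presigned without a content type.
    """
    request = urllib.request.Request(
        handle_url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": ""},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

The WaitCondition completes once it has received Count success signals with distinct UniqueId values, and it fails on the first failure signal or when the Timeout expires.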

The missing signal is believed to be the reason why stacks with a Senza::Elastigroup component reach the CREATE_COMPLETE status before they have healthy instances ready to receive traffic, which forces workarounds before traffic can be switched.