zalando-stups / senza

Deploy immutable application stacks and create and execute AWS CloudFormation templates in a sane way

Home Page: https://pypi.python.org/pypi/stups-senza

Traffic switching by CF update does not work for multiple domains or custom resource names

hjacobs opened this issue

As far as I can see #318 introduces a bunch of problems:

  • traffic switching is slow (waiting for AWS CloudFormation...)
  • traffic switching does not work with custom resource names ("AppLoadBalancerMainDomain" hardcoded)
  • traffic switching does not work with multiple load balancers / domain names (we have this scenario e.g. in our Plan B setup where we have different SSL certs and therefore different ELBs and DNS names)

We can't avoid the first point if we want to keep the stack and the record coherent (and want AWS to delete the record when we delete the stack).

I'll fix the second point by finding the ELB by DNS name instead of having the key hardcoded, and if I understood your setup correctly, that should also fix the third point.
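A minimal sketch of how such a lookup could work, assuming the stack's CloudFormation template is available as a parsed dict. This is not the actual Senza change: the function name `find_record_resources` and the example template (including the resource name `CustomMainRecord` and the domain `other.example.org`) are hypothetical. It searches the template for Route 53 record sets by DNS name rather than by a fixed logical ID such as "AppLoadBalancerMainDomain".

```python
def find_record_resources(template, dns_name):
    """Return the logical IDs of all Route 53 record sets whose Name equals dns_name.

    Hypothetical helper: compares names case-insensitively and ignores a
    trailing dot, so "example.org" and "example.org." are treated as equal.
    """
    wanted = dns_name.rstrip(".").lower()
    matches = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::Route53::RecordSet":
            continue
        name = resource.get("Properties", {}).get("Name", "")
        # Name may be an intrinsic function (a dict); this sketch skips those.
        if isinstance(name, str) and name.rstrip(".").lower() == wanted:
            matches.append(logical_id)
    return matches


# Hypothetical template with a custom resource name and a second domain.
example_template = {
    "Resources": {
        "CustomMainRecord": {
            "Type": "AWS::Route53::RecordSet",
            "Properties": {"Name": "controller.zmon.zalan.do.", "Weight": 0},
        },
        "OtherRecord": {
            "Type": "AWS::Route53::RecordSet",
            "Properties": {"Name": "other.example.org.", "Weight": 0},
        },
        "AppLoadBalancer": {"Type": "AWS::ElasticLoadBalancing::LoadBalancer"},
    }
}

print(find_record_resources(example_template, "controller.zmon.zalan.do"))
```

Returning a list rather than a single ID leaves room for stacks with several load balancers and domain names, where each domain would be looked up separately.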

@jmcs I found another bug while testing/migrating:

  • create stack version cd406c1 and switch traffic to 100% using old Senza
  • create stack version cd408c1 and switch traffic to 100% using new Senza (CF update)

=> Result: both stacks now have a traffic weight of 100% (not what I wanted, i.e. the old stack still gets traffic).

~/workspace/senza (find-main-lb) $ python3 -m senza traffic controller cd408c1 100
Calculating new weights.. OK
Stack Name│Version│Identifier        │Old Weight%│Delta │Compensation│New Weight%│Current
controller cd406c1 controller-cd406c1       100.0 -100.0                      0.0         
controller cd408c1 controller-cd408c1         0.0  100.0                    100.0 <       
Setting weights for controller.zmon.zalan.do... OK
~/workspace/senza (find-main-lb) $ python3 -m senza events controller
controller cd408c1 CloudFormation::Stack              controller-cd408c1           CREATE_COMPLETE                                                                            1m ago 
controller cd408c1 CloudFormation::Stack              controller-cd408c1           UPDATE_IN_PROGRESS                  User Initiated                                        55s ago 
controller cd408c1 Route53::RecordSet                 AppLoadBalancerMainDomain    UPDATE_IN_PROGRESS                                                                        47s ago 
controller cd408c1 Route53::RecordSet                 AppLoadBalancerMainDomain    UPDATE_COMPLETE                                                                           14s ago 
controller cd408c1 CloudFormation::Stack              controller-cd408c1           UPDATE_COMPLETE_CLEANUP_IN_PROGRESS                                                       12s ago 
controller cd408c1 CloudFormation::Stack              controller-cd408c1           UPDATE_COMPLETE                                                                           11s ago 
~/workspace/senza (find-main-lb) $ python3 -m senza traffic controller 
Stack Name│Version│Identifier        │Weight%
controller cd406c1 controller-cd406c1   100.0 
controller cd408c1 controller-cd408c1   100.0 

BTW, this behavior also cannot be fixed using Senza:

~/workspace/senza (find-main-lb) $ python3 -m senza traffic controller cd406c1 0
Calculating new weights.. OK
Stack Name│Version│Identifier        │Old Weight%│Delta │Compensation│New Weight%│Current
controller cd406c1 controller-cd406c1       100.0 -100.0                      0.0 <       
controller cd408c1 controller-cd408c1       100.0                           100.0         
Setting weights for controller.zmon.zalan.do... not changed
~/workspace/senza (find-main-lb) $ python3 -m senza traffic controller
Stack Name│Version│Identifier        │Weight%
controller cd406c1 controller-cd406c1   100.0 
controller cd408c1 controller-cd408c1   100.0 

The problem when migrating (traffic switching does not change weight on old stack) is still there.
This is a blocker for me as it prevents people from seamlessly migrating to a new Senza version.

Last thing for me to test: what happens if the old DNS record (weight zero) was deleted?

A deleted DNS record is not updated (as expected), so the following sequence will not work:

  • deploy old stack version A with old Senza (no CF update)
  • deploy new stack version B with old Senza (no CF update)
  • redirect traffic to new stack version B with old Senza (no CF update, this will delete the DNS record for stack version A)
  • deploy another stack version C with new Senza (CF update)
  • redirect traffic to stack version C with new Senza (CF update, DNS record for A still does not exist)
  • now try to redirect traffic back to old stack version A with new Senza

=> stack version A will not get traffic, as the new Senza does not create a new DNS record for stack version A (it only tries a CF update and an UPSERT on existing records)

IMHO we don't need to fix this as creating a new DNS record from scratch is too much extra code for this "corner" case.
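
The sequence above can be modelled as a toy simulation. This is only a sketch under stated assumptions, not Senza's code: DNS records are a plain dict mapping stack version to weight, and `deploy`, `old_switch` and `new_switch` are hypothetical stand-ins for the behaviour described in this thread (old Senza deletes the records of stacks that no longer get traffic, new Senza only updates records that already exist).

```python
def deploy(records, version):
    """Assumption: a freshly deployed stack gets a DNS record with weight 0."""
    records[version] = 0


def old_switch(records, target):
    """Assumed old behaviour: give target all traffic and delete the other records."""
    for version in list(records):
        if version != target:
            del records[version]
    records[target] = 100


def new_switch(records, target):
    """Assumed new behaviour: update existing records only, never create one.

    Returns False when the target stack has no DNS record to update.
    """
    if target not in records:
        return False
    for version in records:
        records[version] = 100 if version == target else 0
    return True


records = {}
deploy(records, "A")                       # old Senza
deploy(records, "B")                       # old Senza
old_switch(records, "B")                   # deletes the record for A
deploy(records, "C")                       # new Senza
switched_to_c = new_switch(records, "C")   # works, the record for C exists
switched_back = new_switch(records, "A")   # fails, the record for A is gone

print(records, switched_to_c, switched_back)
```

In this model the final switch back to A is a no-op because nothing ever re-creates the deleted record, which matches the corner case described above.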