Luxoft / fabric-skeleton

A tool for DevOps automation of Hyperledger projects.


Killing cluster behaves unexpectedly

fedyiv opened this issue

Steps to reproduce:

  1. Start cluster
    ./ops-cli -i ~/.ssh/Blockchain-controller.pem -c single_sample
  2. Kill cluster
    ./ops-cli -i ~/.ssh/Blockchain-controller.pem -c single_sample -k
  3. Start cluster again
    ./ops-cli -i ~/.ssh/Blockchain-controller.pem -c single_sample

Expected result:
Cluster starts normally
Actual result:
Cluster fails to start
Root cause:
On startup, the tool checks whether the cluster is already running by testing the availability of the AWS hosts. If we execute the kill command and the hosts are still in the process of terminating, the startup logic concludes that the cluster is alive, and this leads to errors.
Workaround:
After killing the cluster, open the AWS EC2 console and manually verify that all instances have terminated before starting again.
Proposed solution:
Enhance the kill script so that it waits synchronously until the cluster has actually stopped; see the sketch below.
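
A minimal sketch of what that synchronous wait could look like, assuming the hosts run on EC2 and boto3 is available. The tag:cluster filter is an assumption about how ops-cli tags its instances, not the actual implementation:

    import boto3

    def wait_for_cluster_termination(cluster_name, region="us-east-1"):
        """Block until every EC2 instance of the cluster is terminated."""
        ec2 = boto3.client("ec2", region_name=region)
        # Find instances of this cluster that are not yet fully terminated.
        resp = ec2.describe_instances(Filters=[
            {"Name": "tag:cluster", "Values": [cluster_name]},  # hypothetical tag
            {"Name": "instance-state-name",
             "Values": ["pending", "running", "shutting-down", "stopping", "stopped"]},
        ])
        ids = [inst["InstanceId"]
               for res in resp["Reservations"] for inst in res["Instances"]]
        if not ids:
            return  # nothing left to wait for
        # boto3's built-in waiter polls until all instances reach 'terminated'.
        ec2.get_waiter("instance_terminated").wait(InstanceIds=ids)

The kill path would call this right after issuing the terminate request, so the command only returns once a subsequent start is safe.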

I guess the better way is to check the host state on startup and wait for the appropriate one.
It would be nice to implement that in a reasonably uniform way, taking into account that eventually this might be run at least on: local hosts, Docker hosts, AWS EC2 hosts, OpenStack hosts, etc.
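
For the uniform variant, one option is a small provider interface that each backend (local, Docker, EC2, OpenStack) implements. The names below (HostProvider, all_terminated, wait_until_terminated) are illustrative, not existing ops-cli code:

    import abc
    import time

    class HostProvider(abc.ABC):
        """Host lifecycle abstraction for local, Docker, EC2, OpenStack, ..."""

        @abc.abstractmethod
        def all_terminated(self, cluster):
            """Return True once every host of the cluster is fully down."""

    def wait_until_terminated(provider, cluster, timeout=300.0, poll=10.0):
        """Poll the provider until the cluster is down or the timeout expires."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if provider.all_terminated(cluster):
                return
            time.sleep(poll)
        raise TimeoutError("cluster %r did not terminate in %ss" % (cluster, timeout))

Startup could then call the same wait_until_terminated before deciding whether the cluster is alive, which would fix both directions of the race regardless of the backend.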