Improve the cleaning mechanism of ShardNode

Question

ZuLiangWang opened this issue 6 months ago · comments

Describe this problem
The failover mechanism of the HoraeDB cluster failed: when a machine went down, its shards were not migrated to another node.

Steps to reproduce

1. Configure HoraeDB and HoraeMeta with inconsistent etcd root paths.
2. Shut down a HoraeDB node.

Additional Information

Add a drop ShardNode API to deal with some extreme situations.
Add a new way to detect failed nodes that does not rely solely on etcd's lease events:
1. Use a background thread to continuously detect failed nodes.
2. Detect failed nodes through heartbeats.
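
To make the two detection ideas above concrete, here is a minimal sketch in Python. It is not HoraeMeta's implementation: `HeartbeatDetector`, `record_heartbeat`, `check_once`, and the `on_failure` callback are hypothetical names chosen for illustration. The callback stands in for whatever the cluster would do with a failed node, such as migrating its shards or dropping its ShardNode records.

```python
import threading
import time
from typing import Callable, Dict, List


class HeartbeatDetector:
    """Tracks the last heartbeat per node and reports nodes that went silent.

    Hypothetical sketch: none of these names come from HoraeDB or HoraeMeta.
    """

    def __init__(self, timeout_s: float, on_failure: Callable[[str], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout_s = timeout_s
        self._on_failure = on_failure  # e.g. trigger shard migration
        self._clock = clock            # injectable so tests can control time
        self._last_seen: Dict[str, float] = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def record_heartbeat(self, node: str) -> None:
        """Called whenever a heartbeat from `node` arrives."""
        with self._lock:
            self._last_seen[node] = self._clock()

    def check_once(self) -> List[str]:
        """Report nodes whose last heartbeat is older than the timeout."""
        now = self._clock()
        with self._lock:
            failed = [node for node, seen in self._last_seen.items()
                      if now - seen > self._timeout_s]
            # Forget failed nodes so each failure is reported only once.
            for node in failed:
                del self._last_seen[node]
        # Run callbacks outside the lock so they may call back into us.
        for node in failed:
            self._on_failure(node)
        return failed

    def run_in_background(self, interval_s: float) -> threading.Thread:
        """Start a daemon thread that calls check_once every interval_s."""
        def loop() -> None:
            # Event.wait returns False on timeout and True once stop() is called.
            while not self._stop.wait(interval_s):
                self.check_once()

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self) -> None:
        """Ask the background thread to exit after its current wait."""
        self._stop.set()
```

Because this check depends only on heartbeat timestamps, it would still notice a dead node when the etcd lease event never arrives, which is the situation the reproduction steps create. The timeout and check interval are tuning choices that this sketch leaves to the caller.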