Improve the cleaning mechanism of ShardNode
ZuLiangWang opened this issue
Describe this problem
We found that the failover mechanism of the HoraeDB cluster did not work: when a machine went down, its shards were not migrated.
Steps to reproduce
- Make the etcd root path configuration in HoraeDB and HoraeMeta inconsistent.
- Shut down a HoraeDB node.
Additional Information
- Add a drop ShardNode API to deal with some extreme situations.
- Add a new way to detect failed nodes, not only relying on etcd's lease event.
- Use a background thread to continuously detect failed nodes.
- Detect failed nodes through heartbeat.
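
To make the heartbeat proposal concrete, here is a minimal sketch in Python of a detector that does not depend on etcd lease events. It is illustrative only and is not taken from HoraeMeta: the class name `HeartbeatDetector`, its methods, the timeout value, and the `on_failed` callback are all hypothetical. The idea is that each node reports heartbeats, and a background thread periodically sweeps for nodes that have been silent longer than a timeout.

```python
import threading
import time
from typing import Callable, Dict, List, Optional


class HeartbeatDetector:
    """Hypothetical detector: tracks the last heartbeat of each node."""

    def __init__(
        self,
        timeout_s: float,
        on_failed: Callable[[str], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout_s = timeout_s
        self._on_failed = on_failed  # e.g. trigger shard migration (assumed)
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen: Dict[str, float] = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def record_heartbeat(self, node: str) -> None:
        """Called whenever a heartbeat from `node` arrives."""
        with self._lock:
            self._last_seen[node] = self._clock()

    def sweep(self) -> List[str]:
        """Forget and return nodes silent for longer than the timeout."""
        now = self._clock()
        with self._lock:
            failed = [
                node
                for node, seen in self._last_seen.items()
                if now - seen > self._timeout_s
            ]
            for node in failed:
                del self._last_seen[node]
        # Run callbacks outside the lock so they cannot block heartbeats.
        for node in failed:
            self._on_failed(node)
        return failed

    def start(self, interval_s: float) -> None:
        """Start a background thread that sweeps every `interval_s` seconds."""

        def loop() -> None:
            # Event.wait returns False on timeout and True once stop() is called.
            while not self._stop.wait(interval_s):
                self.sweep()

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        """Stop the background thread and wait for it to exit."""
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

In such a design, the `on_failed` callback would be the place to schedule shard migration for the failed node. Because the detector relies only on received heartbeats, it would keep working in the scenario from the reproduction steps, where lease events are never observed.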