apache / incubator-horaedb-meta

Meta service of HoraeDB cluster.

Home Page: https://horaedb.apache.org

Ensure the consistency between CeresDB nodes with metadata

ZuLiangWang opened this issue

Description
We implemented a basic dynamic cluster mode in version 0.4. However, the consistency and correctness of the cluster cannot be guaranteed. We need a solution that keeps the cluster consistent even in extreme situations, so we decided to adapt the CeresMeta implementation according to the following principles:

  • Procedures for the same cluster are executed strictly serially; no concurrency is allowed.
  • While a procedure is running, no new procedure may be created.
  • Before a procedure runs, the shard versions in the metadata must be verified to match the shard versions on the actual nodes.
  • If a procedure fails, it is not rolled back, and no new procedure can be submitted until the cluster state is manually reset to stable.

Proposal
Refactor the procedure module according to the above principles. This involves the following changes:

  1. ProcedureManager needs to ensure that only one procedure can run at any one time.
  2. ProcedureFactory cannot create a new procedure while a procedure is running.
  3. Every Procedure should compare the shard versions in the metadata with those on the nodes, and refuse to run when they are not equal.
  4. When a Procedure fails, ProcedureManager cannot submit a new procedure until the failed procedure is manually canceled.

Additional context