etcd-io / raft

Raft library for maintaining a replicated state machine

RFC: Message level mechanism for disabling proposal forwarding

mitake opened this issue

The Raft library provides the DisableProposalForwarding parameter. When it is true, a follower does not forward MsgProp messages to the leader. I think a similar option at the level of individual messages might be valuable, and I'd appreciate comments on this idea.

Context: a program that uses the Raft library can run asynchronous processes inside its server process that issue Raft requests. In etcd, lease handling and compaction are typical examples. I guess other users of the library have similar processes.

Such an asynchronous process can be implemented as a goroutine whose behavior depends on whether its node is the leader. If the node is the leader, the goroutine issues Raft requests asynchronously (and periodically).

This behavior can be problematic in some cases. If the goroutine is paused for any reason (high CPU load, slow disk I/O, etc.), the node can become a follower through a new leader election in the meantime. The goroutine then acts on stale information that the node is still the leader, and it issues Raft requests on that basis. Whether this is harmful depends on the request: if it must not be issued or duplicated by a stale leader, it is a problem. In the case of etcd, it can cause a lease to be revoked by a stale leader: etcd-io/etcd#15247
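The race described above can be reproduced deterministically with a toy model. This is a minimal sketch, not etcd or Raft library code: the `node` type, its `isLeader` field, `propose`, and `revokeExpiredLeases` are all hypothetical names, and the `pause` callback stands in for a stall between the leadership check and the proposal.

```go
package main

import "fmt"

// node is a hypothetical stand-in for an application node; it is not a type
// from the Raft library.
type node struct {
	isLeader bool     // the application's cached view of leadership
	proposed []string // requests handed to Raft, recorded for the demo
}

// propose stands in for handing a request to the Raft library.
func (n *node) propose(req string) {
	n.proposed = append(n.proposed, req)
}

// revokeExpiredLeases models one tick of a leader-only background task.
// pause runs between the leadership check and the proposal, standing in for
// a stall caused by CPU or disk I/O load.
func (n *node) revokeExpiredLeases(pause func()) {
	if !n.isLeader {
		return
	}
	pause()                   // leadership may change during the stall
	n.propose("revoke-lease") // issued based on the stale check above
}

func main() {
	n := &node{isLeader: true}
	// Simulate losing leadership while the task is paused.
	n.revokeExpiredLeases(func() { n.isLeader = false })
	fmt.Println("leader now:", n.isLeader, "proposed:", n.proposed)
}
```

The request is proposed even though the node is no longer the leader by the time `propose` runs. Re-checking leadership right before the proposal narrows the window but does not close it, since the check and the proposal are still not atomic.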

Note that this problem should happen only if the Raft library logic recognizes itself as a follower:

  • If the Raft library logic still recognizes itself as the leader: MsgProp messages are handled by the stale node itself and result in MsgApp messages. In this case the other nodes can reject them because they carry an old term.
  • If the Raft library logic already recognizes itself as a follower: MsgProp is forwarded to the new leader, and the new leader sends MsgApp. Although the original source of the messages is the stale leader, the cluster can accept them.
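The follower case can be sketched with a toy model of how a follower treats a local MsgProp, as I understand the current behavior. The `follower` type, `stepProp`, and `errProposalDropped` below are illustrative names for this sketch only, not the library's internal code.

```go
package main

import (
	"errors"
	"fmt"
)

const none = 0 // placeholder ID meaning "no known leader"

// errProposalDropped is this sketch's own error value.
var errProposalDropped = errors.New("proposal dropped")

// follower is a toy model, not the library's internal raft struct.
type follower struct {
	lead                      uint64   // ID of the leader this follower knows about
	disableProposalForwarding bool     // models the DisableProposalForwarding option
	outbox                    []uint64 // destination IDs of forwarded proposals
}

// stepProp models what a follower does with a locally issued MsgProp.
func (f *follower) stepProp() error {
	if f.lead == none {
		return errProposalDropped // nobody to forward to
	}
	if f.disableProposalForwarding {
		return errProposalDropped // forwarding is switched off for all proposals
	}
	f.outbox = append(f.outbox, f.lead) // forward to the current leader
	return nil
}

func main() {
	f := &follower{lead: 2}
	fmt.Println(f.stepProp(), f.outbox) // forwarded to node 2

	f.disableProposalForwarding = true
	fmt.Println(f.stepProp(), f.outbox) // dropped, outbox unchanged
}
```

With forwarding enabled, a proposal from a former leader reaches the new leader like any other client proposal, which is why the new term does not protect against it.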

(The above behavior is quite subtle and I'm still checking it; I'd be glad to get feedback on it too.)

I guess setting DisableProposalForwarding to true would be a simple way to avoid this situation. However, because the parameter affects all messages, it breaks any program whose clients have no mechanism for finding the leader and sending requests to it directly (e.g. etcd clientv3). So I think it would be nice if the Raft library provided a mechanism to disable proposal forwarding only for specific messages.

There might be some possible approaches:

  • Adding a new flag to Message: it's simple, but it might be too large a change.
  • Adding a callback that decides whether a message should skip proposal forwarding when the node is a follower. In the case of etcd, etcd could supply a callback that drops lease- or compaction-related Raft requests.
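To make the callback option more concrete, here is a rough sketch of what it could look like. Everything in it is hypothetical: the `forwardFilter` type, the `allowForward` field, and the `lease/` and `compact/` prefixes used to tag leader-only requests are assumptions for illustration, and none of them exists in the Raft library or etcd today.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

var errProposalDropped = errors.New("proposal dropped")

// forwardFilter is a hypothetical callback type: it reports whether a
// proposal may be forwarded from a follower to the leader.
type forwardFilter func(data []byte) bool

// follower is a toy model, not the library's internal raft struct.
type follower struct {
	lead         uint64
	allowForward forwardFilter // nil means "forward everything"
	forwarded    [][]byte      // proposals sent on to the leader
}

// propose models a follower handling a locally issued proposal.
func (f *follower) propose(data []byte) error {
	if f.lead == 0 {
		return errProposalDropped // no known leader
	}
	if f.allowForward != nil && !f.allowForward(data) {
		return errProposalDropped // the application vetoed forwarding
	}
	f.forwarded = append(f.forwarded, data)
	return nil
}

// clientRequestsOnly is an example filter an application might supply.
// It assumes leader-only requests are tagged with a "lease/" or "compact/"
// prefix, which is an invented encoding for this sketch.
func clientRequestsOnly(data []byte) bool {
	return !bytes.HasPrefix(data, []byte("lease/")) &&
		!bytes.HasPrefix(data, []byte("compact/"))
}

func main() {
	f := &follower{lead: 2, allowForward: clientRequestsOnly}
	fmt.Println(f.propose([]byte("put/foo")))        // forwarded
	fmt.Println(f.propose([]byte("lease/revoke/7"))) // dropped
	fmt.Println(len(f.forwarded))
}
```

Ordinary client requests keep working through followers, while requests that only a real leader should issue are dropped locally when the node has already stepped down.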

I’d like to know other people’s opinions and how other programs which use the Raft library deal with this kind of issue.

cc @ahrtr @serathius @tbg @pavelkalinnikov