atomix / copycat

A novel implementation of the Raft consensus algorithm

Home Page:http://atomix.io/copycat


Event index/sequence makes session stall without replying

thiagoss opened this issue

I have hit a scenario where a ServerSessionContext ends up skipping (not committing) a particular command, and subsequent events/queries/responses are not processed because they depend on the skipped command. The command is skipped in ServerStateMachine::apply(CommandEntry) because entry.getSequence() < session.nextCommandSequence().
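To make the skip concrete, here is a minimal sketch of the kind of sequence check described above. It is not Copycat's code: SketchSession, its fields, and its apply(index, sequence) method are hypothetical stand-ins, and the only thing taken from the report is the comparison entry.getSequence() < session.nextCommandSequence(). The index and sequence values are the ones from the logs below.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a server-side session (NOT Copycat's ServerSessionContext).
final class SketchSession {
    private long commandSequence;                           // last command sequence applied
    final List<Long> committedIndexes = new ArrayList<>();  // log indexes that produced a commit

    SketchSession(long lastAppliedSequence) {
        this.commandSequence = lastAppliedSequence;
    }

    long nextCommandSequence() {
        return commandSequence + 1;
    }

    // Returns true if the entry was applied, false if it was skipped as a duplicate.
    boolean apply(long index, long sequence) {
        if (sequence < nextCommandSequence()) {
            return false;                 // already applied: no commit is recorded for this index
        }
        commandSequence = sequence;
        committedIndexes.add(index);
        return true;
    }
}

final class SequenceCheckSketch {
    public static void main(String[] args) {
        SketchSession session = new SketchSession(234);
        System.out.println(session.apply(634, 235));   // true: first copy of sequence=235
        System.out.println(session.apply(737, 235));   // false: retried copy is skipped
        System.out.println(session.committedIndexes);  // [634]
    }
}
```

In this toy model, index 737 never shows up in committedIndexes, which matches the skipped entry in the report.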

Here is what happens:

  1. (Not sure if this is relevant.) The system is under high load. This client sends a command, loses its connection, and retries connecting a few times.
Sending request: 19cf4a41-1513-446d-a56d-45954cee89e6 CommandRequest[session=8, sequence=235, command=InstanceCommand[resource=103, command=ResourceCommand...

This request actually times out, and the client moves on and tries connecting to another server.

  1. Server applies this command:
/172.17.0.2:9876 - Received AppendRequest[term=2, leader=-1408161302, logIndex=632, logTerm=1, entries=[0], commitIndex=634, globalIndex=632]
/172.17.0.2:9876 - Applying CommandEntry[index=634, term=1, session=8, sequence=235, timestamp=1482433827616, command=InstanceCommand[resource=103, command=ResourceCommand...
/172.17.0.2:9876 - Sent AppendResponse[status=OK, term=2, succeeded=true, logIndex=634]

Note that sequence=235, so session=8 now has a commandSequence of 235. This entry was at index=634. It was applied on a follower server, if that matters, and right after a term change.

  1. Some time later, new commands are received:
/172.17.0.2:9876 - Received AppendRequest[term=2, leader=-1408161302, logIndex=737, logTerm=2, entries=[1], commitIndex=737, globalIndex=735]
/172.17.0.2:9876 - Appended CommandEntry[index=738, term=2, session=8, sequence=236, timestamp=1482433831887, command=InstanceCommand[resource=103, command=ResourceCommand[...
/172.17.0.2:9876 - Applying CommandEntry[index=737, term=2, session=8, sequence=235, timestamp=1482433831886, command=InstanceCommand[resource=103, command=ResourceCommand[...
/172.17.0.2:9876 - Sent AppendResponse[status=OK, term=2, succeeded=true, logIndex=738]

Note that the applied command (index=737) also has sequence=235, so it won't generate a commit in session=8 and won't be published to the client.

  1. From now on, requests start depending on the entry at index=737, but it was skipped, so things start to fail for the client associated with this session.

I'm not sure how to debug this further or what exactly is the issue here. Any tips are appreciated.

A summary of what happens:

  1. Client sends a CommandRequest with sequenceNumber=S
  2. Client times out
  3. Server applies that command and tries to notify the client, but the client is disconnected
  4. Client recovers and sends the same CommandRequest with the same sequenceNumber
  5. Server applies that command but that doesn't generate a commit (the sequence number check makes it just return the same response again). The client won't be notified about this particular entry being committed.
  6. Subsequent queries/commands sent by the client won't be processed in its session context because they are blocked by the event index of the command sent in step 4.
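
The steps above can be sketched as a toy model. Everything in it is an assumption made for illustration: the class names, the idea that a request carries the log index it depends on, and the rule that the session releases a request once its highest completed index reaches that dependency. It only captures the state right after step 5 and does not try to reproduce the underlying replication bug.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of the stall; all names are hypothetical, not Copycat's API.
final class StallSketch {

    // A request that may only run once the session has completed dependsOnIndex.
    static final class Pending {
        final String name;
        final long dependsOnIndex;

        Pending(String name, long dependsOnIndex) {
            this.name = name;
            this.dependsOnIndex = dependsOnIndex;
        }
    }

    static final class Session {
        private long lastSequence;     // highest command sequence applied
        private long completeIndex;    // highest log index that produced a commit
        private final Queue<Pending> waiting = new ArrayDeque<>();
        final List<String> processed = new ArrayList<>();

        Session(long lastSequence, long completeIndex) {
            this.lastSequence = lastSequence;
            this.completeIndex = completeIndex;
        }

        void applyCommand(long index, long sequence) {
            if (sequence <= lastSequence) {
                return;                // duplicate: completeIndex does not advance
            }
            lastSequence = sequence;
            completeIndex = index;
            drain();
        }

        void submit(Pending request) {
            waiting.add(request);
            drain();
        }

        int waitingCount() {
            return waiting.size();
        }

        // Release queued requests whose dependency has been completed.
        private void drain() {
            while (!waiting.isEmpty() && waiting.peek().dependsOnIndex <= completeIndex) {
                processed.add(waiting.poll().name);
            }
        }
    }

    public static void main(String[] args) {
        Session session = new Session(234, 633);
        session.applyCommand(634, 235);             // step 3: first copy is applied
        session.applyCommand(737, 235);             // step 5: retried copy is skipped
        session.submit(new Pending("query", 737));  // step 6: depends on the skipped index
        System.out.println(session.processed);      // []
        System.out.println(session.waitingCount()); // 1
    }
}
```

The query stays queued because nothing in the session ever marks index 737 as complete.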

onos1.log.xz.zip

Search for: CommandRequest\[session=8, sequence=235
This is where it starts.

(Sorry, it is an xz inside a zip.)

onos2.log.xz.zip

Log for the second instance in the cluster. It was a follower and then became leader after sequence=235 was received for the first time.

(Sorry again, it is an xz inside a zip because of size restrictions.)

So, we spent a while exploring this together and managed to track down the issue. This is actually a major bug in the replication algorithm. I'm going to close this issue, as it's not descriptive of the actual problem. I'll open a new issue with the bug report.