Phi Failure Detector
This is a C# implementation of the Phi Accrual Failure Detector (PDF version here).
Getting started
Intervals are measured in units of 100 nanoseconds because timestamps are generated with ToFileTimeUtc. I plan to introduce a Clock abstraction to remove this limitation in the future.
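To make the unit concrete: .NET's TimeSpan.Ticks has the same 100-nanosecond resolution as ToFileTimeUtc, so it can convert a readable duration into interval units. The HeartbeatTime helper below is a hypothetical name for illustration, not part of this library, and the sketch assumes interval arguments use the same unit as the timestamps.

```csharp
using System;

// Hypothetical helper, not part of this library.
public static class HeartbeatTime
{
    // TimeSpan.Ticks counts 100-nanosecond intervals, the same unit
    // that DateTime.ToFileTimeUtc() uses for its timestamps.
    public static long ToIntervalUnits(TimeSpan duration)
    {
        return duration.Ticks;
    }

    // Converts a difference between two ToFileTimeUtc() timestamps
    // back into a TimeSpan for logging or display.
    public static TimeSpan FromIntervalUnits(long units)
    {
        return TimeSpan.FromTicks(units);
    }

    // Current timestamp in 100-nanosecond units since 1601-01-01 UTC.
    public static long Now()
    {
        return DateTime.UtcNow.ToFileTimeUtc();
    }
}
```

For example, HeartbeatTime.ToIntervalUnits(TimeSpan.FromSeconds(2)) yields 20,000,000 units.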
We use a very large initial interval since the "right" average depends on the cluster size and it’s better to err high (false negatives, which will be corrected by waiting a bit longer) than low (false positives, which cause "flapping").
Choose
var failureDetector = new PhiFailureDetector(
capacity: 100, // Store at most 100 heartbeat points
initialHeartbeatInterval: 2000,
phiFunc: PhiFailureDetector.Exponential
);
communicationService[peerId].onHeartBeat += (ignored1, ignored2) => {
failureDetector.Report();
};
communicationService.watch(peerId, () => failureDetector.Phi() > threshold);
Architecture overview
3 components:
- Monitoring
- Interpretation
- Action
Monitoring
Heartbeat arrivals are recorded in an arrival window, which is usually used together with a throttler.
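An arrival window can be sketched as a bounded queue of inter-arrival intervals with a running sum. This is only an illustration: the ArrivalWindow class and its members are hypothetical names, and the library's internal bookkeeping may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of an arrival window; not the library's internal type.
public sealed class ArrivalWindow
{
    private readonly int capacity;
    private readonly Queue<long> intervals = new Queue<long>();
    private long sum;
    private long? lastArrival;

    public ArrivalWindow(int capacity, long initialInterval)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
        // Seed the window so the mean is defined before real heartbeats arrive.
        intervals.Enqueue(initialInterval);
        sum = initialInterval;
    }

    // Records a heartbeat timestamp and stores the interval since the previous one.
    public void Add(long timestamp)
    {
        if (lastArrival.HasValue)
        {
            long interval = timestamp - lastArrival.Value;
            // Evict the oldest interval once the window is full.
            if (intervals.Count == capacity) sum -= intervals.Dequeue();
            intervals.Enqueue(interval);
            sum += interval;
        }
        lastArrival = timestamp;
    }

    // Mean of the intervals currently held in the window.
    public double Mean
    {
        get { return (double)sum / intervals.Count; }
    }
}
```

Seeding the window with the initial interval keeps the mean high until enough real heartbeats have been observed, which matches the "err high" reasoning in Getting started.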
Interpretation
The phi function should be chosen to match the distribution of heartbeat inter-arrival times.
See CASSANDRA-2597 for more details.
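As an illustration, if inter-arrival times are assumed to follow an exponential distribution with mean m, the probability that a heartbeat arrives later than t is exp(-t/m), so phi = -log10(exp(-t/m)) = t / (m * ln 10). The sketch below implements that formula; PhiSketch is a hypothetical name, and since this README does not show the delegate signature that phiFunc expects, the method may not match PhiFailureDetector.Exponential.

```csharp
using System;

// Hypothetical standalone sketch; not the library's implementation.
public static class PhiSketch
{
    // phi = -log10(P(heartbeat arrives later than t)).
    // Under an exponential model with mean m, P = exp(-t / m),
    // which simplifies to phi = t / (m * ln 10).
    public static double Exponential(double timeSinceLastHeartbeat, double meanInterval)
    {
        if (meanInterval <= 0)
            throw new ArgumentOutOfRangeException(nameof(meanInterval));
        return timeSinceLastHeartbeat / (meanInterval * Math.Log(10.0));
    }
}
```

Under this model phi grows linearly with the time since the last heartbeat, and each increase of 1 corresponds to a tenfold drop in the probability that the peer is merely slow.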
Future
- Introduce a Clock abstraction for timestamp generation. This would decouple the detector from the resolution of the timestamp source.
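One possible shape for such an abstraction is sketched below. Nothing here exists in the library yet: IClock, FileTimeClock, and ManualClock are hypothetical names, and the final design may look different.

```csharp
using System;

// Hypothetical interface; the detector would ask the clock for timestamps
// instead of calling DateTime.UtcNow.ToFileTimeUtc() directly.
public interface IClock
{
    long Now();
}

// Reproduces the current behavior: 100-nanosecond units from ToFileTimeUtc.
public sealed class FileTimeClock : IClock
{
    public long Now()
    {
        return DateTime.UtcNow.ToFileTimeUtc();
    }
}

// Manually advanced clock, useful for deterministic tests
// or for running the detector in a different time unit.
public sealed class ManualClock : IClock
{
    private long now;

    public ManualClock(long start = 0)
    {
        now = start;
    }

    public long Now()
    {
        return now;
    }

    public void Advance(long units)
    {
        now += units;
    }
}
```

With a clock injected this way, the interval unit becomes whatever the chosen clock returns, and tests can drive time forward without sleeping.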