dragonflydb / dragonfly-operator

A Kubernetes operator to install and manage Dragonfly instances.

Home Page:https://www.dragonflydb.io/docs/managing-dragonfly/operator/installation

Sentinel Failover Replica Choice

Niennienzz opened this issue

I am not sure whether the operator already works this way, but below is how Sentinel decides which replica to promote when performing a failover. If it does not, the operator could use similar logic to pick the most suitable replica instance.

Step-1 Use the replica with the lowest replica-priority

  • As described in the Redis Sentinel documentation, Sentinel compares the replica-priority value first when choosing a replica for failover.
  • This value is returned by the Redis INFO command as slave_priority.
  • If several replicas share the lowest replica-priority value, go to Step-2.

Step-2 Use the replica with the highest slave_repl_offset.

  • The slave_repl_offset value, as returned by the INFO command, reports the replication offset of the replica instance.
  • A higher slave_repl_offset value means the replica has processed more of the primary's replication stream, so it is closer to the primary's state and more suitable for promotion.
  • If more than one replica instance has the same slave_repl_offset value, go to Step-3.

Step-3 Use the lowest run_id value.

  • Also returned by the INFO command.
  • As a last resort, Sentinel picks the replica instance with the lexicographically lowest run_id value.
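
The three steps above can be sketched as a single comparison function. This is only a rough illustration in Go, not the operator's actual code: the Replica struct, its field names, and PickReplica are hypothetical, and it assumes the operator has already collected slave_priority, slave_repl_offset and run_id for each replica. It also skips replicas with priority 0, which Redis Sentinel treats as "never promote"; that rule is not part of the steps listed above.

```go
package main

import (
	"fmt"
	"sort"
)

// Replica holds the INFO fields used for ranking.
// Hypothetical type: the names are made up for this sketch.
type Replica struct {
	Name       string // pod name, for display only
	Priority   int    // slave_priority (replica-priority)
	ReplOffset int64  // slave_repl_offset
	RunID      string // run_id
}

// better reports whether a should be preferred over b.
func better(a, b Replica) bool {
	// Step-1: lower replica-priority wins.
	if a.Priority != b.Priority {
		return a.Priority < b.Priority
	}
	// Step-2: higher replication offset wins.
	if a.ReplOffset != b.ReplOffset {
		return a.ReplOffset > b.ReplOffset
	}
	// Step-3: lexicographically lower run_id wins.
	return a.RunID < b.RunID
}

// PickReplica returns the preferred failover candidate, or false
// when no replica is eligible.
func PickReplica(replicas []Replica) (Replica, bool) {
	candidates := make([]Replica, 0, len(replicas))
	for _, r := range replicas {
		// Assumption: follow Sentinel and never promote priority 0.
		if r.Priority > 0 {
			candidates = append(candidates, r)
		}
	}
	if len(candidates) == 0 {
		return Replica{}, false
	}
	sort.SliceStable(candidates, func(i, j int) bool {
		return better(candidates[i], candidates[j])
	})
	return candidates[0], true
}

func main() {
	replicas := []Replica{
		{Name: "dragonfly-1", Priority: 100, ReplOffset: 4200, RunID: "b1"},
		{Name: "dragonfly-2", Priority: 100, ReplOffset: 4300, RunID: "c7"},
		{Name: "dragonfly-3", Priority: 100, ReplOffset: 4300, RunID: "a9"},
	}
	if best, ok := PickReplica(replicas); ok {
		fmt.Println("promote:", best.Name)
	}
}
```

With the sample values in main, all priorities tie, dragonfly-2 and dragonfly-3 tie on offset, and dragonfly-3 wins on run_id. The network screening mentioned below is not covered here.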

There's a network screening process that runs before this candidate selection. I will update the description once I have a grasp on that.

Now that Dragonfly has the offset, it totally makes sense to use it and be intelligent about the choice! This should also be an easy fix considering all the parts are already there.

I would like to take a stab at this

Thanks @nujragan93 🙇 assigned to you.

Which version of Dragonfly returns slave_priority, slave_repl_offset and run_id when running the INFO command? I don't see them with v1.16.0:

# Replication
role:replica
master_host:10.1xx
master_port:9999
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
master_replid:1604ae320e0dadcbd2bcb200030f195058e608e0