Tendrl / documentation

Project-wide documentation

Document the replace storage device workflow in Tendrl

shtripat opened this issue

If a storage device becomes faulty (which is bound to happen in the real world), we need clearly defined workflows for bringing a new device into the system and replacing the faulty one with it.

  • The workflow needs to handle moving data onto the new device as it comes into the picture
  • Then slowly phase out the old faulty device
  • Finally, bring down the faulty device and remove it from the underlying cluster

This flow looks simple, but it involves a lot of technicalities in Ceph and Gluster. It is risky and needs to be done very carefully, so the flow and the steps involved should be well thought through before they are implemented.

@shtripat has this been handled in skyring?

@brainfunked No. The replace storage device flow was not implemented in skyring. But to track it and make sure we don't lose sight of its importance, we can keep this issue open until we implement it.

Replace is nothing but expanding the cluster and then removing the faulty node (shrinking the cluster). It's not going to be a single operation.
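
To make the expand-then-shrink sequence discussed in this thread concrete, here is a minimal sketch in Python. It is not Tendrl, Ceph, or Gluster code: `Device`, `Cluster`, `replace_device`, and the `batch_size` parameter are all hypothetical names invented for illustration, and the "cluster" is a toy in-memory model. It only shows the ordering of the steps (add the new device, move data off the faulty one gradually, remove the faulty one only once it is empty).

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical storage device holding a set of data chunk IDs."""
    name: str
    healthy: bool = True
    chunks: set = field(default_factory=set)


class Cluster:
    """Toy in-memory cluster; a stand-in for a real storage backend."""

    def __init__(self):
        self.devices = {}

    def add_device(self, device):
        # Expand: register a new device with the cluster.
        self.devices[device.name] = device

    def remove_device(self, name):
        # Shrink: only allowed once the device holds no data.
        if self.devices[name].chunks:
            raise RuntimeError(f"{name} still holds data; refusing to remove")
        del self.devices[name]


def replace_device(cluster, faulty_name, new_device, batch_size=2):
    """Replace a faulty device as three separate steps, not one operation."""
    faulty = cluster.devices[faulty_name]

    # Step 1: expand the cluster with the new device.
    cluster.add_device(new_device)

    # Step 2: phase out the faulty device by moving data in small batches.
    while faulty.chunks:
        for chunk in sorted(faulty.chunks)[:batch_size]:
            new_device.chunks.add(chunk)
            faulty.chunks.discard(chunk)

    # Step 3: shrink the cluster by removing the now-empty faulty device.
    cluster.remove_device(faulty_name)


if __name__ == "__main__":
    cluster = Cluster()
    cluster.add_device(Device("disk-a", healthy=False, chunks={"c1", "c2", "c3"}))
    replace_device(cluster, "disk-a", Device("disk-b"))
    print(sorted(cluster.devices))
```

In this sketch the removal step refuses to run while the faulty device still holds data, which mirrors the point made above that replacement is risky and cannot be treated as a single operation. A real workflow would also have to cope with a migration that fails partway through, which this toy model does not attempt.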