XenitAB / node-ttl

Enforces a time to live (TTL) on Kubernetes nodes and evicts nodes which have expired.

Ignore nodes where drain is not possible

phillebaba opened this issue

A node can be blocked from draining when a PodDisruptionBudget (PDB) prevents a Pod from being evicted. This should be avoided in most cases, but mistakes can happen. Currently Node TTL just gets stuck on that node. A better solution would be to ignore nodes that will never be drained.

Hi! Note that the Rook operator handles PDBs dynamically, so that OSD pods (disks) aren't evicted faster than the storage cluster can handle.

Ah, that's an interesting point, I did not know about this. So my understanding is that Rook has a PDB which prohibits any eviction, and then modifies the PDB when it decides it can handle the eviction? So would all nodes with OSD pods be stuck until Rook jumps in and modifies the PDB?

My plan was to implement a dry run of the drain which evaluates whether a node would get stuck in the first place. Do you think this would work with Rook?
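To make the dry-run idea concrete, here is a minimal sketch in Go of what such a check could look like. It is not Node TTL's implementation: `Pod`, `PDB`, and `blockedPods` are hypothetical, simplified stand-ins, and the only real Kubernetes concept borrowed is the PDB's `status.disruptionsAllowed` counter. A real implementation would more likely ask the API server itself, for example through an eviction request with the dry-run option set, rather than re-evaluating budgets locally.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a Kubernetes Pod.
type Pod struct {
	Name   string
	Labels map[string]string
}

// PDB is a simplified, hypothetical stand-in for a PodDisruptionBudget.
// DisruptionsAllowed mirrors the status.disruptionsAllowed field.
type PDB struct {
	Name               string
	Selector           map[string]string
	DisruptionsAllowed int
}

// selects reports whether every key/value pair in selector is present in labels.
// In this model an empty selector matches every pod.
func selects(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// blockedPods returns the names of pods on a node whose eviction would be
// refused right now, because a matching PDB allows zero disruptions.
// Nothing is evicted; this only evaluates the current state.
func blockedPods(pods []Pod, pdbs []PDB) []string {
	var blocked []string
	for _, pod := range pods {
		for _, pdb := range pdbs {
			if selects(pdb.Selector, pod.Labels) && pdb.DisruptionsAllowed <= 0 {
				blocked = append(blocked, pod.Name)
				break
			}
		}
	}
	return blocked
}

func main() {
	pods := []Pod{
		{Name: "web-1", Labels: map[string]string{"app": "web"}},
		{Name: "osd-0", Labels: map[string]string{"app": "rook-ceph-osd"}},
	}
	pdbs := []PDB{
		{Name: "web", Selector: map[string]string{"app": "web"}, DisruptionsAllowed: 1},
		{Name: "osd", Selector: map[string]string{"app": "rook-ceph-osd"}, DisruptionsAllowed: 0},
	}
	fmt.Println(blockedPods(pods, pdbs)) // prints [osd-0]
}
```

Note that a check like this only captures a snapshot. With an operator that rewrites its PDBs over time, as described for Rook above, a node reported as blocked now may become drainable later, which is exactly the open question here.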

> So would all nodes with OSD pods be stuck until Rook jumps in and modifies the PDB?

Yes, that's right!

> My plan was to implement a dry run of the drain which evaluates whether a node would get stuck in the first place. Do you think this would work with Rook?

I don't know about this, actually, and I am unable to test it since I no longer use Rook.

I would suggest implementing a labelSelector for this feature, so that Rook nodes can still be drained (effectively opting out of this feature) while the check is respected for all other nodes.
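As a rough illustration of the labelSelector suggestion, the sketch below shows one way the decision could be wired up in Go. Everything here is an assumption made for illustration: the function names, the equality-only selector syntax, and the node label `example.com/dynamic-pdb=true` are hypothetical and not part of Node TTL or Rook.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSelector parses a simplified equality-only selector such as "k1=v1,k2=v2".
// Real Kubernetes label selectors support more operators than this sketch.
func parseSelector(s string) (map[string]string, error) {
	out := map[string]string{}
	if strings.TrimSpace(s) == "" {
		return out, nil
	}
	for _, part := range strings.Split(s, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(part), "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("invalid selector term %q", part)
		}
		out[k] = v
	}
	return out, nil
}

// matches reports whether labels satisfy the selector. An empty selector
// matches nothing here, so the exemption stays strictly opt-in.
func matches(selector, labels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// shouldIgnore decides whether a node should be skipped. A node is ignored
// only if its drain looks blocked AND it is not exempted by the selector.
// Exempted nodes (for example Rook nodes) are still drained and waited on.
func shouldIgnore(nodeLabels map[string]string, drainBlocked bool, exempt map[string]string) bool {
	if !drainBlocked {
		return false
	}
	return !matches(exempt, nodeLabels)
}

func main() {
	// Hypothetical label marking nodes whose PDBs are managed dynamically.
	exempt, err := parseSelector("example.com/dynamic-pdb=true")
	if err != nil {
		panic(err)
	}
	rookNode := map[string]string{"example.com/dynamic-pdb": "true"}
	plainNode := map[string]string{"kubernetes.io/os": "linux"}

	fmt.Println(shouldIgnore(rookNode, true, exempt))  // prints false
	fmt.Println(shouldIgnore(plainNode, true, exempt)) // prints true
}
```

The design choice in this sketch is that the selector exempts nodes from the "ignore" behavior rather than from draining, which matches the intent that storage nodes keep waiting for their operator to open up the PDB.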