open-cluster-management-io / addon-framework

addon apis

Add ability to stop updating agent components temporarily

JustinKuli opened this issue · comments

Currently, any modifications to the ManifestWork will be reverted by the addon controller. So, the only modifications that can be made to addon agents on managed clusters are the ones specifically allowed by the addon.

In development or troubleshooting scenarios, it might be helpful to deploy a change to one addon on a cluster without updating the addon controller. For example, to add another argument to an addon agent container in order to enable a feature flag.

In stolostron, several controllers have "pause" annotations that can be placed on their resources, which prevent the controller from reconciling them. Something similar here could be useful.
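As a rough illustration of the pause-annotation idea, a controller could check for an annotation before reconciling. This is only a sketch: the annotation key `example.io/pause` and the helper `isPaused` are hypothetical names, not something defined by addon-framework or stolostron.

```go
package main

import "fmt"

// pauseAnnotation is a hypothetical key chosen for this sketch;
// addon-framework does not define it.
const pauseAnnotation = "example.io/pause"

// isPaused reports whether the given annotations ask the controller
// to skip reconciling the resource. Reading from a nil map is safe in Go.
func isPaused(annotations map[string]string) bool {
	return annotations[pauseAnnotation] == "true"
}

func main() {
	fmt.Println(isPaused(map[string]string{pauseAnnotation: "true"}))
	fmt.Println(isPaused(nil))
}
```

A reconcile loop would return early when `isPaused` is true, leaving whatever is currently deployed untouched.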

We were able to implement this in our addon controller by overriding the Manifests function from the HelmAgentAddon, and returning an empty slice. See this commit for the example.
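The shape of that workaround is roughly the following. This sketch uses simplified stand-in types (`Addon`, `Agent`, `helmAgent`) rather than the real addon-framework types, whose `Manifests` method takes different arguments and returns different types; the annotation key is also hypothetical.

```go
package main

import "fmt"

// Addon is a simplified stand-in for the addon resource.
type Addon struct {
	Annotations map[string]string
}

// Agent is a simplified stand-in for the agent interface;
// manifests are represented as plain strings here.
type Agent interface {
	Manifests(addon *Addon) ([]string, error)
}

// helmAgent stands in for the wrapped Helm-based agent.
type helmAgent struct{}

func (helmAgent) Manifests(*Addon) ([]string, error) {
	return []string{"deployment", "serviceaccount"}, nil
}

// pausableAgent embeds an Agent and overrides Manifests.
type pausableAgent struct {
	Agent
}

// Manifests returns an empty slice while the (hypothetical) pause
// annotation is set, and otherwise delegates to the wrapped agent.
func (p pausableAgent) Manifests(addon *Addon) ([]string, error) {
	if addon.Annotations["example.io/pause"] == "true" {
		return []string{}, nil
	}
	return p.Agent.Manifests(addon)
}

func main() {
	agent := pausableAgent{Agent: helmAgent{}}
	paused := &Addon{Annotations: map[string]string{"example.io/pause": "true"}}
	m, _ := agent.Manifests(paused)
	fmt.Println(len(m))
	m, _ = agent.Manifests(&Addon{})
	fmt.Println(len(m))
}
```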

But now I'm worried that the empty slice relies on the current implementation, specifically: https://github.com/open-cluster-management-io/addon-framework/blob/main/pkg/addonmanager/controllers/agentdeploy/controller.go. Right now, an empty slice results in the ManifestWork being ignored, but it would also make sense if an empty slice resulted in deleting the existing ManifestWork.

Having a consistent way of doing this for all addons is appealing so we can define a consistent interface before 2.5 is released and support gets trained to use different pause annotations per addon.

Hmm, that is interesting. Agreed, we should have a consistent way.

/kind feature

It seems that this workaround no longer works due to this change:
0020a78#diff-b47b9c6da1d283ccb32074dd2dc9571541f1cc4cbcbbd7d6679f7f0acba17217R161

I'm thinking about a change to https://github.com/open-cluster-management-io/addon-framework/blob/main/pkg/addonmanager/controllers/agentdeploy/controller.go#L147-L167 so that a nil Manifests result could be handled differently from an initialized slice of length 0. Then nil could function as the pause, as it previously did.
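To make the distinction concrete, here is a minimal sketch of how a nil slice and an empty slice can be told apart in Go. The `decide` function and the action names are hypothetical, and mapping the empty slice to "delete" is just one possible interpretation, not what the controller currently does.

```go
package main

import "fmt"

type action string

const (
	actionPause  action = "pause"  // leave the existing ManifestWork alone
	actionDelete action = "delete" // assumed meaning of an empty slice
	actionApply  action = "apply"  // create or update the ManifestWork
)

// decide distinguishes a nil slice from an initialized empty slice.
// Manifests are represented as plain strings for this sketch.
func decide(manifests []string) action {
	switch {
	case manifests == nil:
		return actionPause
	case len(manifests) == 0:
		return actionDelete
	default:
		return actionApply
	}
}

func main() {
	var none []string // nil slice
	fmt.Println(decide(none))
	fmt.Println(decide([]string{}))
	fmt.Println(decide([]string{"deployment"}))
}
```

One caveat with this design: `len` is 0 for both nil and empty slices, so any intermediate code that only checks the length would lose the distinction.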

Opinions?