stakater / Reloader

A Kubernetes controller that watches for changes in ConfigMaps and Secrets and performs rolling upgrades on Pods via their associated Deployments, StatefulSets, DaemonSets and DeploymentConfigs – [✩Star] if you're using it!

Home Page: https://docs.stakater.com/reloader/

[BUG] Not all Pods get restarted after Secret change

teimyBr opened this issue · comments

commented

Describe the bug
We have about 20 Deployments in our cluster.
All of them carry the annotation reloader.stakater.com/auto: true on the Deployment.

These are all AKHQ Deployments with Kafka Secrets.

Every 5 days the Secrets are changed (all at the same time).
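For reference, the setup described above amounts to something like the following (a minimal sketch; only the annotation and the Secret name are taken from the report, the image and labels are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: akhq
  namespace: test1
  annotations:
    # Reloader watches every ConfigMap/Secret referenced by this Deployment
    reloader.stakater.com/auto: "true"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: akhq
  template:
    metadata:
      labels:
        app: akhq
    spec:
      containers:
      - name: akhq
        image: tchiotludo/akhq  # illustrative
        volumeMounts:
        - name: truststore
          mountPath: /certs
      volumes:
      - name: truststore
        secret:
          secretName: root-ca-cert-truststore  # rotated every 5 days
```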

Sometimes only 17 or 18 of the 20 get restarted by Reloader.

level=info msg="Changes detected in 'root-ca-cert-truststore' of type 'SECRET' in namespace 'test1', Updated 'akhq' of type 'Deployment' in namespace 'test1'"

The log looks like this, without any error.

To Reproduce
How can I debug this further?

Expected behavior
All pods get restarted.

Environment

  • Operator Version: v1.0.108
  • Kubernetes/OpenShift Version: 1.28.6 Plain Vanilla Kubernetes
commented

More: Reloader logs level=info msg="Changes detected in 'root-ca-cert-truststore' of type 'SECRET' in namespace 'test1', Updated 'akhq' of type 'Deployment' in namespace 'test1'"

But sometimes a pod is not restarted.

So 17 or 18 are restarted and 1-2 are NOT,

but the log says all 20 were restarted.

There is no error log indicating that something failed.

Is the restarting done in a fire-and-forget approach, so that updates are lost if the API server has issues, or is there some ACK/retry involved?

Are the pods which are not restarted the same ones every time, or random?

commented

Very random. We have watched this over the last 3-4 weeks; sometimes it is deployment 17, the next time deployment 3.
Never the same one.

Tried it with the latest version as well; the issue is still there.

Facing the same issue.
Sometimes not all deployments get rolled out. We have 56 deployments in total.

Any more information about what values are being used to install Reloader?
And are all deployments managed by a CD tool? If yes, is there a possibility of the CD tool and Reloader clashing when updating Deployments?

+1 Facing the same issue.

commented

We are using the Reloader helm chart.

Chart.yaml

apiVersion: v2
name: reloader
version: 0.0.0
dependencies:
- name: reloader
  version: 1.0.114
  repository: https://stakater.github.io/stakater-charts

values.yaml

reloader:
  reloader:
    deployment:
      resources:
        limits:
          cpu: 500m
          memory: 256Mi
        requests:
          cpu: 10m
          memory: 128Mi

I will try to replicate the load and the issue. Meanwhile, can you tell us how the apps are being deployed? Is there any CD tool in the picture?

commented

Yes, we are using Argo CD to install the Reloader helm chart.

I was asking about the applications which are reloaded 😃 Assuming they are managed by it too, since you are using Argo CD anyway.

Have you tried switching the reloadStrategy to annotations? Ref: https://github.com/stakater/Reloader/blob/master/README.md#reload-strategies
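With the wrapper chart shown above, switching the strategy would look roughly like this (a sketch; the exact nesting of the reloadStrategy key depends on the chart version, so check the chart's own values.yaml):

```yaml
# values.yaml (sketch; "reloader" appears twice because the stakater
# chart is pulled in as a dependency and has its own top-level key)
reloader:
  reloader:
    reloadStrategy: annotations  # default | env-vars | annotations
```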

commented

reloadStrategy: default # Set to default, env-vars or annotations

You mean to set this from default to annotations?

What exactly is the difference?

On the AKHQ pods we are already using

Annotations:
  reloader.stakater.com/auto: true 

and this works, but sometimes, randomly, not all get rolled. In my opinion this could be a Kubernetes client problem in how the restart is performed by Reloader, but I didn't find where this is done.

You mean to set this from default to annotations?

yes

What exactly is the difference?

That, instead of updating the env field of containers in Deployments to trigger reloading, it will update the spec.template.metadata.annotations field. This is normally better when a CD tool is in the picture.
I am more inclined to believe that the CD tool and Reloader are clashing.
The other theory, requests getting lost by the API server, is something I haven't seen so far in my limited experience. Because if that happened, most of the cluster wouldn't work.
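Concretely, the two strategies patch different parts of the pod template. The snippets below sketch the idea; the env var and annotation names are illustrative, not exact Reloader output:

```yaml
# env-vars strategy (the default): Reloader injects/updates a container
# env var carrying a hash of the changed Secret. A CD tool such as
# Argo CD can see this as drift and revert it, racing with Reloader.
spec:
  template:
    spec:
      containers:
      - name: akhq
        env:
        - name: STAKATER_ROOT_CA_CERT_TRUSTSTORE_SECRET  # illustrative
          value: "hash of the Secret's data"
---
# annotations strategy: only a pod-template annotation changes, which
# CD tools typically tolerate or ignore during diffing.
spec:
  template:
    metadata:
      annotations:
        reloader.stakater.com/last-reloaded-from: "..."  # illustrative
```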

I will try to replicate the workload over our clusters and test this soon.

commented

Will also test setting the annotations strategy.

If Reloader has API server issues, would this be logged, and if yes, at which log level?

If Reloader has API server issues, would this be logged, and if yes, at which log level?

Since you are getting proper update logs for all of the deployments, I'd like to believe that it's not an API server issue. If it were, Reloader should log errors whenever it fails to change a deployment's state.

commented

reloadStrategy: annotations fixed the problem.

I would like to add a line to the documentation: when using Argo CD, set this flag.