kubeshop / botkube

An app that helps you monitor your Kubernetes cluster, debug critical deployments & gives recommendations for standard practices

Home Page: https://botkube.io

Increased resources consumption configuring some sources

bygui86 opened this issue

Before you submit the issue

  • Search open and closed issues for duplicates.
  • Read the contributing guidelines (CONTRIBUTING.md file on root of the repository).
  • Ask in Slack channel helping-hands.

Description

We use Botkube mainly to receive alerts based on specific events.

Migrating from 0.18.0 to 1.8.0, we translated the configuration from old (0.18) to new (1.8) syntax, but we face some issues:

  • We see the same error logs as described in the Slack thread https://botkube.slack.com/archives/C01CR1KS55K/p1708685516373879.
  • Botkube keeps restarting because the livenessProbe fails (sometimes because it is OOMKilled during startup, though this doesn't occur that often), even though we increased resources to requests 1000m/1Gi and limits 3000m/3Gi. We set those values based on the spikes we observed in Grafana. To be honest, they look really high, especially compared to version 0.18! It actually seems that the adoption of the plugin model increased resource usage exponentially 😞
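
The resource settings described above could be expressed in the Helm values roughly as follows. This is only a sketch: it assumes the chart exposes a top-level `resources` key that is applied to the Botkube container, which should be checked against the values.yaml of the deployed chart version.

```yaml
# Assumption: the Botkube Helm chart accepts a top-level `resources` key.
# Values are the ones quoted in this report (requests 1000m/1Gi, limits 3000m/3Gi).
resources:
  requests:
    cpu: 1000m
    memory: 1Gi
  limits:
    cpu: 3000m
    memory: 3Gi
```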

Please note that the above-mentioned issues also happen with the default global_config shipped with the Helm chart, with the following sources enabled: k8s-all-events, k8s-create-events, k8s-err-events, k8s-err-with-logs-events, k8s-recommendation-events
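
For reference, enabling those default sources would look roughly like the fragment below. It is a minimal sketch that assumes the Helm values follow the same `sources.<name>.botkube/kubernetes.enabled` layout as the configuration in this report; each source's `config` and `context` are left at the chart defaults.

```yaml
# Assumption: source names and nesting match the chart's default values.
sources:
  k8s-all-events:
    botkube/kubernetes:
      enabled: true
  k8s-create-events:
    botkube/kubernetes:
      enabled: true
  k8s-err-events:
    botkube/kubernetes:
      enabled: true
  k8s-err-with-logs-events:
    botkube/kubernetes:
      enabled: true
  k8s-recommendation-events:
    botkube/kubernetes:
      enabled: true
```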

Expected behavior

Even with many configured sources, Botkube should consume at most 1000m CPU and 1Gi memory, as in older versions (<= 1.0.0).

Actual behavior

With many configured sources, Botkube consumes up to 3000m CPU and 3Gi memory.

Steps to reproduce

Just deploy Botkube using the following configuration:

global_config.yaml

executors:
  bins-management:
    botkube/exec:
      config:
        templates:
        - ref: github.com/kubeshop/botkube//cmd/executor/exec/templates?ref=v1.8.0
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
      displayName: Exec
      enabled: false
  k8s-default-tools:
    botkube/helm:
      config:
        defaultNamespace: default
        helmCacheDir: /tmp/helm/.cache
        helmConfigDir: /tmp/helm/
        helmDriver: secret
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
      displayName: Helm
      enabled: false
    botkube/kubectl:
      config:
        defaultNamespace: default
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
      displayName: Kubectl
      enabled: true

aliases:
  k:
    command: kubectl
    displayName: Kubectl alias
  kc:
    command: kubectl
    displayName: Kubectl alias
  x:
    command: exec
    displayName: Exec alias

actions:
  describe-created-resource:
    bindings:
      executors:
      - k8s-default-tools
      sources:
      - k8s-create-events
    command: kubectl describe {{ .Event.Kind | lower }}{{ if .Event.Namespace }} -n
      {{ .Event.Namespace }}{{ end }} {{ .Event.Name }}
    displayName: Describe created resource
    enabled: false
  show-logs-on-error:
    bindings:
      executors:
      - k8s-default-tools
      sources:
      - k8s-err-with-logs-events
    command: kubectl logs {{ .Event.Kind | lower }}/{{ .Event.Name }} -n {{ .Event.Namespace
      }}
    displayName: Show logs on error
    enabled: false

settings:
  clusterName: ceiba-cicd
  healthPort: 2114
  lifecycleServer:
    enabled: true
    port: 2113
  log:
    disableColors: false
    formatter: json
    level: info
  persistentConfig:
    runtime:
      configMap:
        annotations: {}
        name: botkube-runtime-config
      fileName: _runtime_state.yaml
    startup:
      configMap:
        annotations: {}
        name: botkube-startup-config
      fileName: _startup_state.yaml
  systemConfigMap:
    name: botkube-system
  upgradeNotifier: true

sources:
  argo:
    displayName: "Argo"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - create
          - update
          - delete
          - error
        namespaces:
          include:
            - argo
            - argo-events
            - argocd
        resources:
        # CRDs
        ## argocd
        - type: argoproj.io/v1alpha1/applications
          event:
            types:
            - error
          updateSetting:
            includeDiff: true
        - type: argoproj.io/v1alpha1/appprojects
          updateSetting:
            includeDiff: true
            fields:
              - spec
        ## argo-events
        - type: argoproj.io/v1alpha1/eventbus
          updateSetting:
            includeDiff: true
            fields:
              - spec.nats
        - type: argoproj.io/v1alpha1/eventsources
          updateSetting:
            includeDiff: true
            fields:
              - spec
        - type: argoproj.io/v1alpha1/sensors
          updateSetting:
            includeDiff: true
            fields:
              - spec
        ## argo-workflows
        - type: argoproj.io/v1alpha1/clusterworkflowtemplates
          updateSetting:
            includeDiff: true
            fields:
              - spec
        - type: argoproj.io/v1alpha1/workfloweventbindings
          updateSetting:
            includeDiff: true
            fields:
              - spec
        - type: argoproj.io/v1alpha1/workflows
          event: # INFO: created by ArgoEvents Sensors for each pipeline
            types:
            - error
          updateSetting:
            includeDiff: true
            fields:
              - spec
        - type: argoproj.io/v1alpha1/workflowtasksets
          updateSetting:
            includeDiff: true
            fields:
              - spec
        - type: argoproj.io/v1alpha1/workflowtemplates
          updateSetting:
            includeDiff: true
            fields:
              - spec
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
  infra:
    displayName: "Infra"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - create
          - update
          - delete
          - error
        namespaces:
          include:
            - argo
            - argo-events
            - argocd
            - auditing
            - cicd-pipelines
            - default
            - ingress
            - kube-node-lease
            - kube-public
            - reloader
            - support
        resources:
          # K8s
          - type: v1/configmaps
            namespaces:
              exclude:
                # WARN: too many notifications because of new features
                - cicd-pipelines
                - cicd-pipelines-ast
                - cicd-pipelines-commons
                - cicd-pipelines-devops
                - cicd-pipelines-die
                - cicd-pipelines-investigations
                - cicd-pipelines-ip
                - cicd-pipelines-research
                - cicd-pipelines-sp
                - cicd-pipelines-tc
                - cicd-pipelines-tee
                - cicd-pipelines-ts
            updateSetting:
              includeDiff: true
              fields:
                - data
          - type: v1/pods
            namespaces:
              exclude:
                # WARN: too many notifications because of pipelines running
                - cicd-pipelines
                - cicd-pipelines-ast
                - cicd-pipelines-commons
                - cicd-pipelines-devops
                - cicd-pipelines-die
                - cicd-pipelines-investigations
                - cicd-pipelines-ip
                - cicd-pipelines-research
                - cicd-pipelines-sp
                - cicd-pipelines-tc
                - cicd-pipelines-tee
                - cicd-pipelines-ts
            updateSetting:
              includeDiff: true
              fields:
                - spec.serviceAccountName
                - spec.securityContext
                - spec.containers[*].image
                - spec.containers[*].resources
                - spec.containers[*].securityContext
          - type: apps/v1/daemonsets
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec
                # WARN: not always defined
                # - spec.template.spec.serviceAccountName
                # - spec.template.spec.securityContext
                # - spec.template.spec.containers[*].image
                # - spec.template.spec.containers[*].resources
                # - spec.template.spec.containers[*].securityContext
          - type: apps/v1/deployments
            updateSetting:
              includeDiff: true
              fields:
                - spec.replicas
                - spec.template.spec
                # WARN: not always defined
                # - spec.template.spec.serviceAccountName
                # - spec.template.spec.securityContext
                # - spec.template.spec.containers[*].image
                # - spec.template.spec.containers[*].resources
                # - spec.template.spec.containers[*].securityContext
          - type: apps/v1/statefulsets
            updateSetting:
              includeDiff: true
              fields:
                - spec.replicas
                - spec.template.spec
                # WARN: not always defined
                # - spec.template.spec.serviceAccountName
                # - spec.template.spec.securityContext
                # - spec.template.spec.containers[*].image
                # - spec.template.spec.containers[*].resources
                # - spec.template.spec.containers[*].securityContext
          - type: batch/v1/cronjobs
            updateSetting:
              includeDiff: true
              fields:
                - spec.suspend
                - spec.schedule
                - spec.jobTemplate.spec.template   # INFO: better stay more general
                # WARN: not always defined
                # - spec.jobTemplate.spec.template.serviceAccountName
                # - spec.jobTemplate.spec.template.securityContext
                # - spec.jobTemplate.spec.template.containers[*].image
                # - spec.jobTemplate.spec.template.containers[*].resources
                # - spec.jobTemplate.spec.template.containers[*].securityContext
          - type: batch/v1/jobs
            updateSetting:
              includeDiff: true
              fields:
                - status
                - status.startTime
                - status.completionTime
                - status.succeeded
                - status.conditions[*].status
          #       - status.conditions[*].type   # WARN: provided in examples but causing error "while finding value from jsonpath: \"status.conditions[*].type\" [..] conditions is not found"
          - type: v1/persistentvolumeclaims
            namespaces:
              exclude:
                # WARN: too many notifications because of pipelines running
                - cicd-pipelines
                - cicd-pipelines-ast
                - cicd-pipelines-commons
                - cicd-pipelines-devops
                - cicd-pipelines-die
                - cicd-pipelines-investigations
                - cicd-pipelines-ip
                - cicd-pipelines-research
                - cicd-pipelines-sp
                - cicd-pipelines-tc
                - cicd-pipelines-tee
                - cicd-pipelines-ts
            updateSetting:
              includeDiff: true
              fields:
                - spec.accessModes
                - spec.resources.requests.storage
          - type: v1/persistentvolumes
            updateSetting:
              includeDiff: true
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
  infra-monitoring:   # INFO: too many `pods` create and delete due to `cronjobs`
    displayName: "Monitoring infra"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - create
          - update
          - delete
          - error
        namespaces:
          include:
            - monitoring
        resources:
          # K8s
          - type: v1/configmaps
            updateSetting:
              includeDiff: true
              fields:
                - data
          # - type: v1/secrets
          #   updateSetting:
          #     includeDiff: true
          #     fields:
          #       - data
          - type: v1/pods
            event:   # WARN: too many `update` notifications
              types:
              - update
              - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.serviceAccountName
                - spec.securityContext
                - spec.containers[*].image
                - spec.containers[*].resources
                - spec.containers[*].securityContext
          - type: apps/v1/daemonsets
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec
          - type: apps/v1/deployments
            updateSetting:
              includeDiff: true
              fields:
                - spec.replicas
                - spec.template.spec
          - type: apps/v1/statefulsets
            updateSetting:
              includeDiff: true
              fields:
                - spec.replicas
                - spec.template.spec
          - type: batch/v1/cronjobs
            updateSetting:
              includeDiff: true
              fields:
                - spec.suspend
                - spec.schedule
                - spec.jobTemplate.spec.template
          - type: v1/persistentvolumeclaims
            updateSetting:
              includeDiff: true
              fields:
                - spec.accessModes
                - spec.resources.requests.storage
          - type: v1/persistentvolumes
            updateSetting:
              includeDiff: true
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
  infra-logging:   # INFO: too many `configmaps` updates
    displayName: "Logging infra"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - create
          - update
          - delete
          - error
        namespaces:
          include:
            - logging
        resources:
          # K8s
          - type: v1/configmaps
            event:   # WARN: too many `update` notifications
              types:
              - create
              - delete
              - error
            updateSetting:
              includeDiff: true
              # fields:   # WARN: missing field `data` for `logging-operator.logging.banzaicloud.io`
              #   - data
          - type: v1/pods
            updateSetting:
              includeDiff: true
              fields:
                - spec.serviceAccountName
                - spec.securityContext
                - spec.containers[*].image
                - spec.containers[*].resources
                - spec.containers[*].securityContext
          - type: apps/v1/daemonsets
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec
          - type: apps/v1/deployments
            updateSetting:
              includeDiff: true
              fields:
                - spec.replicas
                - spec.template.spec
          - type: apps/v1/statefulsets
            updateSetting:
              includeDiff: true
              fields:
                - spec.replicas
                - spec.template.spec
          - type: batch/v1/cronjobs
            updateSetting:
              includeDiff: true
              fields:
                - spec.suspend
                - spec.schedule
                - spec.jobTemplate.spec.template
          - type: v1/persistentvolumeclaims
            updateSetting:
              includeDiff: true
              fields:
                - spec.accessModes
                - spec.resources.requests.storage
          - type: v1/persistentvolumes
            updateSetting:
              includeDiff: true
      # WARN: too many errors like
      #   - failed to garbage collect required amount of images
      #   - {"unmanaged": {"net.core.bpf_jit_harden": "0", "net.netfilter.nf_conntrack_buckets": "131072"}}
      #   - Memory cgroup out of memory: Killed process
      # nodes:
      #   displayName: "Nodes"
      #   botkube/kubernetes:
      #     events:
      #       - all
      #     namespaces:
      #       include:
      #         - ".*"
      #     resources:
      #       # K8s
      #       - name: v1/nodes
      #         updateSetting:
      #           includeDiff: true
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
  namespaces:
    displayName: "Namespaces"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - create
          - update
          - delete
          - error
        namespaces:
          include:
            - ".*"
        resources:
        # K8s
        - type: v1/namespaces
          updateSetting:
            includeDiff: true
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
  security:
    displayName: "Security"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - create
          - update
          - delete
          - error
        namespaces:
          include:
            - ".*"
          exclude:   # WARN: take precedence over `include`
            - kube-system
        resources:
          # K8s
          - type: v1/serviceaccounts
            events:   # WARN: too many `update` notifications
              - create
              - delete
              - error
            updateSetting:
              includeDiff: true
          - type: rbac.authorization.k8s.io/v1/clusterrolebindings
            updateSetting:
              includeDiff: true
              fields:
                - roleRef
                - subjects[*]
                # WARN: not always defined
                # - subjects[*].name
                # - subjects[*].namespace
          - type: rbac.authorization.k8s.io/v1/clusterroles
            updateSetting:
              includeDiff: true
              fields:
                - rules
          - type: rbac.authorization.k8s.io/v1/rolebindings
            updateSetting:
              includeDiff: true
              fields:
                - roleRef
                - subjects[*]
                # WARN: not always defined
                # - subjects[*].name
                # - subjects[*].namespace
          - type: rbac.authorization.k8s.io/v1/roles
            updateSetting:
              includeDiff: true
              fields:
                - rules
          - type: admissionregistration.k8s.io/v1/mutatingwebhookconfigurations
            updateSetting:
              includeDiff: true
          - type: admissionregistration.k8s.io/v1/validatingwebhookconfigurations
            updateSetting:
              includeDiff: true
          - type: networking.k8s.io/v1/ingresses
            updateSetting:
              includeDiff: true
              fields:
                - spec.rules   # array
          - type: networking.k8s.io/v1/networkpolicies
            updateSetting:
              includeDiff: true
              fields:
                - spec.podSelector
                - spec.policyTypes   # array
                # WARN: not always defined
                # - spec.egress   # array
                # - spec.ingress   # array
          # CRDs
          ## gcp/gke
          - type: cloud.google.com/v1/backendconfigs
            updateSetting:
              includeDiff: true
              fields:
                - spec
                # WARN: not always defined
                # - spec.healthCheck
                # - spec.iap
          - type: cloud.google.com/v1beta1/backendconfigs
            updateSetting:
              includeDiff: true
              fields:
                - spec
                # WARN: not always defined
                # - spec.healthCheck
                # - spec.iap
          # - type: networking.gke.io/v1/frontendconfigs
          #   updateSetting:
          #     includeDiff: true
          #     fields:
          #       - spec.redirectToHttps
          - type: networking.gke.io/v1beta1/frontendconfigs   # DEPRECATED
            updateSetting:
              includeDiff: true
              fields:
                - spec.redirectToHttps
          - type: networking.gke.io/v1/managedcertificates
            updateSetting:
              includeDiff: true
              fields:
                - spec.domains   # array
          # - type: networking.gke.io/v1/serviceattachments
          #   updateSetting:
          #     includeDiff: true
          # - type: networking.gke.io/v1/servicenetworkendpointgroups
          #   updateSetting:
          #     includeDiff: true
          # - type: hub.gke.io/v1/memberships
          #   updateSetting:
          #     includeDiff: true
          - type: nodemanagement.gke.io/v1alpha1/updateinfos
            updateSetting:
              includeDiff: true
          ## calico
          - type: crd.projectcalico.org/v1/bgpconfigurations
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/bgppeers
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/blockaffinities
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/clusterinformations
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/felixconfigurations
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/globalnetworkpolicies
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/globalnetworksets
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/hostendpoints
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/ipamblocks
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/ipamconfigs
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/ipamhandles
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/ippools
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/networkpolicies
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/networksets
            updateSetting:
              includeDiff: true
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
  security-kube-system:
    displayName: "kube-system security"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - create
          - update
          - delete
          - error
        namespaces:
          include:
            - kube-system
        resources:
          # K8s
          - type: v1/serviceaccounts
            event:   # WARN: too many `update` notifications
              types:
              - create
              - delete
              - error
            updateSetting:
              includeDiff: true
          - type: rbac.authorization.k8s.io/v1/clusterrolebindings
            updateSetting:
              includeDiff: true
              fields:
                - roleRef
                - subjects[*]
                # WARN: not always defined
                # - subjects[*].name
                # - subjects[*].namespace
          - type: rbac.authorization.k8s.io/v1/clusterroles
            updateSetting:
              includeDiff: true
              fields:
                - rules
          - type: rbac.authorization.k8s.io/v1/rolebindings
            updateSetting:
              includeDiff: true
              fields:
                - roleRef
                - subjects[*]
                # WARN: not always defined
                # - subjects[*].name
                # - subjects[*].namespace
          - type: rbac.authorization.k8s.io/v1/roles
            updateSetting:
              includeDiff: true
              fields:
                - rules
          - type: admissionregistration.k8s.io/v1/mutatingwebhookconfigurations
            updateSetting:
              includeDiff: true
          - type: admissionregistration.k8s.io/v1/validatingwebhookconfigurations
            updateSetting:
              includeDiff: true
          - type: networking.k8s.io/v1/ingresses
            updateSetting:
              includeDiff: true
              fields:
                - spec.rules   # array
          - type: networking.k8s.io/v1/networkpolicies
            updateSetting:
              includeDiff: true
              fields:
                - spec.podSelector
                - spec.policyTypes   # array
          # CRDs
          ## calico
          - type: crd.projectcalico.org/v1/bgpconfigurations
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/bgppeers
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/blockaffinities
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/clusterinformations
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/felixconfigurations
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/globalnetworkpolicies
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/globalnetworksets
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/hostendpoints
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/ipamblocks
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/ipamconfigs
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/ipamhandles
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/ippools
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/networkpolicies
            updateSetting:
              includeDiff: true
          - type: crd.projectcalico.org/v1/networksets
            updateSetting:
              includeDiff: true
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
  error:
    displayName: "Errors"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - error
        namespaces:
          include:
            - ".*"
          exclude:   # WARN: take precedence over `include`
            - kube-system
        resources:
          # K8s
          - type: v1/configmaps
          - type: v1/endpoints
          - type: v1/limitranges
          - type: v1/persistentvolumeclaims
          - type: v1/persistentvolumes
          # WARN: too many errors like "v1/pods error" (various reasons)
          # - type: v1/pods
          #   namespaces:
          #     include:
          #       - ".*"
          #     exclude:   # INFO: excluding namespaces in which cronjobs/jobs run, already covered by `*-jobs` sources
          #       - cicd-pipelines
          - type: v1/resourcequotas
          # - type: v1/secrets
          - type: v1/serviceaccounts
          - type: v1/services
          - type: apps/v1/daemonsets
          - type: apps/v1/deployments
          - type: apps/v1/statefulsets
          - type: batch/v1/cronjobs
          # - type: batch/v1/jobs
          - type: policy/v1/poddisruptionbudgets
          - type: apiextensions.k8s.io/v1/customresourcedefinitions
          - type: autoscaling/v1/horizontalpodautoscalers
          - type: rbac.authorization.k8s.io/v1/clusterrolebindings
          - type: rbac.authorization.k8s.io/v1/clusterroles
          - type: rbac.authorization.k8s.io/v1/rolebindings
          - type: rbac.authorization.k8s.io/v1/roles
          - type: admissionregistration.k8s.io/v1/mutatingwebhookconfigurations
          - type: admissionregistration.k8s.io/v1/validatingwebhookconfigurations
          - type: networking.k8s.io/v1/ingresses
          - type: networking.k8s.io/v1/networkpolicies
          - type: scheduling.k8s.io/v1/priorityclasses
          - type: storage.k8s.io/v1/storageclasses
          - type: snapshot.storage.k8s.io/v1/volumesnapshotclasses
          - type: snapshot.storage.k8s.io/v1/volumesnapshotcontents
          - type: snapshot.storage.k8s.io/v1/volumesnapshots
          # CRDs
          ## gcp/gke
          - type: cloud.google.com/v1/backendconfigs
          - type: cloud.google.com/v1beta1/backendconfigs
          # - type: networking.gke.io/v1/frontendconfigs
          - type: networking.gke.io/v1/managedcertificates
          # - type: networking.gke.io/v1/serviceattachments
          # - type: networking.gke.io/v1/servicenetworkendpointgroups
          # - type: hub.gke.io/v1/memberships
          - type: nodemanagement.gke.io/v1alpha1/updateinfos
          ## calico
          - type: crd.projectcalico.org/v1/bgpconfigurations
          - type: crd.projectcalico.org/v1/bgppeers
          - type: crd.projectcalico.org/v1/blockaffinities
          - type: crd.projectcalico.org/v1/clusterinformations
          - type: crd.projectcalico.org/v1/felixconfigurations
          - type: crd.projectcalico.org/v1/globalnetworkpolicies
          - type: crd.projectcalico.org/v1/globalnetworksets
          - type: crd.projectcalico.org/v1/hostendpoints
          - type: crd.projectcalico.org/v1/ipamblocks
          - type: crd.projectcalico.org/v1/ipamconfigs
          - type: crd.projectcalico.org/v1/ipamhandles
          - type: crd.projectcalico.org/v1/ippools
          - type: crd.projectcalico.org/v1/networkpolicies
          - type: crd.projectcalico.org/v1/networksets
          ## argocd
          - type: argoproj.io/v1alpha1/applications
            namespaces:
              exclude:
                - argocd
          - type: argoproj.io/v1alpha1/appprojects
          ## argo-events
          - type: argoproj.io/v1alpha1/eventbus
          - type: argoproj.io/v1alpha1/eventsources
          - type: argoproj.io/v1alpha1/sensors
          ## argo-workflows
          - type: argoproj.io/v1alpha1/clusterworkflowtemplates
          - type: argoproj.io/v1alpha1/workfloweventbindings
          - type: argoproj.io/v1alpha1/workflows
            namespaces:
              exclude:
                - cicd-pipelines   # WARN: too many notifications because of pipelines running
          - type: argoproj.io/v1alpha1/workflowtasksets
          - type: argoproj.io/v1alpha1/workflowtemplates
          ## monitoring - DEPRECATED: replace with VictoriaMetrics
          - type: monitoring.coreos.com/v1/alertmanagerconfigs
          - type: monitoring.coreos.com/v1/alertmanagers
          - type: monitoring.coreos.com/v1/podmonitors
          - type: monitoring.coreos.com/v1/probes
          - type: monitoring.coreos.com/v1/prometheuses
          - type: monitoring.coreos.com/v1/prometheusrules
          - type: monitoring.coreos.com/v1/servicemonitors
          - type: monitoring.coreos.com/v1/thanosrulers
          ## logging
          - type: logging.banzaicloud.io/v1beta1/clusterflows
          - type: logging.banzaicloud.io/v1beta1/clusteroutputs
          - type: logging.banzaicloud.io/v1beta1/flows
          - type: logging.banzaicloud.io/v1beta1/loggings
          - type: logging.banzaicloud.io/v1beta1/outputs
          - type: logging-extensions.banzaicloud.io/v1alpha1/eventtailers
          - type: logging-extensions.banzaicloud.io/v1alpha1/hosttailers
          ## traefik
          - type: traefik.containo.us/v1alpha1/ingressroutes
          - type: traefik.containo.us/v1alpha1/ingressroutetcps
          - type: traefik.containo.us/v1alpha1/ingressrouteudps
          - type: traefik.containo.us/v1alpha1/middlewares
          - type: traefik.containo.us/v1alpha1/middlewaretcps
          - type: traefik.containo.us/v1alpha1/serverstransports
          - type: traefik.containo.us/v1alpha1/tlsoptions
          - type: traefik.containo.us/v1alpha1/tlsstores
          - type: traefik.containo.us/v1alpha1/traefikservices
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static
  error-kube-system:
    displayName: "kube-system errors"
    botkube/kubernetes:
      enabled: true
      config:
        event:
          types:
          - error
        namespaces:
          include:
            - kube-system
        resources:
          # K8s
          - type: v1/limitranges
          - type: v1/resourcequotas
          - type: v1/serviceaccounts
          - type: policy/v1/poddisruptionbudgets
          - type: apiextensions.k8s.io/v1/customresourcedefinitions
          - type: rbac.authorization.k8s.io/v1/clusterrolebindings
          - type: rbac.authorization.k8s.io/v1/clusterroles
          - type: rbac.authorization.k8s.io/v1/rolebindings
          - type: rbac.authorization.k8s.io/v1/roles
          - type: admissionregistration.k8s.io/v1/mutatingwebhookconfigurations
          - type: admissionregistration.k8s.io/v1/validatingwebhookconfigurations
          - type: networking.k8s.io/v1/ingresses
          - type: networking.k8s.io/v1/networkpolicies
          - type: scheduling.k8s.io/v1/priorityclasses
          # CRDs
          ## gcp/gke
          - type: cloud.google.com/v1/backendconfigs
          - type: cloud.google.com/v1beta1/backendconfigs
          # - type: networking.gke.io/v1/frontendconfigs
          - type: networking.gke.io/v1/managedcertificates
          # - type: networking.gke.io/v1/serviceattachments
          # - type: networking.gke.io/v1/servicenetworkendpointgroups
          # - type: hub.gke.io/v1/memberships
          - type: nodemanagement.gke.io/v1alpha1/updateinfos
          ## calico
          - type: crd.projectcalico.org/v1/bgpconfigurations
          - type: crd.projectcalico.org/v1/bgppeers
          - type: crd.projectcalico.org/v1/blockaffinities
          - type: crd.projectcalico.org/v1/clusterinformations
          - type: crd.projectcalico.org/v1/felixconfigurations
          - type: crd.projectcalico.org/v1/globalnetworkpolicies
          - type: crd.projectcalico.org/v1/globalnetworksets
          - type: crd.projectcalico.org/v1/hostendpoints
          - type: crd.projectcalico.org/v1/ipamblocks
          - type: crd.projectcalico.org/v1/ipamconfigs
          - type: crd.projectcalico.org/v1/ipamhandles
          - type: crd.projectcalico.org/v1/ippools
          - type: crd.projectcalico.org/v1/networkpolicies
          - type: crd.projectcalico.org/v1/networksets
      context:
        rbac:
          group:
            prefix: ""
            static:
              values:
              - botkube-plugins-default
            type: Static

configWatcher:
  enabled: true
  inCluster:
    informerResyncPeriod: 10m

plugins:
  cacheDir: /tmp
  repositories:
    botkube:
      url: https://github.com/kubeshop/botkube/releases/download/v1.8.0/plugins-index.yaml
  incomingWebhook:
    enabled: true
    # port and baseInClusterURL are set via envs
  restartPolicy:
    type: DeactivatePlugin
    threshold: 10
  healthCheckInterval: 10s

analytics:
  disable: false

Slack threads

Versions

Kubernetes: v1.28.4-gke.1083000
Platform: GKE
Botkube: 1.8.0

@mszostok posted this answer in Slack thread https://botkube.slack.com/archives/C01CR1KS55K/p1709121554586339

Thank you for upgrading! I've reviewed your configuration, and I agree with your observation about its similarity to the previous report.

I think I know what is causing this. The common factor is that you have many dedicated k8s source configurations, and you are also watching many different resources. It's great to see such usage 🎉

I believe the root cause is directly connected with the changed Kubernetes source implementation. In our current approach, a new shared informer is created for each Kubernetes source (see: [source.go#L122]). Unlike the previous method, which had a single informer built from all configurations merged together, the current approach results in separate in-memory informers for each source configuration, so with N sources the overhead is multiplied N times.

To address this, I think that we should refactor our approach and introduce a global informer, similar to the one we had in v0.18. This should help resolve the issue.
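To illustrate the difference described above, here is a rough Python sketch of the arithmetic only. It is not Botkube's actual Go implementation: the function names are made up for illustration, the `all-namespaces-errors` source name is hypothetical, and the resource lists are heavily trimmed stand-ins for the configuration in this issue. It assumes one watch per distinct resource type, which is a simplification.

```python
from collections import defaultdict

# Hypothetical, trimmed stand-in for the `sources` section of the config:
# source name -> resource types that source watches.
SOURCES = {
    "all-namespaces-errors": [
        "v1/serviceaccounts",
        "networking.k8s.io/v1/ingresses",
        "crd.projectcalico.org/v1/ippools",
        "argoproj.io/v1alpha1/workflows",
    ],
    "error-kube-system": [
        "v1/serviceaccounts",
        "networking.k8s.io/v1/ingresses",
        "crd.projectcalico.org/v1/ippools",
    ],
}


def watches_per_source(sources):
    """Per-source model: each source gets its own watch for every resource
    type it lists, so types shared between sources are counted repeatedly."""
    return sum(len(set(types)) for types in sources.values())


def merged_watches(sources):
    """Merged model: one shared watch per distinct resource type, mapped to
    the sources that should receive its events."""
    subscribers = defaultdict(set)
    for name, types in sources.items():
        for resource_type in types:
            subscribers[resource_type].add(name)
    return {rt: sorted(names) for rt, names in subscribers.items()}


if __name__ == "__main__":
    merged = merged_watches(SOURCES)
    print("per-source watches:", watches_per_source(SOURCES))
    print("merged watches:", len(merged))
    for resource_type, names in sorted(merged.items()):
        print(f"  {resource_type} -> {names}")
```

In the merged model, source-specific filters (event types, namespaces) would still have to be applied when an event is routed to each subscribed source; the sketch leaves that part out.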

Could you please create a GitHub issue for this? We'll refine it further and try to include it in our next planning 👍 However, since the Kubernetes source code is fully open source, if you find the time to experiment with refactoring and testing this approach, it would be amazing! 🙂 If not, we will try to handle it in our team and keep you posted 👍

Hi @bygui86, good news! The issue has been resolved in #1425 which is now merged to the main branch.

The Botkube 1.10 release is planned for next week but, in the meantime, could you please test Botkube from the latest main to confirm your issue is gone?

The following command should do the trick:

botkube install --repo https://storage.googleapis.com/botkube-latest-main-charts --version 0.0.0-2cb5eacc # add your own overrides here

If you install the Helm chart another way, here are the details:

  • Helm repo: https://storage.googleapis.com/botkube-latest-main-charts
  • Helm chart version: 0.0.0-2cb5eacc

However, remember to keep the following overrides:

image:
  pullPolicy: IfNotPresent
  registry: ghcr.io
  repository: kubeshop/botkube
  tag: 0.0.0-2cb5eacc # use the latest Botkube Agent image

plugins:
  repositories:
    botkube:
      url: https://storage.googleapis.com/botkube-plugins-latest/plugins-index.yaml # use the latest plugins

Hope that will fix your issue and you'll be able to upgrade to the latest Botkube. Cheers!

@pkosiec thanks a lot for your effort! The memory consumption in the MR looks really promising!
Unfortunately, I don't have time to test it in the next few days :( I will try to upgrade to v1.10.x when I get a chance :)
I'll keep you posted!