Increased resource consumption when configuring multiple sources
bygui86 opened this issue
Before you submit the issue
- Search open and closed issues for duplicates.
- Read the contributing guidelines (CONTRIBUTING.md in the repository root).
- Ask in the `helping-hands` Slack channel.
Description
We use Botkube mainly to receive alerts based on specific events.
While migrating from 0.18.0 to 1.8.0 we translated the configuration from the old (0.18) to the new (1.8) syntax, but we face some issues:
- We see error logs like those described in the Slack thread https://botkube.slack.com/archives/C01CR1KS55K/p1708685516373879.
- Botkube keeps restarting because its livenessProbe fails (sometimes it is OOMKilled during startup, though that doesn't occur often), even though we increased resources to requests 1000m/1Gi and limits 3000m/3Gi. We set those values based on spikes we observed in Grafana. To be honest they look really high, especially compared to version 0.18! It seems the adoption of the plugin model increased resource usage dramatically.
Please note that the above issues also happen with the default global_config shipped with the Helm chart and the following sources enabled: k8s-all-events, k8s-create-events, k8s-err-events, k8s-err-with-logs-events, k8s-recommendation-events
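For reference, the requests/limits mentioned above correspond to a Helm values fragment like the following (field names assume the standard chart `resources` layout):

```yaml
resources:
  requests:
    cpu: 1000m
    memory: 1Gi
  limits:
    cpu: 3000m
    memory: 3Gi
```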
Expected behavior
Even with many configured sources, Botkube should consume at most 1000m CPU and 1Gi of memory, as in older versions (<= 1.0.0).
Actual behavior
With many configured sources, Botkube consumes up to 3000m CPU and 3Gi of memory.
Steps to reproduce
Deploy Botkube using the following global_config.yaml:
executors:
bins-management:
botkube/exec:
config:
templates:
- ref: github.com/kubeshop/botkube//cmd/executor/exec/templates?ref=v1.8.0
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
displayName: Exec
enabled: false
k8s-default-tools:
botkube/helm:
config:
defaultNamespace: default
helmCacheDir: /tmp/helm/.cache
helmConfigDir: /tmp/helm/
helmDriver: secret
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
displayName: Helm
enabled: false
botkube/kubectl:
config:
defaultNamespace: default
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
displayName: Kubectl
enabled: true
aliases:
k:
command: kubectl
displayName: Kubectl alias
kc:
command: kubectl
displayName: Kubectl alias
x:
command: exec
displayName: Exec alias
actions:
describe-created-resource:
bindings:
executors:
- k8s-default-tools
sources:
- k8s-create-events
command: kubectl describe {{ .Event.Kind | lower }}{{ if .Event.Namespace }} -n
{{ .Event.Namespace }}{{ end }} {{ .Event.Name }}
displayName: Describe created resource
enabled: false
show-logs-on-error:
bindings:
executors:
- k8s-default-tools
sources:
- k8s-err-with-logs-events
command: kubectl logs {{ .Event.Kind | lower }}/{{ .Event.Name }} -n {{ .Event.Namespace
}}
displayName: Show logs on error
enabled: false
settings:
clusterName: ceiba-cicd
healthPort: 2114
lifecycleServer:
enabled: true
port: 2113
log:
disableColors: false
formatter: json
level: info
persistentConfig:
runtime:
configMap:
annotations: {}
name: botkube-runtime-config
fileName: _runtime_state.yaml
startup:
configMap:
annotations: {}
name: botkube-startup-config
fileName: _startup_state.yaml
systemConfigMap:
name: botkube-system
upgradeNotifier: true
sources:
argo:
displayName: "Argo"
botkube/kubernetes:
enabled: true
config:
event:
types:
- create
- update
- delete
- error
namespaces:
include:
- argo
- argo-events
- argocd
resources:
# CRDs
## argocd
- type: argoproj.io/v1alpha1/applications
event:
types:
- error
updateSetting:
includeDiff: true
- type: argoproj.io/v1alpha1/appprojects
updateSetting:
includeDiff: true
fields:
- spec
## argo-events
- type: argoproj.io/v1alpha1/eventbus
updateSetting:
includeDiff: true
fields:
- spec.nats
- type: argoproj.io/v1alpha1/eventsources
updateSetting:
includeDiff: true
fields:
- spec
- type: argoproj.io/v1alpha1/sensors
updateSetting:
includeDiff: true
fields:
- spec
## argo-workflows
- type: argoproj.io/v1alpha1/clusterworkflowtemplates
updateSetting:
includeDiff: true
fields:
- spec
- type: argoproj.io/v1alpha1/workfloweventbindings
updateSetting:
includeDiff: true
fields:
- spec
- type: argoproj.io/v1alpha1/workflows
event: # INFO: created by ArgoEvents Sensors for each pipeline
types:
- error
updateSetting:
includeDiff: true
fields:
- spec
- type: argoproj.io/v1alpha1/workflowtasksets
updateSetting:
includeDiff: true
fields:
- spec
- type: argoproj.io/v1alpha1/workflowtemplates
updateSetting:
includeDiff: true
fields:
- spec
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
infra:
displayName: "Infra"
botkube/kubernetes:
enabled: true
config:
event:
types:
- create
- update
- delete
- error
namespaces:
include:
- argo
- argo-events
- argocd
- auditing
- cicd-pipelines
- default
- ingress
- kube-node-lease
- kube-public
- reloader
- support
resources:
# K8s
- type: v1/configmaps
namespaces:
exclude:
# WARN: too many notifications because of new features
- cicd-pipelines
- cicd-pipelines-ast
- cicd-pipelines-commons
- cicd-pipelines-devops
- cicd-pipelines-die
- cicd-pipelines-investigations
- cicd-pipelines-ip
- cicd-pipelines-research
- cicd-pipelines-sp
- cicd-pipelines-tc
- cicd-pipelines-tee
- cicd-pipelines-ts
updateSetting:
includeDiff: true
fields:
- data
- type: v1/pods
namespaces:
exclude:
# WARN: too many notifications because of pipelines running
- cicd-pipelines
- cicd-pipelines-ast
- cicd-pipelines-commons
- cicd-pipelines-devops
- cicd-pipelines-die
- cicd-pipelines-investigations
- cicd-pipelines-ip
- cicd-pipelines-research
- cicd-pipelines-sp
- cicd-pipelines-tc
- cicd-pipelines-tee
- cicd-pipelines-ts
updateSetting:
includeDiff: true
fields:
- spec.serviceAccountName
- spec.securityContext
- spec.containers[*].image
- spec.containers[*].resources
- spec.containers[*].securityContext
- type: apps/v1/daemonsets
updateSetting:
includeDiff: true
fields:
- spec.template.spec
# WARN: not always defined
# - spec.template.spec.serviceAccountName
# - spec.template.spec.securityContext
# - spec.template.spec.containers[*].image
# - spec.template.spec.containers[*].resources
# - spec.template.spec.containers[*].securityContext
- type: apps/v1/deployments
updateSetting:
includeDiff: true
fields:
- spec.replicas
- spec.template.spec
# WARN: not always defined
# - spec.template.spec.serviceAccountName
# - spec.template.spec.securityContext
# - spec.template.spec.containers[*].image
# - spec.template.spec.containers[*].resources
# - spec.template.spec.containers[*].securityContext
- type: apps/v1/statefulsets
updateSetting:
includeDiff: true
fields:
- spec.replicas
- spec.template.spec
# WARN: not always defined
# - spec.template.spec.serviceAccountName
# - spec.template.spec.securityContext
# - spec.template.spec.containers[*].image
# - spec.template.spec.containers[*].resources
# - spec.template.spec.containers[*].securityContext
- type: batch/v1/cronjobs
updateSetting:
includeDiff: true
fields:
- spec.suspend
- spec.schedule
- spec.jobTemplate.spec.template # INFO: better stay more general
# WARN: not always defined
# - spec.jobTemplate.spec.template.serviceAccountName
# - spec.jobTemplate.spec.template.securityContext
# - spec.jobTemplate.spec.template.containers[*].image
# - spec.jobTemplate.spec.template.containers[*].resources
# - spec.jobTemplate.spec.template.containers[*].securityContext
- type: batch/v1/jobs
updateSetting:
includeDiff: true
fields:
- status
- status.startTime
- status.completionTime
- status.succeeded
- status.conditions[*].status
# - status.conditions[*].type # WARN: provided in examples but causing error "while finding value from jsonpath: \"status.conditions[*].type\" [..] conditions is not found"
- type: v1/persistentvolumeclaims
namespaces:
exclude:
# WARN: too many notifications because of pipelines running
- cicd-pipelines
- cicd-pipelines-ast
- cicd-pipelines-commons
- cicd-pipelines-devops
- cicd-pipelines-die
- cicd-pipelines-investigations
- cicd-pipelines-ip
- cicd-pipelines-research
- cicd-pipelines-sp
- cicd-pipelines-tc
- cicd-pipelines-tee
- cicd-pipelines-ts
updateSetting:
includeDiff: true
fields:
- spec.accessModes
- spec.resources.requests.storage
- type: v1/persistentvolumes
updateSetting:
includeDiff: true
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
infra-monitoring: # INFO: too many `pods` create and delete due to `cronjobs`
displayName: "Monitoring infra"
botkube/kubernetes:
enabled: true
config:
event:
types:
- create
- update
- delete
- error
namespaces:
include:
- monitoring
resources:
# K8s
- type: v1/configmaps
updateSetting:
includeDiff: true
fields:
- data
# - type: v1/secrets
# updateSetting:
# includeDiff: true
# fields:
# - data
- type: v1/pods
event: # WARN: too many `update` notifications
types:
- update
- error
updateSetting:
includeDiff: true
fields:
- spec.serviceAccountName
- spec.securityContext
- spec.containers[*].image
- spec.containers[*].resources
- spec.containers[*].securityContext
- type: apps/v1/daemonsets
updateSetting:
includeDiff: true
fields:
- spec.template.spec
- type: apps/v1/deployments
updateSetting:
includeDiff: true
fields:
- spec.replicas
- spec.template.spec
- type: apps/v1/statefulsets
updateSetting:
includeDiff: true
fields:
- spec.replicas
- spec.template.spec
- type: batch/v1/cronjobs
updateSetting:
includeDiff: true
fields:
- spec.suspend
- spec.schedule
- spec.jobTemplate.spec.template
- type: v1/persistentvolumeclaims
updateSetting:
includeDiff: true
fields:
- spec.accessModes
- spec.resources.requests.storage
- type: v1/persistentvolumes
updateSetting:
includeDiff: true
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
infra-logging: # INFO: too many `configmaps` updates
displayName: "Logging infra"
botkube/kubernetes:
enabled: true
config:
event:
types:
- create
- update
- delete
- error
namespaces:
include:
- logging
resources:
# K8s
- type: v1/configmaps
event: # WARN: too many `update` notifications
types:
- create
- delete
- error
updateSetting:
includeDiff: true
# fields: # WARN: missing field `data` for `logging-operator.logging.banzaicloud.io`
# - data
- type: v1/pods
updateSetting:
includeDiff: true
fields:
- spec.serviceAccountName
- spec.securityContext
- spec.containers[*].image
- spec.containers[*].resources
- spec.containers[*].securityContext
- type: apps/v1/daemonsets
updateSetting:
includeDiff: true
fields:
- spec.template.spec
- type: apps/v1/deployments
updateSetting:
includeDiff: true
fields:
- spec.replicas
- spec.template.spec
- type: apps/v1/statefulsets
updateSetting:
includeDiff: true
fields:
- spec.replicas
- spec.template.spec
- type: batch/v1/cronjobs
updateSetting:
includeDiff: true
fields:
- spec.suspend
- spec.schedule
- spec.jobTemplate.spec.template
- type: v1/persistentvolumeclaims
updateSetting:
includeDiff: true
fields:
- spec.accessModes
- spec.resources.requests.storage
- type: v1/persistentvolumes
updateSetting:
includeDiff: true
# WARN: too many errors like
# - failed to garbage collect required amount of images
# - {"unmanaged": {"net.core.bpf_jit_harden": "0", "net.netfilter.nf_conntrack_buckets": "131072"}}
# - Memory cgroup out of memory: Killed process
# nodes:
# displayName: "Nodes"
# botkube/kubernetes:
# events:
# - all
# namespaces:
# include:
# - ".*"
# resources:
# # K8s
# - name: v1/nodes
# updateSetting:
# includeDiff: true
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
namespaces:
displayName: "Namespaces"
botkube/kubernetes:
enabled: true
config:
event:
types:
- create
- update
- delete
- error
namespaces:
include:
- ".*"
resources:
# K8s
- type: v1/namespaces
updateSetting:
includeDiff: true
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
security:
displayName: "Security"
botkube/kubernetes:
enabled: true
config:
event:
types:
- create
- update
- delete
- error
namespaces:
include:
- ".*"
exclude: # WARN: takes precedence over `include`
- kube-system
resources:
# K8s
- type: v1/serviceaccounts
events: # WARN: too many `update` notifications
- create
- delete
- error
updateSetting:
includeDiff: true
- type: rbac.authorization.k8s.io/v1/clusterrolebindings
updateSetting:
includeDiff: true
fields:
- roleRef
- subjects[*]
# WARN: not always defined
# - subjects[*].name
# - subjects[*].namespace
- type: rbac.authorization.k8s.io/v1/clusterroles
updateSetting:
includeDiff: true
fields:
- rules
- type: rbac.authorization.k8s.io/v1/rolebindings
updateSetting:
includeDiff: true
fields:
- roleRef
- subjects[*]
# WARN: not always defined
# - subjects[*].name
# - subjects[*].namespace
- type: rbac.authorization.k8s.io/v1/roles
updateSetting:
includeDiff: true
fields:
- rules
- type: admissionregistration.k8s.io/v1/mutatingwebhookconfigurations
updateSetting:
includeDiff: true
- type: admissionregistration.k8s.io/v1/validatingwebhookconfigurations
updateSetting:
includeDiff: true
- type: networking.k8s.io/v1/ingresses
updateSetting:
includeDiff: true
fields:
- spec.rules # array
- type: networking.k8s.io/v1/networkpolicies
updateSetting:
includeDiff: true
fields:
- spec.podSelector
- spec.policyTypes # array
# WARN: not always defined
# - spec.egress # array
# - spec.ingress # array
# CRDs
## gcp/gke
- type: cloud.google.com/v1/backendconfigs
updateSetting:
includeDiff: true
fields:
- spec
# WARN: not always defined
# - spec.healthCheck
# - spec.iap
- type: cloud.google.com/v1beta1/backendconfigs
updateSetting:
includeDiff: true
fields:
- spec
# WARN: not always defined
# - spec.healthCheck
# - spec.iap
# - type: networking.gke.io/v1/frontendconfigs
# updateSetting:
# includeDiff: true
# fields:
# - spec.redirectToHttps
- type: networking.gke.io/v1beta1/frontendconfigs # DEPRECATED
updateSetting:
includeDiff: true
fields:
- spec.redirectToHttps
- type: networking.gke.io/v1/managedcertificates
updateSetting:
includeDiff: true
fields:
- spec.domains # array
# - type: networking.gke.io/v1/serviceattachments
# updateSetting:
# includeDiff: true
# - type: networking.gke.io/v1/servicenetworkendpointgroups
# updateSetting:
# includeDiff: true
# - type: hub.gke.io/v1/memberships
# updateSetting:
# includeDiff: true
- type: nodemanagement.gke.io/v1alpha1/updateinfos
updateSetting:
includeDiff: true
## calico
- type: crd.projectcalico.org/v1/bgpconfigurations
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/bgppeers
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/blockaffinities
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/clusterinformations
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/felixconfigurations
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/globalnetworkpolicies
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/globalnetworksets
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/hostendpoints
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/ipamblocks
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/ipamconfigs
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/ipamhandles
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/ippools
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/networkpolicies
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/networksets
updateSetting:
includeDiff: true
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
security-kube-system:
displayName: "kube-system security"
botkube/kubernetes:
enabled: true
config:
event:
types:
- create
- update
- delete
- error
namespaces:
include:
- kube-system
resources:
# K8s
- type: v1/serviceaccounts
event: # WARN: too many `update` notifications
types:
- create
- delete
- error
updateSetting:
includeDiff: true
- type: rbac.authorization.k8s.io/v1/clusterrolebindings
updateSetting:
includeDiff: true
fields:
- roleRef
- subjects[*]
# WARN: not always defined
# - subjects[*].name
# - subjects[*].namespace
- type: rbac.authorization.k8s.io/v1/clusterroles
updateSetting:
includeDiff: true
fields:
- rules
- type: rbac.authorization.k8s.io/v1/rolebindings
updateSetting:
includeDiff: true
fields:
- roleRef
- subjects[*]
# WARN: not always defined
# - subjects[*].name
# - subjects[*].namespace
- type: rbac.authorization.k8s.io/v1/roles
updateSetting:
includeDiff: true
fields:
- rules
- type: admissionregistration.k8s.io/v1/mutatingwebhookconfigurations
updateSetting:
includeDiff: true
- type: admissionregistration.k8s.io/v1/validatingwebhookconfigurations
updateSetting:
includeDiff: true
- type: networking.k8s.io/v1/ingresses
updateSetting:
includeDiff: true
fields:
- spec.rules # array
- type: networking.k8s.io/v1/networkpolicies
updateSetting:
includeDiff: true
fields:
- spec.podSelector
- spec.policyTypes # array
# CRDs
## calico
- type: crd.projectcalico.org/v1/bgpconfigurations
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/bgppeers
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/blockaffinities
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/clusterinformations
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/felixconfigurations
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/globalnetworkpolicies
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/globalnetworksets
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/hostendpoints
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/ipamblocks
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/ipamconfigs
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/ipamhandles
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/ippools
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/networkpolicies
updateSetting:
includeDiff: true
- type: crd.projectcalico.org/v1/networksets
updateSetting:
includeDiff: true
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
error:
displayName: "Errors"
botkube/kubernetes:
enabled: true
config:
event:
types:
- error
namespaces:
include:
- ".*"
exclude: # WARN: takes precedence over `include`
- kube-system
resources:
# K8s
- type: v1/configmaps
- type: v1/endpoints
- type: v1/limitranges
- type: v1/persistentvolumeclaims
- type: v1/persistentvolumes
# WARN: too many errors like "v1/pods error" (various reasons)
# - type: v1/pods
# namespaces:
# include:
# - ".*"
# exclude: # INFO: excluding namespaces in which cronjobs/jobs run, already covered by `*-jobs` sources
# - cicd-pipelines
- type: v1/resourcequotas
# - type: v1/secrets
- type: v1/serviceaccounts
- type: v1/services
- type: apps/v1/daemonsets
- type: apps/v1/deployments
- type: apps/v1/statefulsets
- type: batch/v1/cronjobs
# - type: batch/v1/jobs
- type: policy/v1/poddisruptionbudgets
- type: apiextensions.k8s.io/v1/customresourcedefinitions
- type: autoscaling/v1/horizontalpodautoscalers
- type: rbac.authorization.k8s.io/v1/clusterrolebindings
- type: rbac.authorization.k8s.io/v1/clusterroles
- type: rbac.authorization.k8s.io/v1/rolebindings
- type: rbac.authorization.k8s.io/v1/roles
- type: admissionregistration.k8s.io/v1/mutatingwebhookconfigurations
- type: admissionregistration.k8s.io/v1/validatingwebhookconfigurations
- type: networking.k8s.io/v1/ingresses
- type: networking.k8s.io/v1/networkpolicies
- type: scheduling.k8s.io/v1/priorityclasses
- type: storage.k8s.io/v1/storageclasses
- type: snapshot.storage.k8s.io/v1/volumesnapshotclasses
- type: snapshot.storage.k8s.io/v1/volumesnapshotcontents
- type: snapshot.storage.k8s.io/v1/volumesnapshots
# CRDs
## gcp/gke
- type: cloud.google.com/v1/backendconfigs
- type: cloud.google.com/v1beta1/backendconfigs
# - type: networking.gke.io/v1/frontendconfigs
- type: networking.gke.io/v1/managedcertificates
# - type: networking.gke.io/v1/serviceattachments
# - type: networking.gke.io/v1/servicenetworkendpointgroups
# - type: hub.gke.io/v1/memberships
- type: nodemanagement.gke.io/v1alpha1/updateinfos
## calico
- type: crd.projectcalico.org/v1/bgpconfigurations
- type: crd.projectcalico.org/v1/bgppeers
- type: crd.projectcalico.org/v1/blockaffinities
- type: crd.projectcalico.org/v1/clusterinformations
- type: crd.projectcalico.org/v1/felixconfigurations
- type: crd.projectcalico.org/v1/globalnetworkpolicies
- type: crd.projectcalico.org/v1/globalnetworksets
- type: crd.projectcalico.org/v1/hostendpoints
- type: crd.projectcalico.org/v1/ipamblocks
- type: crd.projectcalico.org/v1/ipamconfigs
- type: crd.projectcalico.org/v1/ipamhandles
- type: crd.projectcalico.org/v1/ippools
- type: crd.projectcalico.org/v1/networkpolicies
- type: crd.projectcalico.org/v1/networksets
## argocd
- type: argoproj.io/v1alpha1/applications
namespaces:
exclude:
- argocd
- type: argoproj.io/v1alpha1/appprojects
## argo-events
- type: argoproj.io/v1alpha1/eventbus
- type: argoproj.io/v1alpha1/eventsources
- type: argoproj.io/v1alpha1/sensors
## argo-workflows
- type: argoproj.io/v1alpha1/clusterworkflowtemplates
- type: argoproj.io/v1alpha1/workfloweventbindings
- type: argoproj.io/v1alpha1/workflows
namespaces:
exclude:
- cicd-pipelines # WARN: too many notifications because of pipelines running
- type: argoproj.io/v1alpha1/workflowtasksets
- type: argoproj.io/v1alpha1/workflowtemplates
## monitoring - DEPRECATED: replace with VictoriaMetrics
- type: monitoring.coreos.com/v1/alertmanagerconfigs
- type: monitoring.coreos.com/v1/alertmanagers
- type: monitoring.coreos.com/v1/podmonitors
- type: monitoring.coreos.com/v1/probes
- type: monitoring.coreos.com/v1/prometheuses
- type: monitoring.coreos.com/v1/prometheusrules
- type: monitoring.coreos.com/v1/servicemonitors
- type: monitoring.coreos.com/v1/thanosrulers
## logging
- type: logging.banzaicloud.io/v1beta1/clusterflows
- type: logging.banzaicloud.io/v1beta1/clusteroutputs
- type: logging.banzaicloud.io/v1beta1/flows
- type: logging.banzaicloud.io/v1beta1/loggings
- type: logging.banzaicloud.io/v1beta1/outputs
- type: logging-extensions.banzaicloud.io/v1alpha1/eventtailers
- type: logging-extensions.banzaicloud.io/v1alpha1/hosttailers
## traefik
- type: traefik.containo.us/v1alpha1/ingressroutes
- type: traefik.containo.us/v1alpha1/ingressroutetcps
- type: traefik.containo.us/v1alpha1/ingressrouteudps
- type: traefik.containo.us/v1alpha1/middlewares
- type: traefik.containo.us/v1alpha1/middlewaretcps
- type: traefik.containo.us/v1alpha1/serverstransports
- type: traefik.containo.us/v1alpha1/tlsoptions
- type: traefik.containo.us/v1alpha1/tlsstores
- type: traefik.containo.us/v1alpha1/traefikservices
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
error-kube-system:
displayName: "kube-system errors"
botkube/kubernetes:
enabled: true
config:
event:
types:
- error
namespaces:
include:
- kube-system
resources:
# K8s
- type: v1/limitranges
- type: v1/resourcequotas
- type: v1/serviceaccounts
- type: policy/v1/poddisruptionbudgets
- type: apiextensions.k8s.io/v1/customresourcedefinitions
- type: rbac.authorization.k8s.io/v1/clusterrolebindings
- type: rbac.authorization.k8s.io/v1/clusterroles
- type: rbac.authorization.k8s.io/v1/rolebindings
- type: rbac.authorization.k8s.io/v1/roles
- type: admissionregistration.k8s.io/v1/mutatingwebhookconfigurations
- type: admissionregistration.k8s.io/v1/validatingwebhookconfigurations
- type: networking.k8s.io/v1/ingresses
- type: networking.k8s.io/v1/networkpolicies
- type: scheduling.k8s.io/v1/priorityclasses
# CRDs
## gcp/gke
- type: cloud.google.com/v1/backendconfigs
- type: cloud.google.com/v1beta1/backendconfigs
# - type: networking.gke.io/v1/frontendconfigs
- type: networking.gke.io/v1/managedcertificates
# - type: networking.gke.io/v1/serviceattachments
# - type: networking.gke.io/v1/servicenetworkendpointgroups
# - type: hub.gke.io/v1/memberships
- type: nodemanagement.gke.io/v1alpha1/updateinfos
## calico
- type: crd.projectcalico.org/v1/bgpconfigurations
- type: crd.projectcalico.org/v1/bgppeers
- type: crd.projectcalico.org/v1/blockaffinities
- type: crd.projectcalico.org/v1/clusterinformations
- type: crd.projectcalico.org/v1/felixconfigurations
- type: crd.projectcalico.org/v1/globalnetworkpolicies
- type: crd.projectcalico.org/v1/globalnetworksets
- type: crd.projectcalico.org/v1/hostendpoints
- type: crd.projectcalico.org/v1/ipamblocks
- type: crd.projectcalico.org/v1/ipamconfigs
- type: crd.projectcalico.org/v1/ipamhandles
- type: crd.projectcalico.org/v1/ippools
- type: crd.projectcalico.org/v1/networkpolicies
- type: crd.projectcalico.org/v1/networksets
context:
rbac:
group:
prefix: ""
static:
values:
- botkube-plugins-default
type: Static
configWatcher:
enabled: true
inCluster:
informerResyncPeriod: 10m
plugins:
cacheDir: /tmp
repositories:
botkube:
url: https://github.com/kubeshop/botkube/releases/download/v1.8.0/plugins-index.yaml
incomingWebhook:
enabled: true
# port and baseInClusterURL are set via envs
restartPolicy:
type: DeactivatePlugin
threshold: 10
healthCheckInterval: 10s
analytics:
disable: false
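The `namespaces.include`/`exclude` filters used throughout the configuration above follow the precedence noted in the inline comments: exclude wins over include. A minimal Python sketch of that behaviour (assuming regex full-match semantics for patterns such as `".*"`, as in Botkube's namespace filtering; this is an illustration, not Botkube's actual code):

```python
import re

def namespace_allowed(ns, include, exclude):
    """Return True if namespace `ns` passes the include/exclude filters.

    `include` and `exclude` are lists of regex patterns; an exclude match
    rejects the namespace even when an include pattern also matches.
    """
    if any(re.fullmatch(pattern, ns) for pattern in exclude):
        return False  # exclude takes precedence over include
    return any(re.fullmatch(pattern, ns) for pattern in include)

print(namespace_allowed("default", include=[".*"], exclude=["kube-system"]))      # True
print(namespace_allowed("kube-system", include=[".*"], exclude=["kube-system"]))  # False
```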
Slack threads
- https://botkube.slack.com/archives/C01CR1KS55K/p1709121554586339
- https://botkube.slack.com/archives/C01CR1KS55K/p1708685516373879
Versions
Kubernetes: v1.28.4-gke.1083000
Platform: GKE
Botkube: 1.8.0
@mszostok posted this answer in Slack thread https://botkube.slack.com/archives/C01CR1KS55K/p1709121554586339:
Thank you for upgrading! I've reviewed your configuration, and I agree with your observation about its similarity to the previous report.
I think I know what could cause this. The common factor is that you have a lot of dedicated k8s source configurations, and you are also watching a lot of different resources. It's great to see such usage!
I believe the root cause is directly connected to the changed kubernetes source implementation. In the current approach, a new shared informer is created for each Kubernetes source (see: [source.go#L122]). Unlike the previous method, which used a single informer built from all merged configurations, the current approach creates multiple in-memory informers, one per configuration, multiplying memory usage by N.
To address this, I think we should refactor our approach and introduce a global informer, similar to the one we had in v0.18. This should help resolve the issue.
Could you please create a GitHub issue for this? We'll refine it further and try to include it in our next planning. However, since the kubernetes source code is fully open source, if you find the time to experiment with refactoring and testing this approach, it would be amazing! If not, we will try to handle it in our team and keep you posted.
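The effect described above can be sketched abstractly (this is a Python illustration of the scaling behaviour, not Botkube's actual Go implementation): with one informer per source configuration, the informer count (and its in-memory cache) grows with the number of configurations, while a shared registry keeps it at one informer per watched resource type.

```python
class Informer:
    """Stand-in for a Kubernetes shared informer that caches every object of one type."""
    def __init__(self, resource_type):
        self.resource_type = resource_type

def informers_per_source(source_configs):
    # v1.x behaviour described in the issue: every source config gets its own
    # informer for every resource it watches -> N copies of the same cache.
    return [Informer(r) for cfg in source_configs for r in cfg]

def shared_informers(source_configs):
    # Global-informer approach (as in v0.18): one informer per resource type,
    # shared by all source configs regardless of how many there are.
    registry = {}
    for cfg in source_configs:
        for r in cfg:
            registry.setdefault(r, Informer(r))
    return list(registry.values())

# Eight sources all watching pods and configmaps:
configs = [["v1/pods", "v1/configmaps"]] * 8
print(len(informers_per_source(configs)))  # 16 informers (and 16 caches)
print(len(shared_informers(configs)))      # 2 informers
```

With many sources such as those in the configuration above, the per-source approach multiplies the memory cost of each watched resource type by the number of source configurations that include it.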
Hi @bygui86, good news! The issue has been resolved in #1425, which is now merged to the main branch.
The Botkube 1.10 release is planned for next week but, in the meantime, could you please test Botkube from the latest main to confirm your issue is gone?
The following command should do the trick:
botkube install --repo https://storage.googleapis.com/botkube-latest-main-charts --version 0.0.0-2cb5eacc # add your own overrides here
If you install the Helm chart another way, here are the details:
- Helm repo: https://storage.googleapis.com/botkube-latest-main-charts
- Helm chart version: 0.0.0-2cb5eacc
However, remember to keep the following overrides:
image:
pullPolicy: IfNotPresent
registry: ghcr.io
repository: kubeshop/botkube
tag: 0.0.0-2cb5eacc # use the latest Botkube Agent image
plugins:
repositories:
botkube:
url: https://storage.googleapis.com/botkube-plugins-latest/plugins-index.yaml # use the latest plugins
Hope that fixes your issue and that you'll be able to upgrade to the latest Botkube. Cheers!
@pkosiec thanks a lot for your effort! The memory consumption in the PR looks really promising!
Unfortunately I don't have time to test it in the next few days :( I will try to upgrade to v1.10.x when I get a chance :)
I'll keep you posted!