"unknown namespace for the cache" error
madianas21 opened this issue
What steps did you take and what happened:
Deployed Starboard to an EKS cluster with the Helm chart:

```shell
helm install starboard-operator aqua/starboard-operator \
  --namespace starboard-system \
  --version 0.9.0
```
Used these values overrides. I did not include namespaces that start with `kube-` in the `targetNamespaces` value:

```yaml
targetNamespaces: chartmuseum,ingress-nginx,xxx,all-namespaces-except-kube-*
trivy:
  githubToken: ghp_xxx
  ignoreUnfixed: true
resources:
  limits:
    cpu: 1000m
    memory: 1000M
  requests:
    cpu: 100m
    memory: 100M
```
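For completeness, overrides like these would be applied at install time roughly as follows (a sketch; the `values.yaml` file name is my assumption):

```shell
# Install the operator with the values overrides saved to a local file
# (the file name values.yaml is a placeholder).
helm install starboard-operator aqua/starboard-operator \
  --namespace starboard-system \
  --version 0.9.0 \
  -f values.yaml
```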
These CRDs show up properly:
- VulnerabilityReports
- ConfigAuditReports
- CISKubeBenchReports (had to create a PSP for this to work)

These CRDs do not show up:
- ClusterVulnerabilityReports
- ClusterConfigAuditReports
- KubeHunterReports
This is what I see in the logs every 15 minutes:
```json
{
  "level": "error",
  "ts": 1643271248.4945095,
  "logger": "controller.replicaset",
  "msg": "Reconciler error",
  "reconciler group": "apps",
  "reconciler kind": "ReplicaSet",
  "name": "coredns-67bdfc458",
  "namespace": "kube-system",
  "error": "getting ReplicaSet from cache: unable to get: kube-system/coredns-67bdfc458 because of unknown namespace for the cache",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.3/pkg/internal/controller/controller.go:227"
}
{
  "level": "error",
  "ts": 1643269599.7460723,
  "logger": "controller.daemonset",
  "msg": "Reconciler error",
  "reconciler group": "apps",
  "reconciler kind": "DaemonSet",
  "name": "kube-proxy",
  "namespace": "kube-system",
  "error": "getting DaemonSet from cache: unable to get: kube-system/kube-proxy because of unknown namespace for the cache",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.3/pkg/internal/controller/controller.go:227"
}
{
  "level": "error",
  "ts": 1643271248.3349278,
  "logger": "controller.daemonset",
  "msg": "Reconciler error",
  "reconciler group": "apps",
  "reconciler kind": "DaemonSet",
  "name": "aws-node",
  "namespace": "kube-system",
  "error": "getting DaemonSet from cache: unable to get: kube-system/aws-node because of unknown namespace for the cache",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.3/pkg/internal/controller/controller.go:227"
}
```
What did you expect to happen:
I expected Starboard not to scan anything in the `kube-system` namespace, since I did not include it in the `targetNamespaces` field. Not sure if this is related, but I also expected the KubeHunterReports CRD to show up.
Anything else you would like to add:
I was getting 401 API errors, so I added `githubToken` to solve that. I was also getting OOMKilled errors, so I had to increase the memory limit. Not sure if this is related, but I also get this `401 Unauthorized: Not Authorized` error in the logs:
```json
{
  "level": "error",
  "ts": 1643271245.3114705,
  "logger": "reconciler.vulnerabilityreport",
  "msg": "Scan job container",
  "job": "starboard-system/scan-vulnerabilityreport-5c6fbd84c",
  "container": "monitoring",
  "status.reason": "Error",
  "status.message": "2022-01-27T08:14:04.387Z\t\u001b[31mFATAL\u001b[0m\tscan error: unable to initialize a scanner: unable to initialize a docker scanner: 3 errors occurred:\n\t* unable to inspect the image (44<REDACTED>8.dkr.ecr.eu-central-1.amazonaws.com/prometheus-operator:v0.50.0): Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?\n\t* unable to initialize Podman client: no podman socket found: stat podman/podman.sock: no such file or directory\n\t* GET https://44<REDACTED>8.dkr.ecr.eu-central-1.amazonaws.com/v2/prometheus-operator/manifests/v0.50.0: unexpected status code 401 Unauthorized: Not Authorized\n\n\n\n",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/reconcile.Func.Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.3/pkg/reconcile/reconcile.go:102\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.3/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.3/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.3/pkg/internal/controller/controller.go:227"
}
```
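The ECR 401 suggests the Trivy scan job runs without registry credentials. One common workaround (only a sketch; the secret name, target namespace, and account ID below are placeholders, and you still need the scanned workload's `imagePullSecrets` or service account to reference the secret) is to create a docker-registry secret from a short-lived ECR token:

```shell
# Create an image pull secret from a 12-hour ECR token
# (account ID, region, namespace, and secret name are placeholders).
kubectl create secret docker-registry ecr-creds \
  --namespace monitoring \
  --docker-server=<ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com \
  --docker-username=AWS \
  --docker-password="$(aws ecr get-login-password --region eu-central-1)"
```

Note that ECR tokens expire after 12 hours, so this secret would need periodic refreshing.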
Environment:
- Starboard version (use `starboard version`): 0.9.0
- Kubernetes version (use `kubectl version`): v1.21.5-eks-9017834
- OS (macOS 10.15, Windows 10, Ubuntu 19.10 etc): Ubuntu 21.04
Thank you for the feedback @madianas21. This issue touches on many topics, but to start with, please provide the first few lines printed to the Starboard Operator pod logs to confirm how you effectively configured target namespaces. In my environment, when I specify only the `default`, `foo`, and `bar` namespaces it works as expected, i.e. pods in the `kube-system` namespace (and in other namespaces) are not scanned.
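Those first log lines can be fetched with something like the following (the deployment name matches the Helm release above; adjust it to your install):

```shell
# Print the first lines of the operator logs, which include the
# resolved install mode and target namespaces
kubectl logs -n starboard-system deployment/starboard-operator | head -n 5
```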
```json
{"level":"info","ts":1644933223.5281699,"logger":"main","msg":"Starting operator","buildInfo":{"Version":"0.14.0","Commit":"4aa2a11c0c5e6362c1ce04d736ec2fd1997bdaf8","Date":"2022-01-20T11:03:24Z","Executable":""}}
{"level":"info","ts":1644933223.528522,"logger":"operator","msg":"Resolved install mode","install mode":"MultiNamespace","operator namespace":"starboard-system","target namespaces":["default","foo","bar"]}
{"level":"info","ts":1644933223.5289898,"logger":"operator","msg":"Constructing client cache","namespaces":["default","foo","bar","starboard-system",""]}
{"level":"info","ts":1644933224.1323795,"logger":"controller-runtime.metrics","msg":"Metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1644933224.1441028,"logger":"reconciler.configauditreport","msg":"Skipping unsupported kind","pluginName":"Polaris","kind":"Service"}
{"level":"info","ts":1644933224.144182,"logger":"reconciler.configauditreport","msg":"Skipping unsupported kind","pluginName":"Polaris","kind":"ConfigMap"}
{"level":"info","ts":1644933224.1441875,"logger":"reconciler.configauditreport","msg":"Skipping unsupported kind","pluginName":"Polaris","kind":"Role"}
{"level":"info","ts":1644933224.1441896,"logger":"reconciler.configauditreport","msg":"Skipping unsupported kind","pluginName":"Polaris","kind":"RoleBinding"}
{"level":"info","ts":1644933224.144196,"logger":"reconciler.configauditreport","msg":"Skipping unsupported kind","pluginName":"Polaris","kind":"ClusterRole"}
{"level":"info","ts":1644933224.1441982,"logger":"reconciler.configauditreport","msg":"Skipping unsupported kind","pluginName":"Polaris","kind":"ClusterRoleBinding"}
{"level":"info","ts":1644933224.1441998,"logger":"reconciler.configauditreport","msg":"Skipping unsupported kind","pluginName":"Polaris","kind":"CustomResourceDefinition"}
{"level":"info","ts":1644933224.144645,"logger":"operator","msg":"Starting controllers manager"}
{"level":"info","ts":1644933224.1449256,"msg":"Starting metrics server","path":"/metrics"}
```
Regarding CRDs, I don't know exactly what you mean by "show up" and what you expect, but:
- KubeHunterReports won't be generated by the operator. As mentioned in the documentation, kube-hunter is only integrated with the Starboard CLI.
- ClusterVulnerabilityReports are not used by the Starboard Operator. (They are just installed by Helm, which installs everything from the deploy/crd directory.)
- ClusterConfigAuditReports are only generated if you use the Conftest plugin and have a Rego policy that's relevant for cluster-scoped resources, e.g. ClusterRoles or CRDs. (With Polaris you'd see only ConfigAuditReports generated for K8s workloads.)
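Note that whether a CRD is installed is separate from whether any reports of that kind have been generated. A quick way to check both (assuming the API group `aquasecurity.github.io` used by Starboard's CRDs):

```shell
# CRDs registered by the Helm chart
kubectl get crd | grep aquasecurity.github.io

# Reports actually generated so far, e.g. vulnerability reports
kubectl get vulnerabilityreports --all-namespaces
```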
Regarding the other minor issues you mentioned, e.g. the GitHub token, OOMKill, or 401 errors, please provide more details, logs, reproduction steps, and expectations, ideally as separate issues. Just mentioning all the problems alongside the namespace configuration issue does not allow us to provide useful feedback, not to mention that it would make closing this issue very hard.