[BUG] Defining a `namespaceSelector` in a `jspolicy` object causes an infinite create/destroy loop for the resulting `validatingwebhookconfiguration`
spkane opened this issue · comments
This was a tough one...and appears to be a significant bug.
For context, I am deploying jspolicy via ArgoCD (using kustomize's Helm support), and there was some troubleshooting for this issue in the loft.sh Slack.
The conversation can be found here:
We are using the newest v0.2.2 helm chart.
The Issue
Adding a namespaceSelector
to a kind: JsPolicy
object causes an infinite create/destroy loop for the resulting validatingwebhookconfiguration
.
Reproduction Use Case
Create a kind: JsPolicy
manifest that includes a namespaceSelector
, like so:
apiVersion: policy.jspolicy.com/v1beta1
kind: JsPolicy
metadata:
name: "deny-seccompprofile"
spec:
namespaceSelector:
matchExpressions:
- key: jspolicy-skip-webhooks
operator: DoesNotExist
operations: ["CREATE", "UPDATE"]
resources: ["pods"]
javascript: |
const securityContext = request.object.spec.securityContext || [];
const allowed_namespaces = [
"jspolicy",
"kube-node-lease",
"kube-public",
"kube-system"
]
if (!allowed_namespaces.includes(request.namespace)) {
if (securityContext.seccompProfile) {
deny("spec.securityContext.seccompProfile not allowed")
}
} else {
if (securityContext.seccompProfile) {
warn("spec.securityContext.seccompProfile is being utilized")
}
}
This should create a loop where running something like kubectl get validatingwebhookconfiguration --watch
will result in an endless scroll like this:
deny-seccompprofile-haruh 1 1s
deny-seccompprofile-haruh 1 1s
deny-seccompprofile-haruh 1 1s
deny-seccompprofile-haruh 1 1s
deny-seccompprofile-haruh 1 0s
deny-seccompprofile-haruh 1 0s
deny-seccompprofile-haruh 1 0s
The Workaround
After much investigation, the current workaround is to add an item to the matchExpressions
map.
matchExpressions:
- key: control-plane
operator: DoesNotExist
So, the whole kind: JsPolicy
manifest looks like this:
apiVersion: policy.jspolicy.com/v1beta1
kind: JsPolicy
metadata:
name: "deny-seccompprofile"
spec:
namespaceSelector:
matchExpressions:
- key: jspolicy-skip-webhooks
operator: DoesNotExist
- key: control-plane
operator: DoesNotExist
operations: ["CREATE", "UPDATE"]
resources: ["pods"]
javascript: |
const securityContext = request.object.spec.securityContext || [];
const allowed_namespaces = [
"jspolicy",
"kube-node-lease",
"kube-public",
"kube-system"
]
if (!allowed_namespaces.includes(request.namespace)) {
if (securityContext.seccompProfile) {
deny("spec.securityContext.seccompProfile not allowed")
}
} else {
if (securityContext.seccompProfile) {
warn("spec.securityContext.seccompProfile is being utilized")
}
}
Summary
It appears like the controller (jspolicy pod) is fighting with the kind: JsPolicy
object. The jspolicy object defines a single namespaceSelector
, yet, the controller wants to add an additional one, which appears to create this infinite loop, unless the kind: JsPolicy
object also defines the additional namespaceSelector
that the operator is expecting.
I think the extra wrinkle might be that it's an Azure controller that is adding the control-plane
matchExpression, not the jspolicy controller. Seems like this issue on cert-manager is more or less about the same problem: cert-manager/cert-manager#4114
The fix here might simply be to clearly document that this is a known issue on Azure and provide people with the workaround of adding the Azure namespaceSelector
in addition to any others that they may want to be defined.