strimzi / kafka-access-operator

Operator for sharing access to Strimzi clusters across namespaces

Error reconciling KafkaAccess resources: the server rejected our request due to an error in our request

lburgazzoli opened this issue · comments

I've followed the instructions for installing the Strimzi Kafka operator and creating a Kafka instance that can be found at https://strimzi.io/quickstarts.

➜ kubectl get pods                                   
NAME                                          READY   STATUS    RESTARTS   AGE
my-cluster-entity-operator-64dc7c8844-jnh7q   3/3     Running   0          116m
my-cluster-kafka-0                            1/1     Running   0          117m
my-cluster-zookeeper-0                        1/1     Running   0          117m
strimzi-cluster-operator-95d88f6b5-724tq      1/1     Running   0          142m

➜ kubectl get kafkas                                 
NAME         DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS   READY   WARNINGS
my-cluster   1                        1                     True    True

Then I've started the access operator from the main branch:

  1. I've installed the KafkaAccess CRD from packaging/install/040-Crd-kafkaaccess.yaml
  2. I've started the operator with mvn compile exec:java -Dexec.mainClass="io.strimzi.kafka.access.KafkaAccessOperator"
  3. I've created a KafkaAccess CR from packaging/examples/kafka-access.yaml

The operator logs then show something like:

Failure executing: PATCH at: https://192.168.49.2:8443/apis/access.strimzi.io/v1alpha1/namespaces/kafka/kafkaaccesses/my-kafka-access/status. 
Message: the server rejected our request due to an error in our request. 
Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[], group=null, kind=null, name=null, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=the server rejected our request due to an error in our request, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={})

The failure seems to be caused by the fact that the Kubernetes API server will not recursively create nested objects for a JSON Patch input.
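This matches RFC 6902 JSON Patch semantics: an `add` operation requires the parent of the target path to already exist, so `/status/observedGeneration` cannot be added while the resource has no `/status` object. A minimal, hand-rolled sketch of that rule (not the API server's actual implementation; `applyAdd` is an illustrative helper):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JsonPatchAddDemo {

    // Apply a single JSON Patch "add" op to a nested Map; like RFC 6902,
    // it fails when the parent of the target path does not exist.
    @SuppressWarnings("unchecked")
    static void applyAdd(Map<String, Object> root, String path, Object value) {
        List<String> tokens = List.of(path.substring(1).split("/"));
        Map<String, Object> node = root;
        for (int i = 0; i < tokens.size() - 1; i++) {
            Object next = node.get(tokens.get(i));
            if (!(next instanceof Map)) {
                throw new IllegalArgumentException(
                    "add to non-existent parent: /" + String.join("/", tokens.subList(0, i + 1)));
            }
            node = (Map<String, Object>) next;
        }
        node.put(tokens.get(tokens.size() - 1), value);
    }

    public static void main(String[] args) {
        // A freshly created KafkaAccess-like resource: spec only, no status.
        Map<String, Object> resource = new HashMap<>();
        resource.put("spec", new HashMap<>(Map.of("kafka", "my-cluster")));

        try {
            applyAdd(resource, "/status/observedGeneration", 1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // Adding the parent object first makes the same op succeed.
        applyAdd(resource, "/status", new HashMap<>());
        applyAdd(resource, "/status/observedGeneration", 1);
        System.out.println("status = " + resource.get("status"));
    }
}
```

The first `add` is rejected because `/status` does not exist; after creating the parent, the same operation succeeds.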

In fact, once created, the resource does not have a status (as expected):

apiVersion: access.strimzi.io/v1alpha1
kind: KafkaAccess
metadata:
  creationTimestamp: "2023-10-19T11:54:46Z"
  generation: 1
  name: my-kafka-access
  namespace: kafka
  resourceVersion: "16747"
  uid: 99f6ffe3-0f99-42b2-b34a-6932ccfc9989
spec:
  kafka:
    listener: plain
    name: my-cluster
    namespace: kafka

And JOSDK generates the following patch:

[
    {
        "op": "add",
        "path": "/status/observedGeneration",
        "value": 1
    },
    {
        "op": "add",
        "path": "/status/binding",
        "value": {
            "name": "my-kafka-access"
        }
    },
    {
        "op": "add",
        "path": "/status/conditions",
        "value": [
            {
                "type": "Ready",
                "status": "True",
                "lastTransitionTime": "2023-10-19T11:54:46.401372687Z",
                "reason": "Ready",
                "message": "Ready"
            }
        ]
    }
]
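For reference, if that patch were accepted, the resulting status subtree would look like this (reconstructed directly from the operations above):

```yaml
status:
  observedGeneration: 1
  binding:
    name: my-kafka-access
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: "2023-10-19T11:54:46.401372687Z"
      reason: Ready
      message: Ready
```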

Replacing UpdateControl.patchStatus(...) with UpdateControl.updateStatus(...) in the reconcile loop seems to fix the issue. Given that at this stage the CR status is quite small, updating instead of patching should not be a big concern.
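For context, the proposed change would look roughly like this (a sketch assuming a standard JOSDK Reconciler; the class name and the status-building step are illustrative, only the two UpdateControl calls are the point):

```java
import io.javaoperatorsdk.operator.api.reconciler.Context;
import io.javaoperatorsdk.operator.api.reconciler.Reconciler;
import io.javaoperatorsdk.operator.api.reconciler.UpdateControl;

public class KafkaAccessReconciler implements Reconciler<KafkaAccess> {

    @Override
    public UpdateControl<KafkaAccess> reconcile(KafkaAccess kafkaAccess, Context<KafkaAccess> context) {
        // ... build the status (observedGeneration, binding, conditions) ...

        // Before: issues a JSON Patch against /status, which the API server
        // rejects with 422 when the resource has no status object yet.
        // return UpdateControl.patchStatus(kafkaAccess);

        // After: replaces the whole status subresource, which does not
        // require an existing /status object on the resource.
        return UpdateControl.updateStatus(kafkaAccess);
    }
}
```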

@katheris I can submit a PR with this fix if that makes sense to you.

@lburgazzoli thanks for investigating this. I've managed to reproduce the error and am happy for you to submit a PR with your proposed fix. It seems I didn't see the error in my testing because I always started with a working KafkaAccess that already had an existing status.