nolar / kopf

A Python framework to write Kubernetes operators in just a few lines of code

Home Page: https://kopf.readthedocs.io/


Handle large resource spec being annotated to `last-handled-configuration`

Bharat23 opened this issue

Keywords

annotation, spec, last-handled-configuration, finalizers

Problem

When dealing with a resource that has a large spec, the operator throws this error:

APIClientError('Pod \"<REDACTED>\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes'

This happens because the operator tries to store the spec in the last-handled-configuration annotation.
It is especially problematic during deletion: the finalizer never gets removed, so the deletion is blocked. How does one configure the operator to handle such a scenario?
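To make the limit concrete, here is a minimal sketch of the size check, assuming that the API server sums the byte lengths of all annotation keys and values and compares the total against the 262144 bytes quoted in the error. The helper names are hypothetical and are not part of Kopf or of any Kubernetes client.

```python
import json

# The limit quoted in the error message above (256 KiB).
ANNOTATIONS_LIMIT_BYTES = 262144


def annotations_size(annotations: dict) -> int:
    """Total size in bytes of all annotation keys and values (UTF-8 encoded)."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in annotations.items()
    )


def fits_annotation_limit(annotations: dict) -> bool:
    """True if the annotations stay within the limit, False otherwise."""
    return annotations_size(annotations) <= ANNOTATIONS_LIMIT_BYTES


# Hypothetical example: a resource whose serialized spec alone is over the limit.
big_spec = {"data": "x" * 300_000}
annotations = {
    "kopf.zalando.org/last-handled-configuration": json.dumps({"spec": big_spec}),
}
print(annotations_size(annotations), fits_annotation_limit(annotations))
```

Because the limit applies to all annotations together, a spec somewhat below 256 KiB can still fail if the object already carries other large annotations.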

Note:
As per the Troubleshooting doc, restarting the operator doesn't solve the problem. Only running this command unblocks it:

kubectl patch kopfexample kopf-example-1 -p '{"metadata": {"finalizers": []}}' --type merge

Wow, this is unfortunate: literally the first lines of code I wrote with Kopf hit this error:

import kopf
import logging

@kopf.on.create("secret")
def create_fn(body, **kwargs):
    logging.info("A handler is called with body")

kopf run -n database kopftest.py
[2024-02-16 09:26:51,554] kopf._core.engines.a [INFO    ] Initial authentication has been initiated.
[2024-02-16 09:26:51,556] kopf.activities.auth [INFO    ] Activity 'login_with_kubeconfig' succeeded.
[2024-02-16 09:26:51,557] kopf._core.engines.a [INFO    ] Initial authentication has finished.
[2024-02-16 09:26:51,735] root                 [INFO    ] A handler is called with body
[2024-02-16 09:26:51,735] kopf.objects         [INFO    ] [database/sh.helm.release.v1.crunchy-postgres-operator.v1] Handler 'create_fn' succeeded.
[2024-02-16 09:26:51,736] kopf.objects         [INFO    ] [database/sh.helm.release.v1.crunchy-postgres-operator.v1] Creation is processed: 1 succeeded; 0 failed.
[2024-02-16 09:26:51,780] kopf.objects         [ERROR   ] [database/sh.helm.release.v1.crunchy-postgres-operator.v1] Throttling for 1 seconds due to an unexpected error: APIClientError('Secret "sh.helm.release.v1.crunchy-postgres-operator.v1" is invalid: metadata.annotations: Too long: must have at most 262144 bytes', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'Secret "sh.helm.release.v1.crunchy-postgres-operator.v1" is invalid: metadata.annotations: Too long: must have at most 262144 bytes', 'reason': 'Invalid', 'details': {'name': 'sh.helm.release.v1.crunchy-postgres-operator.v1', 'kind': 'Secret', 'causes': [{'reason': 'FieldValueTooLong', 'message': 'Too long: must have at most 262144 bytes', 'field': 'metadata.annotations'}, {'reason': 'FieldValueTooLong', 'message': 'Too long: must have at most 262144 bytes', 'field': 'metadata.annotations'}, {'reason': 'FieldValueTooLong', 'message': 'Too long: must have at most 262144 bytes', 'field': 'metadata.annotations'}, {'reason': 'FieldValueTooLong', 'message': 'Too long: must have at most 262144 bytes', 'field': 'metadata.annotations'}]}, 'code': 422})
Traceback (most recent call last):
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_cogs/clients/errors.py", line 148, in check_response
    response.raise_for_status()
  File "/nix/store/xrgwijximlapr1rkwwwnirf510h3kvc3-python3.11-aiohttp-3.9.1/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1059, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 422, message='Unprocessable Entity', url=URL('https://10.9.9.120:6443/api/v1/namespaces/database/secrets/sh.helm.release.v1.crunchy-postgres-operator.v1')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_core/actions/throttlers.py", line 44, in throttled
    yield should_run
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_core/reactor/processing.py", line 130, in process_resource_event
    applied = await application.apply(
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_core/actions/application.py", line 60, in apply
    await patch_and_check(
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_core/actions/application.py", line 131, in patch_and_check
    resulting_body = await patching.patch_obj(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_cogs/clients/patching.py", line 47, in patch_obj
    patched_body = await api.patch(
                   ^^^^^^^^^^^^^^^^
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_cogs/clients/api.py", line 155, in patch
    response = await request(
               ^^^^^^^^^^^^^^
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
    return await fn(*args, **kwargs, context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_cogs/clients/api.py", line 85, in request
    await errors.check_response(response)  # but do not parse it!
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/a7b1cwfh2x4i757srdds0jddzwnxv1l2-python3.11-kopf-1.37.1/lib/python3.11/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
    raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIClientError: ('Secret "sh.helm.release.v1.crunchy-postgres-operator.v1" is invalid: metadata.annotations: Too long: must have at most 262144 bytes', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'Secret "sh.helm.release.v1.crunchy-postgres-operator.v1" is invalid: metadata.annotations: Too long: must have at most 262144 bytes', 'reason': 'Invalid', 'details': {'name': 'sh.helm.release.v1.crunchy-postgres-operator.v1', 'kind': 'Secret', 'causes': [{'reason': 'FieldValueTooLong', 'message': 'Too long: must have at most 262144 bytes', 'field': 'metadata.annotations'}, {'reason': 'FieldValueTooLong', 'message': 'Too long: must have at most 262144 bytes', 'field': 'metadata.annotations'}, {'reason': 'FieldValueTooLong', 'message': 'Too long: must have at most 262144 bytes', 'field': 'metadata.annotations'}, {'reason': 'FieldValueTooLong', 'message': 'Too long: must have at most 262144 bytes', 'field': 'metadata.annotations'}]}, 'code': 422})
[2024-02-16 09:26:52,781] kopf.objects         [INFO    ] [database/sh.helm.release.v1.crunchy-postgres-operator.v1] Throttling is over. Switching back to normal operations.
^C[2024-02-16 09:26:52,830] kopf._core.reactor.r [INFO    ] Signal SIGINT is received. Operator is stopping.

Kopf is designed to work "out of the box" for the majority of cases, i.e. for typically small resources, without any special configuration or external dependencies.

For this, it uses annotations to store its per-object state. However, if Kubernetes prohibits storing big values, Kopf can do nothing here. I am open to suggestions though.

What an operator developer can do is configure their own state storage (2 storages of 2 types, actually), as documented here:

It can be, e.g., a MySQL/Postgres/any other AWS RDS database, or even a filesystem (if the volume is persisted and shared across multiple instances of Kopf and across restarts). All of this requires some extra configuration to start an operator and is therefore not the default "out of the box" setup.

There is also the Kopf-provided storage that keeps the state in the status field. But since a few years ago, it requires that the CRD schema explicitly declares those fields as allowed to store any arbitrary value with x-kubernetes-preserve-unknown-fields: true. So it is not the default either (but it was back then). See the blue note section in the link above.