nolar / kopf

A Python framework to write Kubernetes operators in just a few lines of code

Home Page: https://kopf.readthedocs.io/

program does not exit on Forbidden exception

axel7083 opened this issue · comments

Long story short

While running recovery tests (overloading the Kubernetes API), kopf._cogs.clients raises ConnectionRefusedError.
Most of the time kopf handles this well and recovers on its own once the API is available again.

However, in some cases it breaks. The errors go from ConnectionRefusedError (API not available) to a 403 Forbidden (for an unknown reason, probably because RBAC is not fully available yet). The pod then just becomes idle: it receives nothing and updates nothing.

Restarting the pod is the only way to fix it. However, that has to be done manually, which is a problem.

Is it possible to detect this case and force the pod to restart? Or to handle this case by forcing the pod to exit?

Kopf version

1.35.3

Kubernetes version

1.25.8

Python version

python:3.9-slim

Code

https://github.com/zakkg3/ClusterSecret/blob/master/src/handlers.py

Logs

[2023-06-27 07:11:33,544] kopf._core.engines.a [INFO    ] Initial authentication has been initiated.
[2023-06-27 07:11:33,546] kopf.activities.auth [INFO    ] Activity 'login_via_client' succeeded.
[2023-06-27 07:11:33,546] kopf._core.engines.a [INFO    ] Initial authentication has finished.
[2023-06-27 07:11:33,575] kopf._core.reactor.o [WARNING ] Not enough permissions to watch for resources: changes (creation/deletion/updates) will not be noticed; the resources are only refreshed on operator restarts.
... everything is okay
[2023-06-27 07:27:29,418] kopf._cogs.clients.w [ERROR   ] Request attempt #1/9 failed; will retry: GET https://{IP}:443/api/v1/namespaces?watch=true&resourceVersion=43102826 -> ClientConnectorError(ConnectionKey(host='{IP}', port=443, is_ssl=True, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=-4082323174033603600), ConnectionRefusedError(111, "Connect call failed ('{IP}', 443)"))
[2023-06-27 07:27:29,418] kopf._cogs.clients.w [ERROR   ] Request attempt #1/9 failed; will retry: GET https://{IP}:443/apis/clustersecret.io/v1/clustersecrets?watch=true&resourceVersion=43103463 -> ClientConnectorError(ConnectionKey(host='{IP}', port=443, is_ssl=True, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=-4082323174033603600), ConnectionRefusedError(111, "Connect call failed ('{IP}', 443)"))
... everything is okay

[2023-06-27 15:08:24,519] kopf._cogs.clients.w [ERROR   ] Request attempt #5/9 failed; will retry: GET https://{ip}:443/api/v1/namespaces?watch=true&resourceVersion=43323169 -> ClientConnectorError(ConnectionKey(host='{ip}', port=443, is_ssl=True, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=-4082323174033603600), ConnectionRefusedError(111, "Connect call failed ('{ip}', 443)"))
[2023-06-27 15:08:31,656] kopf._core.reactor.o [ERROR   ] Watcher for namespaces.v1@none has failed: ('namespaces is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "namespaces" in API group "" at the cluster scope', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'namespaces is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "namespaces" in API group "" at the cluster scope', 'reason': 'Forbidden', 'details': {'kind': 'namespaces'}, 'code': 403})
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/errors.py", line 148, in check_response
    response.raise_for_status()
  File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1004, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://{ip}:443/api/v1/namespaces?watch=true&resourceVersion=43323169')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 107, in guard
    await coro
  File "/usr/local/lib/python3.9/site-packages/kopf/_core/reactor/queueing.py", line 175, in watcher
    async for raw_event in stream:
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 82, in infinite_watch
    async for raw_event in stream:
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 186, in continuous_watch
    async for raw_input in stream:
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 251, in watch_objs
    async for raw_input in api.stream(
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 200, in stream
    response = await request(
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
    return await fn(*args, **kwargs, context=context)
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 85, in request
    await errors.check_response(response)  # but do not parse it!
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
    raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIForbiddenError: ('namespaces is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "namespaces" in API group "" at the cluster scope', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'namespaces is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "namespaces" in API group "" at the cluster scope', 'reason': 'Forbidden', 'details': {'kind': 'namespaces'}, 'code': 403})
[2023-06-27 15:08:31,659] kopf._core.reactor.o [ERROR   ] Watcher for clustersecrets.v1.clustersecret.io@none has failed: ('clustersecrets.clustersecret.io is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "clustersecrets" in API group "clustersecret.io" at the cluster scope', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'clustersecrets.clustersecret.io is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "clustersecrets" in API group "clustersecret.io" at the cluster scope', 'reason': 'Forbidden', 'details': {'group': 'clustersecret.io', 'kind': 'clustersecrets'}, 'code': 403})
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/errors.py", line 148, in check_response
    response.raise_for_status()
  File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1004, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://{ip}:443/apis/clustersecret.io/v1/clustersecrets?watch=true&resourceVersion=43324583')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 107, in guard
    await coro
  File "/usr/local/lib/python3.9/site-packages/kopf/_core/reactor/queueing.py", line 175, in watcher
    async for raw_event in stream:
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 82, in infinite_watch
    async for raw_event in stream:
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 186, in continuous_watch
    async for raw_input in stream:
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 251, in watch_objs
    async for raw_input in api.stream(
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 200, in stream
    response = await request(
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
    return await fn(*args, **kwargs, context=context)
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 85, in request
    await errors.check_response(response)  # but do not parse it!
  File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
    raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIForbiddenError: ('clustersecrets.clustersecret.io is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "clustersecrets" in API group "clustersecret.io" at the cluster scope', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'clustersecrets.clustersecret.io is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "clustersecrets" in API group "clustersecret.io" at the cluster scope', 'reason': 'Forbidden', 'details': {'group': 'clustersecret.io', 'kind': 'clustersecrets'}, 'code': 403})
Idle state

Additional information

No response

I have added Kubernetes probing and will re-run some tests.
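
For reference, here is a minimal sketch of one way to make the idle state visible to a Kubernetes liveness probe. It is not kopf's own mechanism and I have not verified it against this failure. It assumes the operator code calls a hypothetical `beat()` helper whenever it does useful work (for example after a successful API call), and that an exec probe runs `probe_exit_code()`. The file path, the threshold, and all function names are made up for illustration.

```python
import os
import time
from pathlib import Path

# Hypothetical location and threshold; neither comes from kopf or from this issue.
HEARTBEAT_FILE = Path("/tmp/operator-heartbeat")
MAX_AGE_SECONDS = 120.0


def beat(path: Path = HEARTBEAT_FILE) -> None:
    """Record that the operator just did useful work by touching the file."""
    path.touch()  # creates the file, or updates its mtime if it already exists


def is_alive(path: Path = HEARTBEAT_FILE, max_age: float = MAX_AGE_SECONDS) -> bool:
    """Return True if the heartbeat file exists and is not older than max_age."""
    try:
        age = time.time() - path.stat().st_mtime
    except FileNotFoundError:
        # No heartbeat was ever recorded: treat the operator as not alive.
        return False
    return age <= max_age


def probe_exit_code(path: Path = HEARTBEAT_FILE, max_age: float = MAX_AGE_SECONDS) -> int:
    """Exit code for an exec liveness probe: 0 means healthy, 1 means restart."""
    return 0 if is_alive(path, max_age) else 1
```

An exec liveness probe could then run a small script that ends with `raise SystemExit(probe_exit_code())`, so the kubelet restarts the container once the heartbeat goes stale. The catch is that an operator with nothing to do also stops beating, so the beat has to come from something periodic, and whether that periodic work stops once the watchers have failed is something I would still need to confirm.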