program does not exit on Forbidden exception
axel7083 opened this issue
Long story short
When running recovery tests (overloading the Kubernetes API), kopf._cogs.clients starts raising ConnectionRefusedError.
Most of the time it handles this well and recovers on its own once the API is available again.
However, in some cases it does not recover. The errors go from ConnectionRefusedError (API not available) to a 403 Forbidden (for an unknown reason, probably because RBAC is not fully available yet). After that the pod just becomes idle: it receives nothing and updates nothing.
Restarting the pod is the only way to fix it, but that has to be done manually, which is a problem.
Is it possible to detect this case and force the pod to restart? Or to handle this case by forcing the pod to exit?
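One possible workaround, sketched here as an idea rather than a confirmed Kopf feature, is to watch the operator's own log records for the "Watcher for ... has failed" message shown in the logs below and react to it. The sketch uses only the standard `logging` module. `WatcherFailureHandler` and `install_watchdog` are hypothetical names, and it assumes that Kopf logs through loggers under the `kopf` namespace (as the logger names in the logs suggest) and that the message text stays the same across versions.

```python
import logging
import re

# Pattern copied from the log line in this report:
# "Watcher for <resource> has failed: ..."
WATCHER_FAILED = re.compile(r"Watcher for \S+ has failed")


class WatcherFailureHandler(logging.Handler):
    """Hypothetical helper: calls `on_failure` for every watcher-failure record."""

    def __init__(self, on_failure):
        super().__init__(level=logging.ERROR)
        self.on_failure = on_failure

    def emit(self, record):
        # getMessage() applies the %-style arguments to the message template.
        if WATCHER_FAILED.search(record.getMessage()):
            self.on_failure(record)


def install_watchdog(on_failure, logger_name="kopf"):
    """Attach the handler to the given logger; child loggers propagate to it."""
    handler = WatcherFailureHandler(on_failure)
    logging.getLogger(logger_name).addHandler(handler)
    return handler
```

In the operator module this could be wired up as `install_watchdog(lambda record: os._exit(1))` (after `import os`), so that Kubernetes restarts the container. Note that `os._exit` skips all cleanup, and that matching on log text is fragile, so this is a stopgap at best.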
Kopf version
1.35.3
Kubernetes version
1.25.8
Python version
python:3.9-slim
Code
https://github.com/zakkg3/ClusterSecret/blob/master/src/handlers.py
Logs
[2023-06-27 07:11:33,544] kopf._core.engines.a [INFO ] Initial authentication has been initiated.
[2023-06-27 07:11:33,546] kopf.activities.auth [INFO ] Activity 'login_via_client' succeeded.
[2023-06-27 07:11:33,546] kopf._core.engines.a [INFO ] Initial authentication has finished.
[2023-06-27 07:11:33,575] kopf._core.reactor.o [WARNING ] Not enough permissions to watch for resources: changes (creation/deletion/updates) will not be noticed; the resources are only refreshed on operator restarts.
... everything is okay
[2023-06-27 07:27:29,418] kopf._cogs.clients.w [ERROR ] Request attempt #1/9 failed; will retry: GET https://{IP}:443/api/v1/namespaces?watch=true&resourceVersion=43102826 -> ClientConnectorError(ConnectionKey(host='{IP}', port=443, is_ssl=True, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=-4082323174033603600), ConnectionRefusedError(111, "Connect call failed ('{IP}', 443)"))
[2023-06-27 07:27:29,418] kopf._cogs.clients.w [ERROR ] Request attempt #1/9 failed; will retry: GET https://{IP}:443/apis/clustersecret.io/v1/clustersecrets?watch=true&resourceVersion=43103463 -> ClientConnectorError(ConnectionKey(host='{IP}', port=443, is_ssl=True, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=-4082323174033603600), ConnectionRefusedError(111, "Connect call failed ('{IP}', 443)"))
... everything is okay
[2023-06-27 15:08:24,519] kopf._cogs.clients.w [ERROR ] Request attempt #5/9 failed; will retry: GET https://{ip}:443/api/v1/namespaces?watch=true&resourceVersion=43323169 -> ClientConnectorError(ConnectionKey(host='{ip}', port=443, is_ssl=True, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=-4082323174033603600), ConnectionRefusedError(111, "Connect call failed ('{ip}', 443)"))
[2023-06-27 15:08:31,656] kopf._core.reactor.o [ERROR ] Watcher for namespaces.v1@none has failed: ('namespaces is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "namespaces" in API group "" at the cluster scope', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'namespaces is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "namespaces" in API group "" at the cluster scope', 'reason': 'Forbidden', 'details': {'kind': 'namespaces'}, 'code': 403})
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/errors.py", line 148, in check_response
response.raise_for_status()
File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1004, in raise_for_status
raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://{ip}:443/api/v1/namespaces?watch=true&resourceVersion=43323169')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 107, in guard
await coro
File "/usr/local/lib/python3.9/site-packages/kopf/_core/reactor/queueing.py", line 175, in watcher
async for raw_event in stream:
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 82, in infinite_watch
async for raw_event in stream:
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 186, in continuous_watch
async for raw_input in stream:
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 251, in watch_objs
async for raw_input in api.stream(
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 200, in stream
response = await request(
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
return await fn(*args, **kwargs, context=context)
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 85, in request
await errors.check_response(response) # but do not parse it!
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIForbiddenError: ('namespaces is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "namespaces" in API group "" at the cluster scope', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'namespaces is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "namespaces" in API group "" at the cluster scope', 'reason': 'Forbidden', 'details': {'kind': 'namespaces'}, 'code': 403})
[2023-06-27 15:08:31,659] kopf._core.reactor.o [ERROR ] Watcher for clustersecrets.v1.clustersecret.io@none has failed: ('clustersecrets.clustersecret.io is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "clustersecrets" in API group "clustersecret.io" at the cluster scope', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'clustersecrets.clustersecret.io is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "clustersecrets" in API group "clustersecret.io" at the cluster scope', 'reason': 'Forbidden', 'details': {'group': 'clustersecret.io', 'kind': 'clustersecrets'}, 'code': 403})
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/errors.py", line 148, in check_response
response.raise_for_status()
File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1004, in raise_for_status
raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://{ip}:443/apis/clustersecret.io/v1/clustersecrets?watch=true&resourceVersion=43324583')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 107, in guard
await coro
File "/usr/local/lib/python3.9/site-packages/kopf/_core/reactor/queueing.py", line 175, in watcher
async for raw_event in stream:
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 82, in infinite_watch
async for raw_event in stream:
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 186, in continuous_watch
async for raw_input in stream:
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/watching.py", line 251, in watch_objs
async for raw_input in api.stream(
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 200, in stream
response = await request(
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
return await fn(*args, **kwargs, context=context)
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 85, in request
await errors.check_response(response) # but do not parse it!
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIForbiddenError: ('clustersecrets.clustersecret.io is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "clustersecrets" in API group "clustersecret.io" at the cluster scope', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'clustersecrets.clustersecret.io is forbidden: User "system:serviceaccount:clustersecret:clustersecret-account" cannot watch resource "clustersecrets" in API group "clustersecret.io" at the cluster scope', 'reason': 'Forbidden', 'details': {'group': 'clustersecret.io', 'kind': 'clustersecrets'}, 'code': 403})
... the pod then stays in an idle state
Additional information
No response
I have added Kubernetes probes and will re-run some tests.
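For reference, a liveness probe only helps here if the endpoint actually turns unhealthy when the watchers have died. Kopf documents a built-in liveness endpoint (`kopf run --liveness=...`), but whether it reports failure in this particular idle state was not verified in this report. The following is a minimal standard-library sketch of a separate health endpoint that returns 500 once a failure flag is set. The flag `watcher_failed`, the helper names, and port 8081 are all assumptions made for illustration; something else in the operator would have to set the flag when it detects the failure.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical flag: set by whatever detects that the watchers have failed.
watcher_failed = threading.Event()


def health_status(failed=watcher_failed):
    """Return (HTTP status, body) for the liveness probe."""
    if failed.is_set():
        return 500, b"watcher failed\n"
    return 200, b"ok\n"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = health_status()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe requests out of the operator log


def start_health_server(port=8081):
    """Serve the health endpoint from a daemon thread; the port is an assumption."""
    server = HTTPServer(("0.0.0.0", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A Kubernetes `livenessProbe` with an `httpGet` on that port would then restart the container after the configured number of failed checks, which avoids the manual restart.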