Almost standard helm install results in 500 internal error due to redirect loop when connecting to user notebook
StefanVanDyck opened this issue · comments
Bug description
I tried upgrading my jupyterhub deployment from version 2.0.0 of the helm chart to version 3.2.1.
But I am totally stumped by this redirect error.
I tried to reduce the helm config I use to the absolute simplest form and I cannot seem to get things to work.
I found many people with similar issues, but always when trying to run locally, where the suggested solution is to connect to a different port.
So maybe there is an issue with how the helm chart ingress is set up?
How to reproduce
- Install jupyterhub helm chart version 3.2.1 (also tried 3.2.0 and 3.1.0)
- Connect to the hub page using the ingress or NodePort
- Spawn a user server
- Get bounced between /user/<username>/lab? and /hub/user/<username>/lab? until error
Your personal set up
Version(s):
- jupyterhub helm chart version: 3.2.1,
- kubernetes version: 1.28.4
- helm version: 3.12.0
Configuration
Helm values:
---
hub:
  db:
    pvc:
      storageClassName: ceph-block
proxy:
  service:
    type: NodePort
ingress:
  enabled: true
  hosts:
    - hub.xxx.xxx.xxx
singleuser:
  image:
    name: jupyter/datascience-notebook
    tag: latest
    pullPolicy: Always
  cmd: null
  profileList:
    - display_name: "Minimal environment"
      description: "To avoid too much bells and whistles: Python."
      default: true
Logs
Hub logs:
[I 2023-12-10 09:51:41.110 JupyterHub log:191] 302 GET /hub/ -> /user/stefan/ (stefan@10.212.134.201) 17.23ms
[I 2023-12-10 09:51:41.139 JupyterHub log:191] 302 GET /user/stefan/lab? -> /hub/user/stefan/lab? (@10.212.134.201) 0.67ms
[I 2023-12-10 09:51:41.153 JupyterHub log:191] 302 GET /hub/user/stefan/lab? -> /user/stefan/lab?redirects=1 (stefan@10.212.134.201) 2.55ms
[I 2023-12-10 09:51:41.166 JupyterHub log:191] 302 GET /user/stefan/lab?redirects=1 -> /hub/user/stefan/lab?redirects=1 (@10.212.134.201) 0.39ms
[W 2023-12-10 09:51:41.182 JupyterHub base:1656] Redirect loop detected on /hub/user/stefan/lab?redirects=1
[I 2023-12-10 09:51:42.566 JupyterHub log:191] 200 GET /hub/home (stefan@10.212.134.201) 2.99ms
[I 2023-12-10 09:51:43.183 JupyterHub log:191] 302 GET /hub/user/stefan/lab?redirects=1 -> /user/stefan/lab?redirects=2 (stefan@10.212.134.201) 2001.93ms
[I 2023-12-10 09:51:48.431 JupyterHub log:191] 302 GET /user/stefan/lab? -> /hub/user/stefan/lab? (@10.212.134.201) 0.65ms
[I 2023-12-10 09:51:48.450 JupyterHub log:191] 302 GET /hub/user/stefan/lab? -> /user/stefan/lab?redirects=1 (stefan@10.212.134.201) 2.41ms
[I 2023-12-10 09:51:48.463 JupyterHub log:191] 302 GET /user/stefan/lab?redirects=1 -> /hub/user/stefan/lab?redirects=1 (@10.212.134.201) 0.64ms
[W 2023-12-10 09:51:48.478 JupyterHub base:1656] Redirect loop detected on /hub/user/stefan/lab?redirects=1
[I 2023-12-10 09:51:50.481 JupyterHub log:191] 302 GET /hub/user/stefan/lab?redirects=1 -> /user/stefan/lab?redirects=2 (stefan@10.212.134.201) 2004.18ms
[I 2023-12-10 09:51:50.570 JupyterHub log:191] 302 GET /user/stefan/lab?redirects=2 -> /hub/user/stefan/lab?redirects=2 (@10.212.134.201) 0.65ms
[W 2023-12-10 09:51:50.588 JupyterHub web:1869] 500 GET /hub/user/stefan/lab?redirects=2 (10.212.134.201): Redirect loop detected. Notebook has jupyterhub version unknown (likely < 0.8), but the Hub expects 4.0.2. Try installing jupyterhub==4.0.2 in the user environment if you continue to have problems.
[E 2023-12-10 09:51:50.589 JupyterHub log:183] {
"X-Real-Ip": "10.212.134.201",
"X-Forwarded-Server": "ingress-traefik-6npzb",
"X-Forwarded-Proto": "https,http",
"X-Forwarded-Port": "443,80",
"X-Forwarded-Host": "hub.xxx.xxx.xxx",
"X-Forwarded-For": "10.212.134.201,::ffff:10.200.140.218",
"Upgrade-Insecure-Requests": "1",
"Traceparent": "00-1d52f61eb819e093d82e6f6a714785b7-285cdb4f83c85f85-01",
"Sec-Fetch-User": "?1",
"Sec-Fetch-Site": "same-origin",
"Sec-Fetch-Mode": "navigate",
"Sec-Fetch-Dest": "document",
"Sec-Ch-Ua-Platform": "\"Linux\"",
"Sec-Ch-Ua-Mobile": "?0",
"Sec-Ch-Ua": "\"Chromium\";v=\"118\", \"Google Chrome\";v=\"118\", \"Not=A?Brand\";v=\"99\"",
"Referer": "https://hub.xxx.xxx.xxx/hub/home",
"Cookie": "_xsrf=[secret]; jupyterhub-hub-login=[secret]; jupyterhub-session-id=[secret]",
"Accept-Language": "en-US,en;q=0.9,nl;q=0.8",
"Accept-Encoding": "gzip, deflate, br",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36",
"Host": "hub.xxx.xxx.xxx",
"Connection": "keep-alive"
}
[E 2023-12-10 09:51:50.590 JupyterHub log:191] 500 GET /hub/user/stefan/lab?redirects=2 (stefan@10.212.134.201) 3.38ms
User notebook:
[I 2023-12-10 09:32:19.545 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2023-12-10 09:32:19.545 LabApp] Extension Manager is 'pypi'.
[I 2023-12-10 09:32:19.547 ServerApp] jupyterlab | extension was successfully loaded.
[I 2023-12-10 09:32:19.551 ServerApp] jupyterlab_git | extension was successfully loaded.
[I 2023-12-10 09:32:19.554 ServerApp] nbclassic | extension was successfully loaded.
[I 2023-12-10 09:32:19.605 ServerApp] nbdime | extension was successfully loaded.
[I 2023-12-10 09:32:19.608 ServerApp] notebook | extension was successfully loaded.
[I 2023-12-10 09:32:19.608 ServerApp] Serving notebooks from local directory: /home/jovyan
[I 2023-12-10 09:32:19.608 ServerApp] Jupyter Server 2.8.0 is running at:
[I 2023-12-10 09:32:19.608 ServerApp] http://jupyter-stefan:8888/user/stefan/lab?token=...
[I 2023-12-10 09:32:19.608 ServerApp] http://127.0.0.1:8888/user/stefan/lab?token=...
[I 2023-12-10 09:32:19.608 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2023-12-10 09:32:20.219 ServerApp] 302 GET /user/stefan/ -> /user/stefan/lab? (@10.200.171.251) 0.56ms
[I 2023-12-10 09:32:20.445 ServerApp] Skipped non-installed server(s): bash-language-server, dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, pyright, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server
[I 2023-12-10 09:32:26.483 ServerApp] 302 GET /user/stefan/ -> /user/stefan/lab? (@10.212.134.201) 0.85ms
[I 2023-12-10 09:32:40.448 ServerApp] 302 GET /user/stefan/ -> /user/stefan/lab? (@10.212.134.201) 0.86ms
[I 2023-12-10 09:32:47.409 ServerApp] 302 GET /user/stefan/ -> /user/stefan/lab? (@10.212.134.201) 0.82ms
[I 2023-12-10 09:51:41.125 ServerApp] 302 GET /user/stefan/ -> /user/stefan/lab? (@10.212.134.201) 0.79ms
[I 2023-12-10 09:51:48.415 ServerApp] 302 GET /user/stefan/ -> /user/stefan/lab? (@10.212.134.201) 0.81ms
Configurable HTTP Proxy:
09:32:01.747 [ConfigProxy] info: 200 GET /api/routes
09:32:20.221 [ConfigProxy] info: Adding route /user/stefan -> http://10.200.140.214:8888
09:32:20.221 [ConfigProxy] info: Route added /user/stefan -> http://10.200.140.214:8888
09:32:20.221 [ConfigProxy] info: 201 POST /api/routes/user/stefan
09:33:01.751 [ConfigProxy] info: 200 GET /api/routes
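The back-and-forth in the hub logs above can be picked out mechanically. The following is a minimal sketch, assuming the log format shown in this issue; the helper names (`strip_query`, `find_ping_pong`) are made up for illustration and are not part of JupyterHub. It collects every `302 GET A -> B` line, ignores query strings, and reports path pairs that redirect to each other.

```python
import re

# Matches the "302 GET <from> -> <to>" part of a hub access-log line.
REDIRECT = re.compile(r"302 GET (\S+) -> (\S+)")


def strip_query(path: str) -> str:
    """Drop the query string so /lab? and /lab?redirects=1 compare equal."""
    return path.split("?", 1)[0]


def find_ping_pong(lines):
    """Return sorted (a, b) path pairs where a redirects to b and b back to a."""
    edges = set()
    for line in lines:
        match = REDIRECT.search(line)
        if match:
            edges.add((strip_query(match.group(1)), strip_query(match.group(2))))
    # Keep each mutual pair once, ordered so that a < b.
    return sorted((a, b) for a, b in edges if (b, a) in edges and a < b)


# Three lines copied from the hub logs in this issue.
sample = [
    "[I 2023-12-10 09:51:41.110 JupyterHub log:191] 302 GET /hub/ -> /user/stefan/ (stefan@10.212.134.201) 17.23ms",
    "[I 2023-12-10 09:51:41.139 JupyterHub log:191] 302 GET /user/stefan/lab? -> /hub/user/stefan/lab? (@10.212.134.201) 0.67ms",
    "[I 2023-12-10 09:51:41.153 JupyterHub log:191] 302 GET /hub/user/stefan/lab? -> /user/stefan/lab?redirects=1 (stefan@10.212.134.201) 2.55ms",
]

loops = find_ping_pong(sample)
print(loops)
```

Note that the `/user/stefan/lab?` requests show up in the hub's own log, which means some requests for the user server were answered by the hub rather than by the notebook pod.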
Is this a bug, or am I simply missing some combination of config values?
Any help would be greatly appreciated.
Ah breakthrough, port-forwarding port 8000 of the proxy pod does allow me to access it using localhost:8000.
So I suppose the problem is that the port used in the proxy-public service is not correct?
I managed to get things to work by manually editing the kubernetes resources.
I changed the proxy-public service to target port 8000 and had to add an additional selector label that I also added to the proxy deployment.
But I am not sure if this is the intended deployment or if I am missing the idea behind the current setup.
Ok, so I think something like this might be needed to make the public service work correctly.
52bd1e2
Are your ingress controller pods labelled to be allowed network access to the proxy pod? They send traffic to the proxy pod, so they need the label if you have network policy enforcement in your k8s cluster.
Was the user server restarted since the upgrade, or left running since before?
Make sure you have read the changelog for 3.0.0 as well before continuing, if you haven't. I've forgotten what was breaking, so I can't rule out that something there is of interest.
/ From a mobile device
@consideRatio Thanks for having a look.
I tried restarting the user notebook, but the behaviour is the same.
I think the problem is the selector for the public proxy service is not specific to the Proxy pods and picks up the hub pods too. (But could definitely be missing a trick)
I did a complete clean install with the config above, which should be acceptable according to the schema of 3.2.1.
Also I don't believe I have network policy enforcement enabled ( yet, you know how it is :) )
The config doesn't make sense to me: you have both ingress enabled and proxy.service.type set to NodePort. That allows traffic to flow in two ways, one directly through the node port to the proxy pod and onwards, and one via an ingress controller.
Are you using an ingress controller? When using an ingress controller, you won't need proxy.service.type NodePort and can use ClusterIP instead for the proxy service, which the ingress controller proxies to.
@consideRatio Yeah, I used NodePort to try and debug the issue without going through my traefik ingress.
The NodePort has the same issue when I connect to it directly.
Indeed the intention is to use ClusterIP.
I managed to get things working by applying these patches using ansible:
- name: Fix jupyterhub public proxy deployment
  kubernetes.core.k8s:
    state: patched
    kind: Deployment
    name: jupyterhub-proxy
    namespace: jupyterhub
    definition:
      spec:
        template:
          metadata:
            labels:
              hub.jupyter.org/network-selector: proxy
  become: true
- name: Fix jupyterhub public proxy service
  kubernetes.core.k8s:
    state: patched
    kind: Service
    name: jupyterhub-proxy-public
    namespace: jupyterhub
    definition:
      spec:
        selector:
          hub.jupyter.org/network-selector: proxy
  become: true
I couldn't find a way to get something like this to work with the current helm config / templating.
What were the labels on the patched pod template and on the patched service selector, before and after?
If they don't target the right pod, it's quite weird.
the public proxy service:
apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: jupyterhub
    meta.helm.sh/release-namespace: jupyterhub
  creationTimestamp: "2023-12-10T11:44:27Z"
  labels:
    app.kubernetes.io/instance: jupyterhub
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: jupyterhub
    app.kubernetes.io/version: 4.0.2
    helm.sh/chart: jupyterhub-3.2.1
  name: jupyterhub-proxy-public
  namespace: jupyterhub
  resourceVersion: "1594693"
  uid: 5f2d5bda-76f5-4bf1-b128-ee20f12dbf88
spec:
  clusterIP: 10.201.188.121
  clusterIPs:
  - 10.201.188.121
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: http
  selector:
    app.kubernetes.io/instance: jupyterhub
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: jupyterhub
    app.kubernetes.io/version: 4.0.2
    helm.sh/chart: jupyterhub-3.2.1
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
the proxy deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "3"
    meta.helm.sh/release-name: jupyterhub
    meta.helm.sh/release-namespace: jupyterhub
  creationTimestamp: "2023-12-10T11:24:11Z"
  generation: 3
  labels:
    app.kubernetes.io/instance: jupyterhub
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: jupyterhub
    app.kubernetes.io/version: 4.0.2
    helm.sh/chart: jupyterhub-3.2.1
  name: jupyterhub-proxy
  namespace: jupyterhub
  resourceVersion: "1624737"
  uid: af341ffb-1b6f-412c-920d-0f1ec8ea0b58
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: jupyterhub
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: jupyterhub
      app.kubernetes.io/version: 4.0.2
      helm.sh/chart: jupyterhub-3.2.1
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        checksum/auth-token: 56bd
        checksum/proxy-secret: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: jupyterhub
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: jupyterhub
        app.kubernetes.io/version: 4.0.2
        helm.sh/chart: jupyterhub-3.2.1
        hub.jupyter.org/network-access-hub: "true"
        hub.jupyter.org/network-access-singleuser: "true"
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: hub.jupyter.org/node-purpose
                operator: In
                values:
                - core
            weight: 100
      containers:
      - command:
        - configurable-http-proxy
        - --ip=
        - --api-ip=
        - --api-port=8001
        - --default-target=http://jupyterhub-hub:$(JUPYTERHUB_HUB_SERVICE_PORT)
        - --error-target=http://jupyterhub-hub:$(JUPYTERHUB_HUB_SERVICE_PORT)/hub/error
        - --port=8000
        env:
        - name: CONFIGPROXY_AUTH_TOKEN
          valueFrom:
            secretKeyRef:
              key: hub.config.ConfigurableHTTPProxy.auth_token
              name: jupyterhub-hub
        image: quay.io/jupyterhub/configurable-http-proxy:4.6.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 30
          httpGet:
            path: /_chp_healthz
            port: http
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        name: chp
        ports:
        - containerPort: 8000
          name: http
          protocol: TCP
        - containerPort: 8001
          name: api
          protocol: TCP
        readinessProbe:
          failureThreshold: 1000
          httpGet:
            path: /_chp_healthz
            port: http
            scheme: HTTP
          periodSeconds: 2
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          runAsGroup: 65534
          runAsUser: 65534
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: deltaray-docker-hub-secret
      priorityClassName: jupyterhub
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 60
      tolerations:
      - effect: NoSchedule
        key: hub.jupyter.org/dedicated
        operator: Equal
        value: core
      - effect: NoSchedule
        key: hub.jupyter.org_dedicated
        operator: Equal
        value: core
The hub deployment which I believe also matches the service selector:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "11"
    meta.helm.sh/release-name: jupyterhub
    meta.helm.sh/release-namespace: jupyterhub
  creationTimestamp: "2023-12-09T19:28:20Z"
  generation: 11
  labels:
    app.kubernetes.io/instance: jupyterhub
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: jupyterhub
    app.kubernetes.io/version: 4.0.2
    helm.sh/chart: jupyterhub-3.2.1
  name: jupyterhub-hub
  namespace: jupyterhub
  resourceVersion: "1624843"
  uid: 11ec00aa-8fc6-49e0-9e8f-856312f5d1c4
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: jupyterhub
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: jupyterhub
      app.kubernetes.io/version: 4.0.2
      helm.sh/chart: jupyterhub-3.2.1
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        checksum/config-map: 056097a9b118be9539dfb219fce09af7a03ef153c66cf1d60c021c06e14c4f53
        checksum/secret: 53b59084d26e8c82f643a4e999671af9392aa09c1233bd1eeedc081585334261
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: jupyterhub
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: jupyterhub
        app.kubernetes.io/version: 4.0.2
        helm.sh/chart: jupyterhub-3.2.1
        hub.jupyter.org/network-access-proxy-api: "true"
        hub.jupyter.org/network-access-proxy-http: "true"
        hub.jupyter.org/network-access-singleuser: "true"
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: hub.jupyter.org/node-purpose
                operator: In
                values:
                - core
            weight: 100
      containers:
      - args:
        - jupyterhub
        - --config
        - /usr/local/etc/jupyterhub/jupyterhub_config.py
        - --upgrade-db
        env:
        - name: PYTHONUNBUFFERED
          value: "1"
        - name: HELM_RELEASE_NAME
          value: jupyterhub
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: CONFIGPROXY_AUTH_TOKEN
          valueFrom:
            secretKeyRef:
              key: hub.config.ConfigurableHTTPProxy.auth_token
              name: jupyterhub-hub
        - name: JUPYTERHUB_OAUTH2_CLIENT_SECRET
          valueFrom:
            secretKeyRef:
              key: secret
              name: jupyterhub-oauth2-client
        image: quay.io/jupyterhub/k8s-hub:3.2.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 30
          httpGet:
            path: /hub/health
            port: http
            scheme: HTTP
          initialDelaySeconds: 300
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        name: hub
        ports:
        - containerPort: 8081
          name: http
          protocol: TCP
        readinessProbe:
          failureThreshold: 1000
          httpGet:
            path: /hub/health
            port: http
            scheme: HTTP
          periodSeconds: 2
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          runAsGroup: 1000
          runAsUser: 1000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/local/etc/jupyterhub/jupyterhub_config.py
          name: config
          subPath: jupyterhub_config.py
        - mountPath: /usr/local/etc/jupyterhub/z2jh.py
          name: config
          subPath: z2jh.py
        - mountPath: /usr/local/etc/jupyterhub/config/
          name: config
        - mountPath: /usr/local/etc/jupyterhub/secret/
          name: secret
        - mountPath: /etc/ssl/certs/ca-certificates.crt
          name: certificates
          readOnly: true
          subPath: ca-certificates.crt
        - mountPath: /srv/jupyterhub
          name: pvc
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: deltaray-docker-hub-secret
      priorityClassName: jupyterhub
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1000
      serviceAccount: jupyterhub-hub
      serviceAccountName: jupyterhub-hub
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: hub.jupyter.org/dedicated
        operator: Equal
        value: core
      - effect: NoSchedule
        key: hub.jupyter.org_dedicated
        operator: Equal
        value: core
      volumes:
      - configMap:
          defaultMode: 420
          name: jupyterhub-hub
        name: config
      - name: secret
        secret:
          defaultMode: 420
          secretName: jupyterhub-hub
      - hostPath:
          path: /etc/ssl/certs/
          type: ""
        name: certificates
      - name: pvc
        persistentVolumeClaim:
          claimName: jupyterhub-hub-db-dir
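To make the overlap concrete, here is a minimal Python sketch of equality-based label selection: a selector matches a pod when every key/value pair in the selector is also present in the pod's labels. The label sets are copied from the manifests above; the function name `selector_matches` is made up for illustration, and this is a simplified model rather than actual Kubernetes code. The last part assumes the patch shown earlier adds `hub.jupyter.org/network-selector: proxy` on top of the existing selector.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """True when every selector key/value pair is present in the pod labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Labels shared by the service selector and both pod templates above.
COMMON = {
    "app.kubernetes.io/instance": "jupyterhub",
    "app.kubernetes.io/managed-by": "Helm",
    "app.kubernetes.io/name": "jupyterhub",
    "app.kubernetes.io/version": "4.0.2",
    "helm.sh/chart": "jupyterhub-3.2.1",
}

proxy_pod = {
    **COMMON,
    "hub.jupyter.org/network-access-hub": "true",
    "hub.jupyter.org/network-access-singleuser": "true",
}
hub_pod = {
    **COMMON,
    "hub.jupyter.org/network-access-proxy-api": "true",
    "hub.jupyter.org/network-access-proxy-http": "true",
    "hub.jupyter.org/network-access-singleuser": "true",
}

# The proxy-public selector as deployed: it has no proxy-specific label.
service_selector = dict(COMMON)
before = {
    "proxy": selector_matches(service_selector, proxy_pod),
    "hub": selector_matches(service_selector, hub_pod),
}

# After the manual patch: one extra label on the selector and on the proxy pod.
extra = {"hub.jupyter.org/network-selector": "proxy"}
patched_selector = {**service_selector, **extra}
after = {
    "proxy": selector_matches(patched_selector, {**proxy_pod, **extra}),
    "hub": selector_matches(patched_selector, hub_pod),
}

print(before)  # both pods match the unpatched selector
print(after)   # only the proxy pod matches once the extra label is required
```

With the unpatched selector, the service can route to either pod, which is consistent with the hub sometimes answering requests meant for the proxy.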
Hmmmm, these labels like app.kubernetes.io/instance are not what I expect to see. Is this really a default installation?
This chart uses the old label naming, "app: jupyterhub" etc.
@consideRatio Oh no you are entirely correct.
I am an absolute idiot...
I install jupyterhub as a subchart to a custom chart I helpfully called "jupyterhub".
It contains the standard _helpers generated by running helm create.
The helpers contain definitions for things like jupyterhub.labels, jupyterhub.selectors, etc.
Which in turn completely overrides some of the definitions used by the actual jupyterhub chart.
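For anyone hitting the same thing: Helm's named templates live in one global namespace across a chart and its subcharts, so two definitions with the same name collide and the one loaded last is used. The following is a rough Python model of that behaviour, not Helm's implementation. The template bodies are placeholders, and the load order (parent after subchart) is an assumption chosen to match what happened here.

```python
def load_templates(charts):
    """Merge named templates into one global registry, in load order.

    `charts` is a list of (chart_label, {template_name: body}) tuples.
    A later definition silently replaces an earlier one with the same name.
    """
    registry = {}
    for chart_label, defines in charts:
        for name, body in defines.items():
            registry[name] = {"defined_by": chart_label, "body": body}
    return registry


# Placeholder bodies: the real helper templates are longer than this.
subchart = (
    "jupyterhub (subchart)",
    {"jupyterhub.labels": "app: jupyterhub"},
)
parent = (
    "jupyterhub (parent, from helm create)",
    {"jupyterhub.labels": "app.kubernetes.io/name: jupyterhub"},
)

# Assumed order: the parent's definitions are registered after the subchart's.
registry = load_templates([subchart, parent])
winner = registry["jupyterhub.labels"]
print(winner["defined_by"])
print(winner["body"])
```

Naming the wrapper chart something other than the subchart (or deleting the unused generated helpers) avoids the collision.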
Thank you so much for your help! This thing had me tearing my hair out...
Hope I did not waste too much of your time.
Ahh, it happens! I'm glad you got it resolved and thank you for following up on the resolution ❤️ 🌻 🎉