dask / dask-docker

Docker images for dask

Home Page: https://hub.docker.com/u/daskdev

Add new image for Dask hub

jacobtomlinson opened this issue

In order for a Docker image to be used with the Daskhub helm chart it needs dask-gateway and jupyterhub-singleuser to be installed.

Neither of our official images has those packages, so I propose we either add them to the notebook image or create a new image specifically for Daskhub.
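
For anyone checking an existing image against this requirement, a rough sketch of a check is below. The helper name `missing_commands` is made up for illustration; it only tests whether a command is on the PATH, which suits `jupyterhub-singleuser` because that is a script the pod has to launch.

```shell
#!/bin/sh
# Print each command name given as an argument that is NOT found on the PATH.
# Prints nothing when every command is available.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

Run inside the container, `missing_commands jupyterhub-singleuser` prints the name if the script is absent. dask-gateway is a Python library, so a separate check such as `python -c "import dask_gateway"` is probably the better test for it.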

Sounds good! I have a slight preference for creating a new image, but you probably have a better sense of what fits here.

More may be needed than simply adding those packages. I tried to roll my own images and they didn't work: I got 404s. I see the same with the latest dask-notebook, even when installing dask-gateway and jupyterhub-singleuser.

Below is an example config that I installed on a local minikube, including those packages. (If you try to repro, don't use the latest k8s v1.22, which is starting to become the default download; the daskhub chart is not compatible with it.) I also include the logs from the hub and the user pod below.

To note:

  • user pod gives 404 for the /user/<username> URL at the very end of the logs
  • hub logs claim "Single-user server has no version header, which means it is likely < 0.8. Expected 1.4.2", though you can clearly see jupyterhub-singleuser 1.4.2 being installed on the user pod.
  • hub logs show 404 for all static files (e.g. css)

Adding @consideRatio, as this feels a bit like a jhub configuration update is needed as well. I'd be interested to know what that might be, as I'm not sure how to get more information about what is going wrong.
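
One low-effort way to get more information is to check which command the user pod actually ends up running. The startup script in this image appears to run with `set -x`, so the pod log contains trace lines beginning with `+ exec`. The helper below is only a sketch: `final_exec` is a made-up name, and it assumes the trace format seen in the user pod log in this comment.

```shell
#!/bin/sh
# Read a pod log on stdin and print the last command that the startup
# script exec'd, based on the shell trace lines of the form "+ exec ...".
final_exec() {
    sed -n 's/^+ exec //p' | tail -n 1
}
```

Piping `kubectl logs jupyter-test` through `final_exec` on the log included here prints `start.sh jupyter lab`. That is a plain JupyterLab server rather than `jupyterhub-singleuser`, which may be why the hub sees no version header.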

$ cat config.yaml 
jupyterhub:
  proxy:
    service:
      type: NodePort
  singleuser:
    image:
      name: daskdev/dask-notebook
      tag: 2021.9.0
      pullPolicy: Always
    extraEnv:
      EXTRA_CONDA_PACKAGES: "dask-gateway jupyterhub-singleuser"
$ helm upgrade --install dhub dask/daskhub --values=secrets.yml --values config.yaml 
Hub Logs
$ kubectl logs hub-5857bf9fb8-7thrh 
Loading /usr/local/etc/jupyterhub/secret/values.yaml
No config at /usr/local/etc/jupyterhub/existing-secret/values.yaml
Loading extra config: 00-add-dask-gateway-values
Setting DASK_GATEWAY__ADDRESS http://proxy-public/services/dask-gateway
Adding dask-gateway service URL
[I 2021-09-17 04:43:01.183 JupyterHub app:2459] Running JupyterHub version 1.4.2
[I 2021-09-17 04:43:01.183 JupyterHub app:2489] Using Authenticator: jupyterhub.auth.DummyAuthenticator-1.4.2
[I 2021-09-17 04:43:01.183 JupyterHub app:2489] Using Spawner: kubespawner.spawner.KubeSpawner-1.1.0
[I 2021-09-17 04:43:01.183 JupyterHub app:2489] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-1.4.2
[W 2021-09-17 04:43:01.201 JupyterHub app:1808] No admin users, admin interface will be unavailable.
[W 2021-09-17 04:43:01.201 JupyterHub app:1809] Add any administrative users to `c.Authenticator.admin_users` in config.
[I 2021-09-17 04:43:01.201 JupyterHub app:1838] Not using allowed_users. Any authenticated user will be allowed.
[I 2021-09-17 04:43:01.243 JupyterHub provider:576] Updating oauth client service-dask-gateway
[I 2021-09-17 04:43:01.303 JupyterHub reflector:275] watching for pods with label selector='component=singleuser-server' in namespace default
[I 2021-09-17 04:43:01.328 JupyterHub reflector:275] watching for events with field selector='involvedObject.kind=Pod' in namespace default
[W 2021-09-17 04:43:01.344 JupyterHub _version:41] Single-user server has no version header, which means it is likely < 0.8. Expected 1.4.2
[I 2021-09-17 04:43:01.345 JupyterHub app:2186] test still running
[I 2021-09-17 04:43:01.347 JupyterHub app:2526] Initialized 1 spawners in 0.078 seconds
[I 2021-09-17 04:43:01.360 JupyterHub app:2738] Not starting proxy
[I 2021-09-17 04:43:01.364 JupyterHub app:2774] Hub API listening on http://:8081/hub/
[I 2021-09-17 04:43:01.364 JupyterHub app:2776] Private Hub API connect url http://hub:8081/hub/
[I 2021-09-17 04:43:01.364 JupyterHub app:2789] Starting managed service cull-idle
[I 2021-09-17 04:43:01.365 JupyterHub service:339] Starting service 'cull-idle': ['python3', '-m', 'jupyterhub_idle_culler', '--url=http://localhost:8081/hub/api', '--timeout=3600', '--cull-every=600', '--concurrency=10']
[I 2021-09-17 04:43:01.368 JupyterHub service:121] Spawning python3 -m jupyterhub_idle_culler --url=http://localhost:8081/hub/api --timeout=3600 --cull-every=600 --concurrency=10
[I 2021-09-17 04:43:01.378 JupyterHub app:2798] Adding external service dask-gateway at http://traefik-dhub-dask-gateway.default
[I 2021-09-17 04:43:01.387 JupyterHub proxy:347] Checking routes
[I 2021-09-17 04:43:01.390 JupyterHub app:2849] JupyterHub is now running at http://:8000
[I 2021-09-17 04:43:01.497 JupyterHub log:189] 200 GET /hub/api/users (cull-idle@127.0.0.1) 28.40ms
[I 2021-09-17 04:43:02.628 JupyterHub log:189] 200 GET /hub/error/503?url=%2Flab (@172.17.0.1) 39.85ms
[I 2021-09-17 04:43:54.143 JupyterHub log:189] 200 GET /hub/error/503?url=%2Flab (@172.17.0.1) 3.30ms
[I 2021-09-17 04:43:54.159 JupyterHub log:189] 302 GET / -> /hub/ (@::ffff:172.17.0.1) 1.77ms
[I 2021-09-17 04:43:54.242 JupyterHub log:189] 302 GET /hub/ -> /user/test/ (test@::ffff:172.17.0.1) 18.78ms
[W 2021-09-17 04:44:01.347 JupyterHub app:2131] User test server stopped with exit code: 1
[I 2021-09-17 04:44:01.347 JupyterHub proxy:309] Removing user test from proxy (/user/test/)
[I 2021-09-17 04:44:01.399 JupyterHub proxy:347] Checking routes
[I 2021-09-17 04:44:12.641 JupyterHub log:189] 200 GET /hub/error/503?url=%2Fuser%2Ftest%2F (@172.17.0.1) 3.06ms
[I 2021-09-17 04:44:17.512 JupyterHub log:189] 200 GET /hub/home (test@::ffff:172.17.0.1) 26.76ms
[I 2021-09-17 04:44:21.444 JupyterHub provider:574] Creating oauth client jupyterhub-user-test
[I 2021-09-17 04:44:21.466 JupyterHub spawner:2344] Attempting to create pvc claim-test, with timeout 3
[I 2021-09-17 04:44:21.469 JupyterHub log:189] 302 GET /hub/spawn/test -> /hub/spawn-pending/test (test@::ffff:172.17.0.1) 60.57ms
[I 2021-09-17 04:44:21.489 JupyterHub spawner:2361] PVC claim-test already exists, so did not create new pvc.
[I 2021-09-17 04:44:21.498 JupyterHub spawner:2302] Attempting to create pod jupyter-test, with timeout 3
[I 2021-09-17 04:44:21.525 JupyterHub pages:402] test is pending spawn
[I 2021-09-17 04:44:21.532 JupyterHub log:189] 200 GET /hub/spawn-pending/test (test@::ffff:172.17.0.1) 12.44ms
[W 2021-09-17 04:44:40.970 JupyterHub _version:41] Single-user server has no version header, which means it is likely < 0.8. Expected 1.4.2
[I 2021-09-17 04:44:40.970 JupyterHub base:909] User test took 19.557 seconds to start
[I 2021-09-17 04:44:40.970 JupyterHub proxy:285] Adding user test to proxy /user/test/ => http://172.17.0.10:8888
[I 2021-09-17 04:44:40.975 JupyterHub users:677] Server test is ready
[I 2021-09-17 04:44:40.976 JupyterHub log:189] 200 GET /hub/api/users/test/server/progress (test@::ffff:172.17.0.1) 19131.77ms
[I 2021-09-17 04:44:41.104 JupyterHub log:189] 302 GET /hub/spawn-pending/test -> /user/test/ (test@::ffff:172.17.0.1) 6.18ms
[I 2021-09-17 04:44:41.343 JupyterHub log:189] 302 GET /static/style/bootstrap.min.css?v=0e8a7fbd6de23ad6b27ab95802a0a0915af6693af612bc304d83af445529ce5d95842309ca3405d10f538d45c8a3a261b8cff78b4bd512dd9effb4109a71d0ab -> /hub/static/style/bootstrap.min.css?v=0e8a7fbd6de23ad6b27ab95802a0a0915af6693af612bc304d83af445529ce5d95842309ca3405d10f538d45c8a3a261b8cff78b4bd512dd9effb4109a71d0ab (@::ffff:172.17.0.1) 1.81ms
[I 2021-09-17 04:44:41.346 JupyterHub log:189] 302 GET /static/style/bootstrap-theme.min.css?v=8b2f045cb5b4d5ad346f6e816aa2566829a4f5f2783ec31d80d46a57de8ac0c3d21fe6e53bcd8e1f38ac17fcd06d12088bc9b43e23b5d1da52d10c6b717b22b3 -> /hub/static/style/bootstrap-theme.min.css?v=8b2f045cb5b4d5ad346f6e816aa2566829a4f5f2783ec31d80d46a57de8ac0c3d21fe6e53bcd8e1f38ac17fcd06d12088bc9b43e23b5d1da52d10c6b717b22b3 (@::ffff:172.17.0.1) 2.25ms
[I 2021-09-17 04:44:41.348 JupyterHub log:189] 302 GET /static/style/index.css?v=06e1f33518235bf36a1673b70c8f205bb706470f69d8ec46149d883cf17c53c9719e60341439e57b4057b468037216d7320fcc8afb920fdb6216b37aed5277f6 -> /hub/static/style/index.css?v=06e1f33518235bf36a1673b70c8f205bb706470f69d8ec46149d883cf17c53c9719e60341439e57b4057b468037216d7320fcc8afb920fdb6216b37aed5277f6 (@::ffff:172.17.0.1) 2.83ms
[W 2021-09-17 04:44:41.404 JupyterHub log:189] 404 GET /hub/static/style/bootstrap.min.css?v=0e8a7fbd6de23ad6b27ab95802a0a0915af6693af612bc304d83af445529ce5d95842309ca3405d10f538d45c8a3a261b8cff78b4bd512dd9effb4109a71d0ab (@::ffff:172.17.0.1) 2.54ms
[W 2021-09-17 04:44:41.406 JupyterHub log:189] 404 GET /hub/static/style/bootstrap-theme.min.css?v=8b2f045cb5b4d5ad346f6e816aa2566829a4f5f2783ec31d80d46a57de8ac0c3d21fe6e53bcd8e1f38ac17fcd06d12088bc9b43e23b5d1da52d10c6b717b22b3 (@::ffff:172.17.0.1) 3.13ms
[W 2021-09-17 04:44:41.409 JupyterHub log:189] 404 GET /hub/static/style/index.css?v=06e1f33518235bf36a1673b70c8f205bb706470f69d8ec46149d883cf17c53c9719e60341439e57b4057b468037216d7320fcc8afb920fdb6216b37aed5277f6 (@::ffff:172.17.0.1) 1.65ms
[I 2021-09-17 04:44:41.462 JupyterHub log:189] 302 GET /static/logo/logo.png?v=a2a176ee3cee251ffddf5fa21fe8e43727a9e5f87a06f9c91ad7b776d9e9d3d5e0159c16cc188a3965e00375fb4bc336c16067c688f5040c0c2d4bfdb852a9e4 -> /hub/static/logo/logo.png?v=a2a176ee3cee251ffddf5fa21fe8e43727a9e5f87a06f9c91ad7b776d9e9d3d5e0159c16cc188a3965e00375fb4bc336c16067c688f5040c0c2d4bfdb852a9e4 (@::ffff:172.17.0.1) 1.60ms
[W 2021-09-17 04:44:41.515 JupyterHub log:189] 404 GET /hub/static/logo/logo.png?v=a2a176ee3cee251ffddf5fa21fe8e43727a9e5f87a06f9c91ad7b776d9e9d3d5e0159c16cc188a3965e00375fb4bc336c16067c688f5040c0c2d4bfdb852a9e4 (@::ffff:172.17.0.1) 4.92ms
[I 2021-09-17 04:45:01.413 JupyterHub proxy:347] Checking routes
[I 2021-09-17 04:46:01.405 JupyterHub proxy:347] Checking routes
[I 2021-09-17 04:47:01.404 JupyterHub proxy:347] Checking routes
[I 2021-09-17 04:48:01.404 JupyterHub proxy:347] Checking routes
[I 2021-09-17 04:49:01.403 JupyterHub proxy:347] Checking routes
User Pod Logs
$ kubectl logs jupyter-test
+ '[' 1000 -eq 0 ']'
+ '[' -e /opt/app/environment.yml ']'
+ echo 'no environment.yml'
no environment.yml
+ '[' 'dask-gateway jupyterhub-singleuser' ']'
+ echo 'EXTRA_CONDA_PACKAGES environment variable found.  Installing.'
+ /opt/conda/bin/conda install -y dask-gateway jupyterhub-singleuser
EXTRA_CONDA_PACKAGES environment variable found.  Installing.
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /opt/conda

  added / updated specs:
    - dask-gateway
    - jupyterhub-singleuser


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    dask-gateway-0.9.0         |   py38h578d9bd_2          45 KB  conda-forge
    jupyterhub-singleuser-1.4.2|   py38h578d9bd_0           5 KB  conda-forge
    ------------------------------------------------------------
                                           Total:          50 KB

The following NEW packages will be INSTALLED:

  dask-gateway       conda-forge/linux-64::dask-gateway-0.9.0-py38h578d9bd_2
  jupyterhub-single~ conda-forge/linux-64::jupyterhub-singleuser-1.4.2-py38h578d9bd_0



Downloading and Extracting Packages
dask-gateway-0.9.0   | 45 KB     | ########## | 100% 
jupyterhub-singleuse | 5 KB      | ########## | 100% 
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
+ '[' '' ']'
+ exec start.sh jupyter lab
Executing the command: jupyter lab
[I 2021-09-17 04:44:39.949 ServerApp] dask_labextension | extension was successfully linked.
[I 2021-09-17 04:44:39.949 ServerApp] jupyter_server_proxy | extension was successfully linked.
[I 2021-09-17 04:44:39.957 ServerApp] jupyterlab | extension was successfully linked.
[W 2021-09-17 04:44:39.959 NotebookApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-09-17 04:44:39.959 NotebookApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-09-17 04:44:39.959 NotebookApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2021-09-17 04:44:40.603 ServerApp] nbclassic | extension was successfully linked.
[I 2021-09-17 04:44:40.654 ServerApp] nbclassic | extension was successfully loaded.
[I 2021-09-17 04:44:40.655 ServerApp] dask_labextension | extension was successfully loaded.
[I 2021-09-17 04:44:40.660 ServerApp] jupyter_server_proxy | extension was successfully loaded.
[I 2021-09-17 04:44:40.661 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.8/site-packages/jupyterlab
[I 2021-09-17 04:44:40.661 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2021-09-17 04:44:40.663 ServerApp] jupyterlab | extension was successfully loaded.
[I 2021-09-17 04:44:40.663 ServerApp] Serving notebooks from local directory: /home/jovyan
[I 2021-09-17 04:44:40.663 ServerApp] Jupyter Server 1.10.2 is running at:
[I 2021-09-17 04:44:40.663 ServerApp] http://jupyter-test:8888/lab?token=54c6a9ee2e64588d57492189e6a8ba2f7c0ad8a556a44498
[I 2021-09-17 04:44:40.663 ServerApp]  or http://127.0.0.1:8888/lab?token=54c6a9ee2e64588d57492189e6a8ba2f7c0ad8a556a44498
[I 2021-09-17 04:44:40.663 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2021-09-17 04:44:40.665 ServerApp] 
    
    To access the server, open this file in a browser:
        file:///home/jovyan/.local/share/jupyter/runtime/jpserver-7-open.html
    Or copy and paste one of these URLs:
        http://jupyter-test:8888/lab?token=54c6a9ee2e64588d57492189e6a8ba2f7c0ad8a556a44498
     or http://127.0.0.1:8888/lab?token=54c6a9ee2e64588d57492189e6a8ba2f7c0ad8a556a44498
[W 2021-09-17 04:44:41.225 ServerApp] 404 GET /user/test (172.17.0.12) 28.72ms referer=http://127.0.0.1:8001/hub/spawn-pending/test

Is jupyterhub-singleuser a conda package? I think jupyterhub-base is one. Are you sure you start up with the latest jupyterhub-singleuser script? The extra-packages step must install the packages before running the jupyterhub-singleuser script, which is bundled as an entry point of the jupyterhub Python package. jupyterhub-base contains that script without dependencies such as node, which are only relevant if you run JupyterHub locally.

I think that with K8s 1.22 you need a new release of the dask-gateway Helm chart, for several reasons. I've fixed them in the current main branch, but it's not released yet.

There are also other things to fix, such as disabling the hub network policy (hub.networkPolicy.enabled=false), which is also fixed in the latest main.

All help working towards a release of dask/dask-gateway appreciated.

When I raised this issue I had successfully created a custom image (based on RAPIDS) which just added dask-gateway and jupyterhub-singleuser, although I can't seem to find it now.

Is jupyterhub-singleuser a conda package?

Yeah

All help working towards a release of dask/dask-gateway appreciated.

This came up again at the Dask maintainers meeting this week. We are blocked because @jcrist is the only person who can do this. It looks like Jim should have some time for this soon though.

OK, so it sounds like there is still some work to do here. Deferring discussion of dask-gateway (for the moment -- let's come back to that), what needs to be done just to get jupyter working?

The base-notebook Dockerfile has:

# Configure container startup
ENTRYPOINT ["tini", "-g", "--"]
CMD ["start-notebook.sh"]

On the raw jupyter front, the dask entrypoint runs a prepare.sh script instead of tini, which installs packages and then does:

exec start.sh jupyter lab ${JUPYTERLAB_ARGS}

It seems tricky to do all 3:

  1. install packages
  2. keep the tini entrypoint
  3. run start-singleuser.sh

Would switching prepare.sh to use exec start-singleuser.sh ${JUPYTERLAB_ARGS} be acceptable (it continues to skip tini)?

Can we not change the entrypoint to prepare.sh start-singleuser.sh ${JUPYTERLAB_ARGS}? The prepare script executes $@ anyway. We still lose tini here, but is that a big problem?
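
To make the "executes $@" idea concrete, here is a minimal sketch of that wrapper pattern. It is not the actual prepare.sh from this repo: the function name `prepare_and_exec` and the `CONDA` override variable are hypothetical, and the install step is modelled only on the trace in the user pod log above.

```shell
#!/bin/sh
# Hypothetical prepare-style wrapper: run an optional install step, then
# hand control to whatever command was passed as arguments.
prepare_and_exec() {
    if [ -n "${EXTRA_CONDA_PACKAGES:-}" ]; then
        echo "EXTRA_CONDA_PACKAGES environment variable found.  Installing."
        # Word splitting is intentional: the variable is a space-separated list.
        # CONDA is an assumed override so the installer can be swapped in tests.
        ${CONDA:-/opt/conda/bin/conda} install -y $EXTRA_CONDA_PACKAGES
    fi
    # exec replaces the shell, so the final command receives signals directly.
    exec "$@"
}
```

Called as `prepare_and_exec start-singleuser.sh`, the install step runs first and the shell is then replaced by the single-user server. Because Docker appends CMD to an exec-form ENTRYPOINT, an entrypoint such as `tini -g -- prepare.sh` with `start-singleuser.sh` as CMD might keep tini as well, though that is an untested assumption here.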

I think you are looking at the wrong prepare script. The one I referenced is in the notebook subdirectory. It's a bit convoluted -- maybe we should clean up the prepare/entrypoint/cmd here.

I just verified that switching to exec start-singleuser.sh ${JUPYTERLAB_ARGS} makes things work again -- no 404s.

It seems singleuser.defaultUrl isn't working to set the default to lab. Maybe it's related to jupyterhub/kubespawner#493.