iot-salzburg / gpu-jupyter

GPU-Jupyter: Leverage the flexibility of JupyterLab through the power of your NVIDIA GPU to run your code from TensorFlow and PyTorch in collaborative notebooks on the GPU.

Update CUDA to 11.8

vanHavel opened this issue · comments

Hi,

first of all thank you for providing this docker image, it is very useful.

I have created a fork where I upgraded CUDA to 11.8 to use with newer versions of Tensorflow. The related PR is here: #122

There are a few small caveats listed in the description of the PR. Nevertheless, I hope it will be a good start for folks looking to use the image with newer CUDA.

@vanHavel You may be interested in b-data's/my GPU-accelerated JupyterLab docker stacks.

(Currently) based on nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04; including code-server – aka VS Code in the browser.

@benz0li

thanks for the links. I took a look and have a question for you:

❗ Always mount the user's entire home directory.
Mounting a subfolder prevents the container from starting.[1](https://github.com/b-data/jupyterlab-python-docker-stack/blob/main/CUDA.md#user-content-fn-1-fca8fa3aa93e9d6a945115ed9b64e882)

This is different from how the (vanilla) jupyter docker-stacks images work, as they default to mounting ~/work as persistent storage with a Docker volume (for things like JupyterHub).
So I'm wondering: why this design change? From my perspective, it makes user-experience sense in that things like dotfiles and conda envs will persist... but on the other hand, part of the benefit of ephemeral compute is that you / your users can totally "mess up" things and just start fresh with a new container. A lot of the ways in which one can "mess up" are directly related to ~/.local, ~/.conda and the like.
So yes, it's convenient not to lose settings such as ~/.ssh and VS Code's settings in ~/.vscode, but this tracking of state can lead to one being unable to recover a working compute environment.
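
For concreteness, the two styles I'm contrasting look roughly like this (volume names, paths and the second image name are just placeholders):

# vanilla docker-stacks style: only the work subfolder persists
docker run -v notebooks:/home/jovyan/work jupyter/base-notebook
# whole-home style: dotfiles, ~/.local, conda envs etc. persist too
docker run -v home_jovyan:/home/jovyan <your-jupyterlab-image>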

It's been years since I've thought about this decision, but many years ago I was managing a JupyterHub for hundreds of students, and the persistence of ~/work vs ~/ was a topic of discussion then (as many students were having their first Python experience on it, we were designing for robustness and making some convenience compromises).

So I was wondering if something's changed or if you had some thoughts on this.

+1 for the VS Code inclusion; I do that as well for my images, which take their base images from here.

@benz0li

thanks for the links. I took a look and have a question for you:

❗ Always mount the user's entire home directory.
Mounting a subfolder prevents the container from starting.[1](https://github.com/b-data/jupyterlab-python-docker-stack/blob/main/CUDA.md#user-content-fn-1-fca8fa3aa93e9d6a945115ed9b64e882)

This is different from how the (vanilla) jupyter docker-stacks images work, as they default to mounting ~/work as persistent storage with a Docker volume (for things like JupyterHub). So I'm wondering: why this design change?

See jupyter/docker-stacks#1478.

From my perspective, it makes user-experience sense in that things like dotfiles and conda envs will persist...

There is no Conda in b-data's/my images.

but on the other hand, part of the benefit of ephemeral compute is that you / your users can totally "mess up" things and just start fresh with a new container.

You can start fresh with a new container with b-data's/my images, too.

A lot of the ways in which one can "mess up" are directly related to ~/.local, ~/.conda and the like. So yes, it's convenient not to lose settings such as ~/.ssh and VS Code's settings in ~/.vscode, but this tracking of state can lead to one being unable to recover a working compute environment.

True. (If a user messes up, the JupyterHub admin must step in)

It's been years since I've thought about this decision, but many years ago I was managing a JupyterHub for hundreds of students, and the persistence of ~/work vs ~/ was a topic of discussion then (as many students were having their first Python experience on it, we were designing for robustness and making some convenience compromises).

IMHO users should have all the freedom in their home directory – e.g., in the case of b-data's/my images, even installing Miniconda or Micromamba at user level, persistently.
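
For example, a user-level Miniconda install into the persisted home could look roughly like this (illustrative; the installer URL is Anaconda's standard one, nothing shipped with the images):

# download and install into ~/miniconda3 – survives container restarts
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p "$HOME/miniconda3"
# hook conda into the (persisted) ~/.zshrc
"$HOME/miniconda3/bin/conda" init zsh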

So I was wondering if something's changed or if you had some thoughts on this.

For my thoughts, see b-data/jupyterlab-python-docker-stack#1 (comment).

There are startup hooks in place, especially /usr/local/bin/before-notebook.d/10-init.sh, which allows mounting the same home directory with all of b-data's JupyterLab docker stacks – repeatedly.
🔬 Demo environment: https://demo.jupyter.b-data.ch. Log in with a GitHub account.
ℹ️ See Notes for all the differences to the (vanilla) jupyter docker-stacks images.
👉 I.e. tweaks, settings, etc. that can be applied at user-level for customisation.
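
Conceptually, the home-directory part of such a hook boils down to something like this (a simplified sketch, not the actual 10-init.sh):

# if the mounted home directory is empty, seed it from the image's skeleton
if [ -z "$(ls -A "$HOME" 2>/dev/null)" ]; then
  cp -rT /etc/skel "$HOME"
fi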

By allowing users to persistently install Python packages at user level, b-data's/my docker stacks do not require separate images simply for installing Python packages like TensorFlow or PyTorch.

b-data's/my images also support Docker/Podman in rootless mode. I have opened a pull request to "backport" this feature to the (vanilla) jupyter docker-stacks images: jupyter/docker-stacks#2039

Furthermore, there are no GPU-accelerated (vanilla) jupyter docker-stacks images, and this repository is missing some essential features:

  1. #108 (comment)
  2. #117 (comment)
  3. #119 (comment)

Our dear colleagues from the Rocker project have a different problem:

  1. rocker-org/rocker-versioned2#736

+1 for the VS Code inclusion; I do that as well for my images, which take their base images from here.

Not VS Code but code-server – aka VS Code in the browser plus some additional features.
ℹ️ There are b-data's/my Data Science Dev Containers for use with 'VS Code'/Codespaces.

@benz0li thank you for the detailed response. I'm reading through the issue discussions you linked, and it's clear you've put a lot of thought and work into the design changes. The approach of optionally bind-mounting home / having a population script if it's empty... is not something I had considered.

You solved one of the most painful UX problems: the preservation of state in ~/.

(and yes, my mistake, I did mean code-server).

I think you've aimed at a "one image for everything" design, whereas the earlier design approach, I believe, was that users run multiple servers in JupyterHub for their different images – which provides possibly "too much" isolation and a lot of disk consumption on the host OS running Docker.

And the fact that you got it working with rootless Docker... wow, that gives me a lot of trust in your technical abilities. Props. That's quite a painful exercise (I haven't tried it here, but have migrated other images before opting out of Podman entirely).

Props also for the busy script during tmux/screen. Good idea.

So I think you've made some great choices. As you mentioned, the tradeoff of more statefulness is the potential need for more admin interference. So the question becomes: are you deploying for a fleet of inexperienced users (JupyterHub got its start as a product for Berkeley's students), or power users? For GPU-enabled images, I think the answer leans far more towards the latter.

I appreciate you taking the time to explain all that.
Would you please be so kind as to whitelist me for your demo server? I'm going to spend some time playing around with your set up on my servers as well, but I'm really pressed for time lately.

One random personal style question:
Why the switch to zsh? Would it be easy to default to bash instead?
I'm one of those still-defaults-my-mac-to-bash people, mostly because of its presence on random servers I need to configure, and zsh out of the box does things like ruin pip install package[options] syntax.

are you deploying for a fleet of inexperienced users (JupyterHub got its start as a product for Berkeley's students), or power users?

My images are intended for power users. A user [of b-data's/my images] should have more than just basic Linux knowledge.

Why the switch to zsh?

I simply like Zsh; further enhanced with

Would it be easy to default to bash instead?

Try starting the image with -e SHELL=/usr/bin/bash.
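
For example (image name and the other flags are illustrative):

docker run --gpus all -p 8888:8888 -e SHELL=/usr/bin/bash <image>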

Would you please be so kind as to whitelist me for your demo server?

Done. I have whitelisted your account (@mathematicalmichael) for https://demo.cuda.jupyter.b-data.ch.

(Anyone with a GitHub account may log in at https://demo.jupyter.b-data.ch)

Sometimes it does not start on the first try. Simply try again...

I'm one of those still-defaults-my-mac-to-bash people, mostly because of its presence on random servers I need to configure, and zsh out of the box does things like ruin pip install package[options] syntax.

@mathematicalmichael Can you give an example that does not work with my JupyterLab docker stacks?

@benz0li thank you!

With respect to the zsh question: that was just a memory from when macOS switched the default shell, and I found that without further configuring it, the [] characters were being interpreted by zsh instead of pip.

E.g., in your JupyterHub, pip install hiplot[dev] fails; I have to use quotes: pip install 'hiplot[dev]'. This is practically the only thing I remember about zsh from when I first tried it... that optional Python dependencies would fail (and as a package developer, I often rely on these), and rather than learn how to configure a new shell, I stuck to bash.

Thanks for the instructions on how to override the default config, and it does make sense that you're targeting power users. The original docker-stacks (in my impression) do not necessarily assume the users are comfortable with Linux, and I believe that's why ~/ was not persistent (as annoying as that is to a power user).

I found that without further configuring it, the [] characters were being interpreted by zsh instead of pip

@mathematicalmichael

zsh uses square brackets for globbing / pattern matching.

[...]

If you want to disable globbing for the pip command permanently, you can do so by adding this to your ~/.zshrc:

alias pip='noglob pip'

https://stackoverflow.com/a/30539963
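
To illustrate the difference (the failing line shows zsh's standard "no matches found" error):

pip install hiplot[dev]            # zsh: no matches found: hiplot[dev]
noglob pip install hiplot[dev]     # brackets reach pip untouched
pip install 'hiplot[dev]'          # quoting works as well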

@mathematicalmichael

[...]

To get the Bash behavior in Zsh, add this to your ~/.zshrc file:

unsetopt NOMATCH

[...]

https://superuser.com/a/1606090

Hi @vanHavel,
thanks for your issue and PR. It was merged into the main branch (see #124, which builds upon your PR #122).

Also thanks to @benz0li for the detailed explanations!

Why the switch to zsh?

Addendum: I could not get bash working properly with screen/tmux, i.e. PATH was not updated consistently; and PATH was updated differently in JupyterLab and code-server.

This is due to whether the shell is a 'Login shell' or not:

  • With JupyterHub
    • In JupyterLab Terminal: The shell is a 'Login shell'
    • In code-server Terminal: The shell is not a 'Login shell'
  • Without JupyterHub
    • Both: The shell is not a 'Login shell'

👉 Using zsh, PATH is updated consistently for all configurations.


@mathematicalmichael With b-data/jupyterlab-r-docker-stack@5e2a258...6080796, bash now also updates PATH consistently for all configurations.

@benz0li I think this is because zsh just reads ~/.zshrc regardless of interactivity, whereas bash chooses between ~/.bashrc (interactive non-login shell) and ~/.bash_profile / ~/.profile (login shell).
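
A common workaround (not necessarily what the linked commit does) is to make the login-shell profile source ~/.bashrc, so both paths end up with the same PATH:

# in ~/.bash_profile
if [ -f "$HOME/.bashrc" ]; then
  . "$HOME/.bashrc"
fi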

Thanks for digging into that; I wasn't aware of how JupyterHub affects any of that.

@mathematicalmichael ℹ️ I found a way to enable bind mounting a subfolder of the home directory for arbitrary $NB_USERs and thus resolve b-data/jupyterlab-python-docker-stack#1.

Users can now choose whether to (bind) mount the entire home directory or just a subfolder within it.