Found no NVIDIA driver on your system
WurmD opened this issue
Hello,
After building the image and running it:

```shell
sudo docker build --tag testimage .
sudo docker run -t -i --privileged testimage bash
cd rcic/
python main.py --save testrun
```

we get:
```
Traceback (most recent call last):
  File "main.py", line 504, in <module>
    main(args)
  File "main.py", line 484, in main
    model = ModelAndLoss(args).cuda()
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 297, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 194, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 194, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 194, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 216, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 297, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/opt/conda/lib/python3.6/site-packages/torch/cuda/__init__.py", line 178, in _lazy_init
    _check_driver()
  File "/opt/conda/lib/python3.6/site-packages/torch/cuda/__init__.py", line 99, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
```
Note that outside Docker the GPU is working as intended:

```
$ nvidia-smi
Sat Sep 26 17:26:50 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    On   | 00000000:01:00.0 Off |                  N/A |
| 33%   41C    P8    11W / 180W |      1MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
Am I not running your code as intended?
What steps did you take to run the code in Docker on your machine?
Add `--gpus=all` to your `docker run` command, or change `docker run` to `nvidia-docker run`. By default, Docker doesn't pass GPUs to the container. If you do this, you should be able to run `nvidia-smi` inside the container and get the same output as outside the container.
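Concretely, the run command from the issue would then look something like this (a sketch; `testimage` is the tag built above, and `--gpus=all` requires the NVIDIA container runtime to be installed on the host):

```shell
# Pass all host GPUs through to the container
sudo docker run -t -i --gpus=all testimage bash

# Or, with the older nvidia-docker wrapper:
sudo nvidia-docker run -t -i testimage bash

# Quick check inside the container: this should print the same
# driver/GPU table as nvidia-smi does on the host
nvidia-smi
```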
I confirm that installing `nvidia-container-toolkit` (as per https://stackoverflow.com/a/58432877/1734357) and then adding `--gpus=all` to `docker run` resolves it.
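For reference, the toolkit install described in the linked answer boils down to something like the following on Ubuntu (a sketch, assuming NVIDIA's container package repository has already been added for your distribution):

```shell
# Install the NVIDIA container toolkit and restart the Docker daemon
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

# Verify GPU passthrough before running the actual workload
sudo docker run --rm --gpus=all testimage nvidia-smi
```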