DmitryUlyanov / texture_nets

Code for "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images" paper.


invalid device ordinal

wq409813230 opened this issue · comments

I have installed all the dependencies this repository needs, but something goes wrong when I run the command below:

th test.lua -input_image /data/artwork/content/huaban.jpeg -model_t7 data/checkpoints/model.t7 -gpu 0

THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-3022/cutorch/init.c line=734 error=10 : invalid device ordinal
/root/AI/torch/install/bin/luajit: test.lua:26: cuda runtime error (10) : invalid device ordinal at /tmp/luarocks_cutorch-scm-1-3022/cutorch/init.c:734
stack traceback:
[C]: in function 'setDevice'
test.lua:26: in main chunk
[C]: in function 'dofile'
...t/AI/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670

Below is my GPU info:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26 Driver Version: 375.26 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro P5000 Off | 0000:03:00.0 On | Off |
| 26% 39C P8 8W / 180W | 110MiB / 16264MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1387 G /usr/lib/xorg/Xorg 108MiB |
+-----------------------------------------------------------------------------+
```
I really have no idea where the problem is.

Hi Dmitry, thank you for your reply. It still fails when I omit the -gpu argument. What confuses me is that chainer-fast-neuralstyle, which is implemented in Python, also has a '-gpu' argument, and it runs fine when I set -gpu 0.

Hi,
This issue still persists. Has anyone found a solution for it?

The GPU index starts from 1. Please try the option -gpu 1 instead of -gpu 0.
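
For context on the suggestion above: cutorch numbers devices from 1 (the Lua convention), while nvidia-smi numbers them from 0. The following is a minimal sketch of that mapping with a range check. `to_cutorch_index` is a hypothetical helper, not part of texture_nets or cutorch, and this thread does not establish whether test.lua already applies the offset to `-gpu`.

```shell
# Sketch: convert a 0-based GPU index (as printed by nvidia-smi) to the
# 1-based index that cutorch.setDevice expects, rejecting out-of-range values.
# to_cutorch_index is a hypothetical helper name.
#   $1 = nvidia-smi index, $2 = number of GPUs visible to CUDA
to_cutorch_index() {
  if [ "$1" -lt 0 ] || [ "$1" -ge "$2" ]; then
    echo "invalid device ordinal: $1 (visible GPUs: $2)" >&2
    return 1
  fi
  echo $(($1 + 1))
}

# Usage on a machine with Torch installed (assumes `th` is on PATH):
#   count=$(th -e "require 'cutorch'; print(cutorch.getDeviceCount())")
#   to_cutorch_index 0 "$count"
```

If the device count reported by cutorch is 0 even though nvidia-smi lists a card, the index is not the problem and the CUDA/cutorch build itself is the more likely suspect.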

I also get this error, whatever GPU ID I pass. cuDNN works fine with Chainer.

My setup info:
Ubuntu 16.04, Torch7, CUDA 9.2, cuDNN 7.1.4
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26 Driver Version: 396.26 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 970 Off | 00000000:01:00.0 On | N/A |
| 0% 46C P8 17W / 163W | 455MiB / 4040MiB | 1% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 958 G /usr/lib/xorg/Xorg 287MiB |
| 0 1897 G compiz 164MiB |
+-----------------------------------------------------------------------------+
```

I think it might be because Torch7 targets cuDNN R5 by default?

I had to run git clone https://github.com/soumith/cudnn.torch.git -b R7 && cd cudnn.torch && luarocks make cudnn-scm-1.rockspec to get cuDNN 7 recognized by Torch, and then had to redo luarocks install cunn and luarocks install cutorch. Now I get this same "invalid device ordinal" error.

Maybe there is some sort of version mismatch between cudnn, cunn, and cutorch? I don't know where to find cunn and cutorch versions that are compatible with cudnn.torch R7. Does anyone have a clue?

I'm not used to Ubuntu and Lua. :S

I found https://github.com/torch/cutorch/issues
and it doesn't look like they support CUDA 9 yet, so that's probably the issue here. :/
If anyone has any other insights beyond "try downgrading", I'd appreciate the input.
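
When reporting a suspected CUDA/cutorch mismatch like the one discussed above, it can help to quote the toolkit release next to the installed rock versions. The following is a minimal sketch, assuming `nvcc` and `luarocks` are on PATH. `cuda_release` is a hypothetical helper name, and the sketch only gathers version strings; it does not decide whether a given combination is supported.

```shell
# Sketch: pull the CUDA release number out of `nvcc --version` output so it can
# be quoted next to the installed rock versions in a bug report.
# cuda_release is a hypothetical helper name; it reads nvcc's output on stdin.
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Usage (assumes nvcc and luarocks are installed):
#   nvcc --version | cuda_release
#   luarocks list | grep -A1 -E '^(cutorch|cunn|cudnn)$'
```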