Another size of image

Question

brian411227 opened this issue 3 years ago

Thank you for your research!
I am now trying to generate a new "checkpoint_512_celeba-hq.pt" for 512x512 images, but something still goes wrong in the test.py phase.

The error message is:
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

What can I do now? Do you have the checkpoint_512_celeba-hq.pt file?

Xinyang Li · Answer 1 · Mon Nov 08 2021 10:26:25 GMT+0800 (China Standard Time)

Try this simple snippet to check whether PyTorch and CUDA are installed correctly:

import torch
print(torch.__version__)
print(torch.version.cuda)
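
The version check above can be extended to report whether PyTorch can actually see a GPU and which cuDNN build it loads, since the reported error comes from cuDNN. This is only a sketch: `describe_torch_environment` is a hypothetical helper name, not part of the HiSD repo, and it relies only on standard PyTorch attributes.

```python
def describe_torch_environment():
    """Collect PyTorch, CUDA and cuDNN details; works even if torch is missing."""
    info = {"torch_installed": False}
    try:
        import torch
    except ImportError:
        # PyTorch is not importable in this environment; nothing else to report.
        return info

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["cuda_build"] = torch.version.cuda
    # True only if a usable GPU and driver are visible at runtime.
    info["cuda_available"] = torch.cuda.is_available()
    # cuDNN version PyTorch loads (None if cuDNN is not available).
    info["cudnn_version"] = torch.backends.cudnn.version()
    if info["cuda_available"]:
        info["gpu_name"] = torch.cuda.get_device_name(0)
    return info


for key, value in describe_torch_environment().items():
    print(key, "=", value)
```

If `cuda_available` is False, or the CUDA build does not match what the installed driver supports, that mismatch is worth resolving before looking at the model code.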

Did you train your own "checkpoint_512_celeba-hq.pt"? Training it needs a lot of GPU memory, as noted in https://github.com/imlixinyang/HiSD/issues/12#issuecomment-840404898 (i.e. at least 4x Tesla V100).

The released checkpoint accepts 512x512 images as input, since it automatically resizes the image to the corresponding resolution.

brian411227 · Answer 2 · Mon Nov 08 2021 12:26:31 GMT+0800 (China Standard Time)

Sorry, but I can't see the content at this URL: https://github.com/imlixinyang/HiSD/issues/12#issuecomment-840404898
It says "No results matched your search."

The result is:
torch.__version__ = 1.0.1.post2
torch.version.cuda = 10.0.130

Now I would like to know: can only checkpoint_512 output 512x512 images?
Another question: where can I configure the output image size?
Thank you for your reply.

Xinyang Li · Answer 3 · Mon Nov 08 2021 12:53:39 GMT+0800 (China Standard Time)

Only checkpoint_512, which is trained at 512 resolution, can output 512x512 images directly. If you want to use the 256x256 checkpoint to output 512x512 images, you should upsample the output in the test code.
Please make sure the original test code runs successfully first, and then try your own modifications.
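
Upsampling the 256x256 output, as suggested above, could look roughly like the sketch below. It assumes the generator output is an (N, C, H, W) tensor; `upsample_output` and `fake_output` are hypothetical names used for illustration, not identifiers from the HiSD test code.

```python
import torch
import torch.nn.functional as F


def upsample_output(image, size=512):
    """Resize an (N, C, H, W) image tensor to size x size with bilinear interpolation."""
    return F.interpolate(image, size=(size, size), mode="bilinear", align_corners=False)


# Stand-in for a 256x256 generator output; in practice this would be the tensor
# produced by the test script just before it is saved to disk.
fake_output = torch.rand(1, 3, 256, 256)
upsampled = upsample_output(fake_output)
print(upsampled.shape)
```

Note that interpolation only enlarges the image; it does not add the detail that a checkpoint trained at 512 resolution would produce.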

brian411227 · Answer 4 · Mon Nov 08 2021 13:18:41 GMT+0800 (China Standard Time)

Traceback (most recent call last):
  File "core/train.py", line 64, in <module>
    G_adv, G_sty, G_rec, D_adv = trainer.update(x, y, i, j, j_trg)
  File "/home/HiSD/core/trainer.py", line 139, in update
    x_trg, x_cyc, s, s_trg = self.models((x, y, i, j, j_trg), mode='gen')
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/HiSD/core/trainer.py", line 30, in forward
    return self.gen_losses(*args)
  File "/home/HiSD/core/trainer.py", line 40, in gen_losses
    e = self.gen.encode(x)
  File "/home/HiSD/core/networks.py", line 126, in encode
    e = self.encoder(x)
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

I ran the original test code, which produced the message above.
I have no idea what is wrong.

Xinyang Li · Answer 5 · Mon Nov 08 2021 13:34:45 GMT+0800 (China Standard Time)

It looks like the cause is the Python environment or its packages. I recommend using a conda environment (e.g., Anaconda) and reinstalling cudatoolkit and PyTorch following the repo's instructions.
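
After reinstalling, one way to confirm the environment independently of the HiSD code is to run a single convolution on the GPU, since the traceback above fails inside a convolution layer. The sketch below is an assumption about how such a check could be written; `conv_smoke_test` is a hypothetical helper, and the layer sizes are arbitrary.

```python
def conv_smoke_test(use_cudnn=True, size=256):
    """Run one small convolution on the GPU and return a short status string."""
    try:
        import torch
    except ImportError:
        return "skipped: torch is not importable"
    if not torch.cuda.is_available():
        return "skipped: CUDA is not available to this PyTorch build"

    previous = torch.backends.cudnn.enabled
    torch.backends.cudnn.enabled = use_cudnn  # toggle cuDNN for this run only
    try:
        conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).cuda()
        x = torch.rand(1, 3, size, size, device="cuda")
        with torch.no_grad():
            conv(x)
        # CUDA kernels run asynchronously; synchronize so errors surface here.
        torch.cuda.synchronize()
        return "ok"
    except RuntimeError as err:
        return "failed: " + str(err)
    finally:
        torch.backends.cudnn.enabled = previous  # restore the original setting


print(conv_smoke_test(use_cudnn=True))
```

If this fails with `use_cudnn=True` but returns "ok" with `use_cudnn=False` when run in a fresh Python process, the cuDNN installation (or its match with the CUDA toolkit and driver) is the likely culprit rather than the model code.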