Maluuba / GeNeVA

Code to train and evaluate the GeNeVA-GAN model for the GeNeVA task proposed in our ICCV 2019 paper "Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction"

Home Page: https://www.microsoft.com/en-us/research/project/generative-neural-visual-artist-geneva/

What is your pytorch version?

zmykevin opened this issue

Hi, I wonder which PyTorch version you use? I ran into some weird warnings, and the one that bothers me the most is this message:
/opt/conda/conda-bld/pytorch_1573049304260/work/aten/src/ATen/native/cudnn/RNN.cpp:1268: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters().

I think it is because you are applying DataParallel to the RNN, but I am not quite sure how to resolve it.
Thank you for taking a look at this issue.
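
For reference, the installed versions can be printed directly (a minimal check, nothing repo-specific is assumed; torch.version.cuda is None on CPU-only builds):

import torch
print(torch.__version__)                 # PyTorch version string, e.g. "0.4.1"
print(torch.version.cuda)                # CUDA version this build targets
print(torch.backends.cudnn.version())    # cuDNN version used by the RNN kernels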

We are using PyTorch 0.4.1:

GeNeVA/environment.yml

Lines 13 to 14 in 7e8d597

- pytorch=0.4.1=py36_cuda9.0.176_cudnn7.1.2_1
- torchvision=0.2.1=py36_1

Can you tell me the command you are running and the line in the code that generates this warning? We already call flatten_parameters() so this should ideally not happen.

self.gru.flatten_parameters()

I see. For some reason, when I conda install from that environment file, it does not install PyTorch, so I just installed the latest PyTorch to run it.
The command I run is just the one you give to train the model on the CoDraw dataset:

python geneva/inference/train.py @example_args/codraw-d-subtract.args

The warning probably comes from here:

self.rnn = nn.DataParallel(nn.GRU(cfg.input_dim,
                                  cfg.hidden_dim,
                                  batch_first=False), dim=1).cuda()
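
One way to silence the warning in this setup is sketched below; it is an illustration under assumptions, not code from this repository (GRUWrapper and its arguments are made-up names). The idea is to move the flatten_parameters() call into a small module that DataParallel replicates, so the compaction runs inside every replica's forward:

import torch.nn as nn

class GRUWrapper(nn.Module):
    # Illustrative wrapper: DataParallel replicates this module on each forward,
    # so flatten_parameters() runs on every replica and keeps the GRU weights
    # in one contiguous chunk for cuDNN, which is what the warning asks for.
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, batch_first=False)

    def forward(self, x, hidden=None):
        self.gru.flatten_parameters()
        return self.gru(x, hidden)

# then, instead of wrapping the bare nn.GRU:
# self.rnn = nn.DataParallel(GRUWrapper(cfg.input_dim, cfg.hidden_dim), dim=1).cuda()

This mirrors the self.gru.flatten_parameters() call mentioned above: the compaction has to happen inside whatever module DataParallel replicates, not just once at construction time.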

I will try downgrading the PyTorch version first to see if that resolves the warning.

Another quick question: how much time does it take to train this model with your default settings on 2 P100 GPUs?

Let me know if the warning does not go away with the PyTorch version change.
IIRC, it takes ~3 days to get results comparable to the paper.

Closing due to inactivity. Please reopen with updates if the issue still remains.