LuoweiZhou / e2e-gLSTM-sc

Code for paper "Image Caption Generation with Text-Conditional Semantic Attention"

Error while running in both CPU and GPU modes

utsavgarg opened this issue

/home/torch/install/bin/luajit: /home/torch/install/share/lua/5.1/nn/Linear.lua:66: size mismatch at /home/torch/extra/cutorch/lib/THC/THCTensorMathBlas.cu:90
stack traceback:
[C]: in function 'addmm'
/home/babu/torch/install/share/lua/5.1/nn/Linear.lua:66: in function 'func'
/home/babu/torch/install/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
/home/babu/torch/install/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
./misc_new/LanguageModel.lua:277: in function 'sample'
eval_new.lua:135: in function 'eval_split'
eval_new.lua:173: in main chunk
[C]: in function 'dofile'
...babu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406620
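
For reference, the "size mismatch" message comes from nn.Linear: it fires whenever the width of the input tensor differs from the layer's declared input size. A minimal sketch that reproduces the same class of error (assumes Torch7 with the nn package; the sizes here are illustrative, not the repo's actual ones):

  -- Minimal repro sketch; the layer sizes below are illustrative only.
  require 'nn'

  local lin = nn.Linear(1536, 512)  -- layer declared for 1536-dim inputs
  local x = torch.Tensor(2, 768)    -- batch of 2, but only 768 columns wide
  lin:forward(x)                    -- raises "size mismatch" in addmm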

Please refer to the NeuralTalk2 code for instructions on running this code.
It seems like a size-mismatch issue; please check ./misc_new/LanguageModel.lua:277 (in function 'sample') to debug your settings.
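
One quick way to narrow this down is to print the tensor shapes just before the failing forward call. A hypothetical debug snippet; the variable names follow NeuralTalk2 conventions and may differ slightly in this repo:

  -- Hypothetical debug prints, inserted just before the core network is
  -- evaluated in LanguageModel:sample() (around line 277).
  print('xt size: ', xt:size())                 -- actual LSTM input width
  print('expected:', self.input_encoding_size)  -- width the layer was built for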

Best

I ran the NeuralTalk2 model and it runs fine.
The error is at ./misc_new/LanguageModel.lua:198, in function 'sample', where xt is defined: it gets a size of 2x1536 instead of 2x768.
What should I change so that it works?

Yup, same error with VGG as the input CNN.

@LuoweiZhou I have the same error; NeuralTalk2 works fine but this doesn't.
Any help, please?

@utsavgarg Hi, sorry for the delay. Have you figured out the issue? It works fine for me when I train the model from scratch. Note that the pre-trained NeuralTalk2 model has 768 LSTM input states, so it might not be directly usable with our evaluation code. Also, we double the input size of the LSTM to allow for the text-conditional guidance; that is probably why the size of xt is 1536 instead of 768.
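
Put differently, the doubling most likely comes from concatenating the usual embedding with a text-conditional guidance vector of the same width before the LSTM. A rough sketch of the idea (the variable names are illustrative, not the repo's actual code):

  -- Illustrative only: why the LSTM input width doubles. Assumes Torch7.
  require 'torch'

  local encoding_size = 768
  local emb   = torch.Tensor(2, encoding_size)  -- standard NeuralTalk2 input xt
  local guide = torch.Tensor(2, encoding_size)  -- text-conditional guidance vector
  local xt = torch.cat(emb, guide, 2)           -- concatenate along dimension 2
  print(xt:size())                              -- 2x1536: twice the width

A 768-wide NeuralTalk2 checkpoint therefore cannot drive a core network built for 1536-wide inputs, which matches the addmm failure in the trace above.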

@arunmallya Hi, which model did you use for evaluation?

@MironaGamil I assume you fed the NeuralTalk2 pre-trained model directly into the eval code, which would not work since the LSTM input structure is not the same.

I trained from scratch using your code. Using the model saved after the first step, I tried to run the second step, which then failed with the error above.

@arunmallya Just want to confirm: did you get the error while fine-tuning the model, or, like the others, during evaluation?

Yes, I got the error while training/fine-tuning.
I used the following commands:

  1. th train_new.lua -max_iters 250000 -finetune_cnn_after 100000
  2. th train_sc.lua -max_iters 150000 -start_from <model_of_step_1>.t7
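
For what it's worth, one quick sanity check before step 2 is to load the step-1 checkpoint and confirm the input width it was trained with. A hypothetical snippet, assuming a NeuralTalk2-style checkpoint layout (the path and field names are illustrative):

  -- Hypothetical sanity check; assumes the checkpoint stores its training
  -- options the way NeuralTalk2 does. The path is a placeholder.
  require 'torch'

  local checkpoint = torch.load('model_of_step_1.t7')
  print(checkpoint.opt.input_encoding_size)  -- input width the model was built with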