jiasenlu / HieCoAttenVQA

About error from executing "train.lua"

rjh7008 opened this issue

I followed the steps in the README, but I got the error below.
It happens at the step that uses the VGG image features.
I don't know how to solve it.

-------------------contents of the error message---------------------------

iter 0: 6.952219, 0.011587, 0.000400, 0.397509
validation loss: =======6.9374270200729=accuracy =======0====== 5120/5000 =========] Tot: 3s108ms | Step: 0ms
wrote json checkpoint to save/train_vgg_Alternating/checkpoint.json.json
/home/user/vqainstall/distro-cl/install/bin/luajit: ...o-cl/install/share/lua/5.1/cudnn/TemporalConvolution.lua:92: bad argument #1 to 'size' (out of range)
stack traceback:
[C]: in function 'size'
...o-cl/install/share/lua/5.1/cudnn/TemporalConvolution.lua:92: in function 'updateGradInput'
...tall/distro-cl/install/share/lua/5.1/nngraph/gmodule.lua:350: in function 'neteval'
...tall/distro-cl/install/share/lua/5.1/nngraph/gmodule.lua:384: in function 'updateGradInput'
...vqainstall/distro-cl/install/share/lua/5.1/nn/Module.lua:30: in function 'backward'
./misc/phrase_level.lua:85: in function 'updateGradInput'
...vqainstall/distro-cl/install/share/lua/5.1/nn/Module.lua:30: in function 'backward'
train.lua:278: in function 'lossFun'
train.lua:312: in main chunk
[C]: in function 'dofile'
.../distro-cl/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405e90

That is a problem caused by the new Torch version. It is fixed now.

I also followed the steps in the README, and I got this error.
It looks similar to the closed issue, but I'm using the most recent version of Torch.
Can someone help me solve this problem?
By the way, this happens at the train.lua step.
---------------------------------------error--------------------------------
~
constructing clones inside the ques_level
total number of parameters in recursive_attention: 2862056

/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/nn/THNN.lua:110: input and gradOutput have different number of elements: input[20 x 26] has 520 elements, while gradOutput[26] has 26 elements at /home/user/torch/extra/cunn/lib/THCUNN/generic/SoftMax.cu:84
stack traceback:
[C]: in function 'v'
/home/user/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SoftMax_updateGradInput'
./misc/maskSoftmax.lua:33: in function 'updateGradInput'
.../user/torch/install/share/lua/5.1/nngraph/gmodule.lua:420: in function 'neteval'
.../user/torch/install/share/lua/5.1/nngraph/gmodule.lua:454: in function 'updateGradInput'
/home/user/torch/install/share/lua/5.1/nn/Module.lua:31: in function 'backward'
./misc/ques_level.lua:143: in function 'updateGradInput'
/home/user/torch/install/share/lua/5.1/nn/Module.lua:31: in function 'backward'
train.lua:274: in function 'lossFun'
train.lua:313: in main chunk
[C]: in function 'dofile'
...usr/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670

@jiasenlu can you tell me how you fixed the error?