facebookresearch / deepmask

Torch implementation of DeepMask and SharpMask

Error when running train.lua

noahcao opened this issue

Hi, I hit an error when running train.lua to train DeepMask. It returned a long stack traceback, shown below:

-- ignore option rundir
-- ignore option dm
-- ignore option reload
-- ignore option gpu
-- ignore option datadir
| running in directory /home/jinkun/coco/deepmask/deepmask/exps/deepmask/exp
| number of paramaters trunk: 15198016
| number of paramaters mask branch: 1608768
| number of paramaters score branch: 526337
| number of paramaters total: 17333121
/home/jinkun/torch/install/bin/luajit: ...e/jinkun/torch/install/share/lua/5.1/threads/threads.lua:183: [thread 2 callback] /home/jinkun/torch/install/share/lua/5.1/coco/CocoApi.lua:142: Expected value but found T_COLON at character 1
stack traceback:
[C]: in function 'decode'
/home/jinkun/torch/install/share/lua/5.1/coco/CocoApi.lua:142: in function '__convert'
/home/jinkun/torch/install/share/lua/5.1/coco/CocoApi.lua:128: in function '__init'
/home/jinkun/torch/install/share/lua/5.1/torch/init.lua:91: in function </home/jinkun/torch/install/share/lua/5.1/torch/init.lua:87>
[C]: in function 'CocoApi'
/home/jinkun/coco/deepmask/deepmask/DataSampler.lua:25: in function '__init'
/home/jinkun/torch/install/share/lua/5.1/torch/init.lua:91: in function </home/jinkun/torch/install/share/lua/5.1/torch/init.lua:87>
[C]: in function 'DataSampler'
/home/jinkun/coco/deepmask/deepmask/DataLoader.lua:36: in function </home/jinkun/coco/deepmask/deepmask/DataLoader.lua:30>
[C]: in function 'xpcall'
...e/jinkun/torch/install/share/lua/5.1/threads/threads.lua:234: in function 'callback'
/home/jinkun/torch/install/share/lua/5.1/threads/queue.lua:65: in function </home/jinkun/torch/install/share/lua/5.1/threads/queue.lua:41>
[C]: in function 'pcall'
/home/jinkun/torch/install/share/lua/5.1/threads/queue.lua:40: in function 'dojob'
[string " local Queue = require 'threads.queue'..."]:13: in main chunk
stack traceback:
[C]: in function 'error'
...e/jinkun/torch/install/share/lua/5.1/threads/threads.lua:183: in function 'dojob'
...e/jinkun/torch/install/share/lua/5.1/threads/threads.lua:264: in function 'synchronize'
...e/jinkun/torch/install/share/lua/5.1/threads/threads.lua:142: in function 'specific'
...e/jinkun/torch/install/share/lua/5.1/threads/threads.lua:125: in function 'Threads'
/home/jinkun/coco/deepmask/deepmask/DataLoader.lua:40: in function '__init'
/home/jinkun/torch/install/share/lua/5.1/torch/init.lua:91: in function </home/jinkun/torch/install/share/lua/5.1/torch/init.lua:87>
[C]: in function 'DataLoader'
/home/jinkun/coco/deepmask/deepmask/DataLoader.lua:21: in function 'create'
train.lua:101: in main chunk
[C]: in function 'dofile'
...nkun/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
convert: data//annotations/instances_train2014.json --> .t7 [please be patient]
convert: data//annotations/instances_train2014.json --> .t7 [please be patient]

Not sure if you've fixed the problem yet, but if not: the error suggests there is a colon or some other unexpected character at position 1 of your JSON file. I'd check that and see whether removing it fixes the problem.
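One quick way to check what sits at the start of the annotation file is to read its first few bytes. The following is only a minimal sketch: `describe_json_start` is a hypothetical helper name, and the path in the usage note is taken from the `convert:` lines of the log above, so adjust it to your own setup.

```python
import codecs


def describe_json_start(path, n=16):
    """Report how the first n bytes of a file would look to a JSON parser."""
    with open(path, "rb") as f:
        head = f.read(n)
    if not head:
        return "empty file"
    # A UTF-8 byte order mark before the first brace confuses some decoders.
    if head.startswith(codecs.BOM_UTF8):
        return "starts with a UTF-8 byte order mark"
    stripped = head.lstrip(b" \t\r\n")
    if not stripped:
        return "only whitespace in the first bytes"
    # A JSON document normally opens with an object or an array.
    if stripped[:1] in (b"{", b"["):
        return "looks like JSON"
    return "unexpected leading bytes: %r" % stripped[:8]
```

For example, `print(describe_json_start("data/annotations/instances_train2014.json"))` prints one of the short descriptions above. Anything other than "looks like JSON" would point at the file itself rather than at Torch.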

Thanks. It seems this problem occurs either when something goes wrong while writing the JSON file, or when using a Torch build based on LuaJIT. (I rewrote the JSON file and switched to a Torch build based on Lua 5.2, and the problem was fixed.)

Ok, sounds good. I have a similar error and, looking through previous issues, saw that someone else had hit it too. I might have to do the same to fix my problem. I tried rewriting and editing the JSON file to see if that would fix it, but nothing changed. Guess I have to use Lua 5.2 :/ Thanks for the insight and help!

Hmmm, I'm still getting the error even after switching to Lua 5.2. How did you rewrite the JSON file? I tried using split and cat, but it still resulted in the same error. Also, did you manage to get computeProposals.lua to run on Lua 5.2 without getting the unknown object error?
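The thread does not say how the JSON file was rewritten. One plausible reading, and it is only an assumption, is to parse the file and serialize it again, which fails loudly if the file is damaged and produces a clean copy otherwise. A minimal sketch, with `rewrite_json` as a hypothetical helper name:

```python
import json


def rewrite_json(src, dst):
    """Parse src and write it back out as compact JSON.

    Raises ValueError (json.JSONDecodeError) if src is not valid JSON.
    """
    # "utf-8-sig" silently drops a leading byte order mark if one is present.
    with open(src, "r", encoding="utf-8-sig") as f:
        data = json.load(f)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return dst
```

Note that this loads the whole annotation file into memory, so it needs enough RAM to hold the parsed annotations. Unlike splitting and concatenating with split and cat, which reproduces the same bytes, this actually re-encodes the content.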

I'm running into a similar issue. The error I got was "Expected value but found T_END at character 1", not T_COLON.

I was following this section https://github.com/facebookresearch/deepmask#training-your-own-model and downloaded/extracted the COCO train/val 2014 dataset.

Anyone willing to share how to fix it?

Thanks!
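For the T_END variant, an error at character 1 would be consistent with the decoder receiving empty input, for example from an empty or incompletely extracted file, though nothing in this thread confirms that. A minimal sketch for ruling it out before running train.lua follows; `check_annotation_file` is a hypothetical helper, and the required keys are the top-level keys of a COCO instances annotation file.

```python
import json
import os


def check_annotation_file(path, required=("images", "annotations", "categories")):
    """Return a list of problems found; an empty list means the file passed."""
    if not os.path.isfile(path):
        return ["file not found: %s" % path]
    if os.path.getsize(path) == 0:
        return ["file is empty (0 bytes)"]
    try:
        with open(path, "r", encoding="utf-8-sig") as f:
            data = json.load(f)
    except ValueError as err:  # covers JSON syntax errors and bad encodings
        return ["not valid JSON: %s" % err]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    return ["missing key: %s" % k for k in required if k not in data]
```

Running it on `data/annotations/instances_train2014.json` (the path shown in the log above) and getting an empty list would suggest the download and extraction are fine, which shifts suspicion toward the Torch/Lua side discussed earlier in the thread.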