LuoweiZhou / VLP

Vision-Language Pre-training for Image Captioning and Question Answering

Possible bugs when enable_butd is not set - decode_img2txt.py and seq2seq_loader.py

windspirit95 opened this issue

Hi,
I am trying to modify your decode_img2txt.py and seq2seq_loader.py to run inference on a single image (rather than a whole dataset such as COCO).
In that case we have neither region_bbox_file nor region_det_file_prefix (so enable_butd is False). Should we then uncomment these lines for the CNN model: 96, 171, 172, 181, 185?
The other issue (possibly a bug) is in seq2seq_loader.py: vis_pe is only defined in the "else" branch (from line 442, when enable_butd == True), so I wonder how to define it when enable_butd == False. Could it be []?
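To make the second point concrete, here is a minimal sketch of the pattern in question. It is not the actual code from seq2seq_loader.py: the function names and the string placeholders are hypothetical, and the choice of None as a default is only an assumption. Whether the downstream model accepts None or [] in place of vis_pe depends on how it consumes that value, which is not shown here.

```python
def build_features_buggy(enable_butd):
    # Hypothetical stand-in for the loader logic, not the real VLP code.
    if not enable_butd:
        img = "raw image"
    else:
        img = "region features"
        vis_pe = "region position embeddings"  # only bound in this branch
    # With enable_butd == False, vis_pe was never assigned, so this
    # return statement raises UnboundLocalError.
    return img, vis_pe


def build_features_fixed(enable_butd):
    # Bind vis_pe up front so that both branches return a defined value.
    # None is an assumed placeholder; the consumer must be able to handle it.
    vis_pe = None
    if not enable_butd:
        img = "raw image"
    else:
        img = "region features"
        vis_pe = "region position embeddings"
    return img, vis_pe
```

Calling build_features_buggy(False) fails at the return statement, while build_features_fixed(False) returns the image together with the placeholder.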
I am a newbie in NLP, so this is a little confusing for me. Thank you very much for open-sourcing your great work 💯

I have just found closed issue #26, which may be related to mine, so I will try the approach there and close this issue. Thank you ;)