doubledaibo / gancaption_iccv2017

Towards Diverse and Natural Image Descriptions via a Conditional GAN

Input file structure

lineback opened this issue · comments

Could you please describe the structure of coco_cap_dataset.h5 and coco_cap_mappings.json? I had assumed that they would be the output of prepro.py from neuraltalk2, but that doesn't seem to be working for me. If you have some preprocessing scripts, a commit would be much appreciated. If not, I am more than happy to contribute if you supply the file structure.

Hi,
The input consists of two HDF5 files.
One file ('data') contains the images, the caption sequences, and the image index of each sequence.
The other file ('feat') contains the CNN features of the images.
h5ls 'data':
    img      Dataset {number of imgs, channels of imgs, height of imgs, width of imgs}
    imgid    Dataset {number of seqs}
    word     Dataset {number of seqs, length of seqs}
h5ls 'feat':
    feature  Dataset {number of imgs, length of features}
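
For reference, here is a minimal h5py sketch of that layout. The dataset names and shapes follow the h5ls listing above; the counts, image size, sequence length, feature dimension, and the name of the 'feat' file are placeholder assumptions, not values from the repo.

```python
# Minimal h5py sketch of the layout described above -- not the repo's own
# preprocessing script. Dataset names and shapes follow the h5ls listing;
# all sizes and the name of the 'feat' file are placeholder assumptions.
import h5py

num_imgs, num_seqs = 100, 500        # placeholder counts
C, H, W = 3, 256, 256                # image channels / height / width
seq_len, feat_dim = 16, 2048         # max caption length / CNN feature size

# 'data' file: images, caption sequences, and the image index of each sequence
with h5py.File('coco_cap_dataset.h5', 'w') as f:
    f.create_dataset('img', shape=(num_imgs, C, H, W), dtype='uint8')
    f.create_dataset('imgid', shape=(num_seqs,), dtype='int32')         # image index per sequence
    f.create_dataset('word', shape=(num_seqs, seq_len), dtype='int32')  # word indices per sequence

# 'feat' file (normally produced by extractFeatFromCNN.lua): one feature vector per image
with h5py.File('coco_cap_feats.h5', 'w') as f:  # file name is hypothetical
    f.create_dataset('feature', shape=(num_imgs, feat_dim), dtype='float32')
```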

Hi,

From DataLoader.lua, there seem to be three important files: coco_cap_dataset.h5, coco_cap_mappings.json, and a feature file that is created by extractFeatFromCNN.lua.

The feature file's contents are known. From your previous comment, coco_cap_dataset.h5 is the 'data' file containing the images, the sequences, and the image indices of the sequences. (Correct me if I'm wrong.)

What does coco_cap_mappings.json contain? Could you also do an h5ls and post it?

Thanks!

Dear lineback,
Would you be willing to contribute the preprocessing once the file structure is supplied? I would greatly appreciate your help!

@cbsudux @Happymarrow sorry for the late reply.
coco_cap_mappings.json contains two mappings, {"wtoi": {}, "itow": {}}, which map words to indices and indices to words, respectively. You can use https://github.com/karpathy/neuraltalk2/blob/master/prepro.py as a reference.
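
For illustration, here is a small sketch of how such a mappings file could be built, loosely following the vocabulary step in neuraltalk2's prepro.py. The tokenization, frequency threshold, and 1-based indexing are assumptions, not the repo's exact preprocessing.

```python
# Sketch of building coco_cap_mappings.json -- modeled loosely on the
# vocabulary step in neuraltalk2's prepro.py, not the repo's own script.
import json

def build_mappings(captions, min_count=5):
    # count word frequencies over all (already tokenized) captions
    counts = {}
    for cap in captions:
        for w in cap.split():
            counts[w] = counts.get(w, 0) + 1
    vocab = [w for w, c in counts.items() if c >= min_count]
    # 1-based indices (an assumption, chosen to match Lua/Torch conventions)
    wtoi = {w: i + 1 for i, w in enumerate(vocab)}   # word  -> index
    itow = {i + 1: w for i, w in enumerate(vocab)}   # index -> word
    return {'wtoi': wtoi, 'itow': itow}

if __name__ == '__main__':
    caps = ['a man riding a horse', 'a dog lying on a couch']
    with open('coco_cap_mappings.json', 'w') as f:
        json.dump(build_mappings(caps, min_count=1), f)
```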