Input file structure
lineback opened this issue · comments
Could you please describe the structure of coco_cap_dataset.h5 and coco_cap_mappings.json? I had assumed that they would be the output of prepro.py from neuraltalk2, but that doesn't seem to be working for me. If you have some preprocessing scripts, a commit would be much appreciated. If not, I am more than happy to contribute if you supply the file structure.
Hi,
The input consists of two h5 files.
One file ('data') contains the images, the word sequences, and the image index of each sequence.
Another file ('feat') contains features of the images.
h5ls 'data':
img Dataset {number of imgs, channel of imgs, height of imgs, width of imgs}
imgid Dataset {number of seqs}
word Dataset {number of seqs, length of seqs}
h5ls 'feat':
feature Dataset {number of imgs, length of features}
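For anyone who wants to test the loader before writing real preprocessing, here is a minimal sketch in Python (using h5py and NumPy) that writes two dummy files with the dataset names and dimension order from the h5ls listing above. Everything else is an assumption and not confirmed in this thread: the function name write_dummy_inputs, the dtypes, the 0-based image indexes (a Lua loader may well expect 1-based ones), and the idea that every image has the same number of sequences.

```python
import os
import tempfile

import h5py
import numpy as np


def write_dummy_inputs(data_path, feat_path, num_imgs=4, seqs_per_img=5,
                       seq_len=16, img_shape=(3, 8, 8), feat_len=32):
    """Write placeholder 'data' and 'feat' files with the layout from the listing.

    Hypothetical helper: only the dataset names and the dimension order come
    from the thread; sizes, dtypes and index base are assumptions.
    """
    num_seqs = num_imgs * seqs_per_img
    rng = np.random.default_rng(0)

    with h5py.File(data_path, "w") as f:
        # img: {number of imgs, channels, height, width}
        imgs = rng.integers(0, 256, size=(num_imgs, *img_shape), dtype=np.uint8)
        f.create_dataset("img", data=imgs)
        # imgid: {number of seqs}; imgid[k] is the image that sequence k describes.
        # ASSUMPTION: 0-based here; the Lua side may need 1-based indexes.
        imgid = np.repeat(np.arange(num_imgs, dtype=np.int64), seqs_per_img)
        f.create_dataset("imgid", data=imgid)
        # word: {number of seqs, length of seqs}, filled with random word indexes.
        words = rng.integers(1, 100, size=(num_seqs, seq_len), dtype=np.int64)
        f.create_dataset("word", data=words)

    with h5py.File(feat_path, "w") as f:
        # feature: {number of imgs, length of features}
        feats = rng.random((num_imgs, feat_len), dtype=np.float32)
        f.create_dataset("feature", data=feats)


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    data_file = os.path.join(out_dir, "data.h5")
    feat_file = os.path.join(out_dir, "feat.h5")
    write_dummy_inputs(data_file, feat_file)
    with h5py.File(data_file, "r") as f:
        for name in ("img", "imgid", "word"):
            print(name, f[name].shape)
```

Running h5ls on the two generated files should show the same four dataset names as above, which makes it easy to compare against a real file.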
Hi,
From DataLoader.lua, there seem to be three important files: coco_cap_dataset.h5, coco_cap_mappings.json, and a feature file that is created in extractFeatFromCNN.lua.
The feature file's contents are known. From your previous comment, coco_cap_dataset.h5 is the 'data' file containing the images, sequences and image indexes of the sequences. (Correct me if I'm wrong.)
What does coco_cap_mappings.json contain? Could you also do an h5ls and post it?
Thanks!
Dear lineback,
Would you be willing to share the file structure you ended up with? I would really appreciate your help!
@cbsudux @Happymarrow sorry for the late reply.
coco_cap_mappings.json contains two mappings, {"wtoi": {}, "itow": {}}, which map word to index and index to word respectively. You can use https://github.com/karpathy/neuraltalk2/blob/master/prepro.py as a reference.
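As a rough illustration of that JSON layout, the sketch below builds both mappings from a list of captions and writes them out. Only the two top-level keys, "wtoi" and "itow", come from the reply above. The function name build_mappings, the whitespace tokenization, the min_count threshold and the 1-based indexes are assumptions made for this example, so check them against prepro.py and DataLoader.lua before relying on them.

```python
import json


def build_mappings(captions, min_count=1):
    """Return {"wtoi": ..., "itow": ...} for a list of caption strings.

    Hypothetical helper; tokenization and index base are assumptions.
    """
    counts = {}
    for caption in captions:
        for word in caption.lower().split():
            counts[word] = counts.get(word, 0) + 1

    # Keep words seen at least min_count times, in a stable (sorted) order.
    vocab = sorted(w for w, c in counts.items() if c >= min_count)

    # ASSUMPTION: indexes start at 1, which suits Lua's 1-based tables.
    wtoi = {w: i + 1 for i, w in enumerate(vocab)}
    # JSON object keys are always strings, so store the index as a string.
    itow = {str(i): w for w, i in wtoi.items()}
    return {"wtoi": wtoi, "itow": itow}


if __name__ == "__main__":
    mappings = build_mappings(["a dog runs", "a cat"])
    with open("coco_cap_mappings.json", "w") as f:
        json.dump(mappings, f)
    print(mappings["wtoi"])
```

Note that after a round trip through JSON the keys of "itow" are strings, so a reader has to convert them back to numbers if it needs numeric lookups.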