AssertionError
kimo20200 opened this issue · comments
Hello,
I got an error when I tried to train the model. I have attached a link in the box. Do you have any idea about it?
Thank you.
link: https://drive.google.com/file/d/14cI1BLo7FZQ8IhGBuGoUzsXJtEEtlp65/view?usp=sharing
Best wishes,
Qing
Thanks for the interest in OXnet. I can't find the screenshot anywhere. Would you please try again?
I have updated it, sorry about that. Please see the first box.
In the current version of OXnet, we are sampling two types of data in one batch: labeled data and unlabeled data. Hence, you can see the usage of a two-stream batch sampler. The reported error is caused by
Line 532 in 2c99bb5
- the size of the second batch of data (for unlabeled data) should be at least 1;
- the total number of unlabeled data should be larger than or equal to the second batch size (defined here:)
Line 123 in 2c99bb5
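As a rough illustration (not the actual OXnet code; the function and argument names below are made up), the two conditions above amount to checks like:

```python
# Sketch of the two conditions above; identifiers are illustrative,
# not the names used in the OXnet repository.
def check_two_stream_batches(num_unlabeled, unlabeled_batch_size):
    """Raise AssertionError when the unlabeled stream cannot be sampled."""
    # condition 1: the unlabeled sub-batch must contain at least one sample
    assert unlabeled_batch_size >= 1, \
        "unlabeled batch size must be at least 1"
    # condition 2: there must be enough unlabeled images to fill one sub-batch
    assert num_unlabeled >= unlabeled_batch_size, \
        "need at least as many unlabeled images as the unlabeled batch size"
    return True
```

If either condition fails, the sampler cannot build a two-stream batch, which surfaces as the AssertionError in your screenshot.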
Thanks for your reply. I have prepared my own training (labeled) dataset in COCO-like format, but where should I include the unlabeled data? I am confused about it.
You can create two COCO-style json files for labeled data and unlabeled data and merge them into one single json file. Say you have N labeled images and M unlabeled images; then the first N entries in the merged file should be the labeled ones and the following M entries the unlabeled ones. Set N and N+M to be num_labeled_data
and num_data
in the following:
Lines 121 to 125 in 2c99bb5
Note that you can give empty or negative values to the annotations for the unlabeled data; they will not be used in training.
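For instance (the numbers here are made up, just to illustrate the convention): with 800 labeled and 200 unlabeled images merged in that order, the two values would be set as follows.

```python
# Illustrative numbers only: N labeled images come first in the merged json,
# followed by M unlabeled images.
N = 800   # labeled images
M = 200   # unlabeled images

num_labeled_data = N      # first N entries are labeled
num_data = N + M          # total entries in the merged json

print(num_labeled_data, num_data)  # 800 1000
```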
Thanks for pointing this out. I will clarify the instruction.
I really appreciate your clear explanation. Could you show an example of the json file or its structure? I think that would make it easier for users to understand.
A COCO-style json file is a dictionary with a structure like the following:
dict_keys(['categories', 'images', 'annotations'])
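As a toy illustration (the file names and the category are made up, not taken from the OXnet dataset), such a dictionary could look like:

```python
import json

# Minimal COCO-style dictionary: one category, one labeled image with a
# single box, and one unlabeled image that simply has no annotations.
coco = {
    "categories": [{"id": 1, "name": "nodule"}],  # hypothetical class name
    "images": [
        {"id": 1, "file_name": "labeled_0001.png",
         "width": 1024, "height": 1024},
        {"id": 2, "file_name": "unlabeled_0001.png",
         "width": 1024, "height": 1024},
    ],
    "annotations": [
        # the COCO bbox convention is [x, y, width, height]
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 150, 64, 48], "area": 64 * 48, "iscrowd": 0},
        # the unlabeled image (id 2) has no entries here
    ],
}

print(list(coco.keys()))  # ['categories', 'images', 'annotations']
```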
Say you now have a json file for the labeled data (labeled.json) and a json file for the unlabeled data (unlabeled.json); you can merge them into a single file for training as follows:
import json

# json.load expects a file object, not a path string
with open('labeled.json') as f:
    js_l = json.load(f)
with open('unlabeled.json') as f:
    js_u = json.load(f)

# note: image ids and annotation ids should stay unique after merging
merged = {'categories': js_l['categories'],
          'images': js_l['images'] + js_u['images'],
          'annotations': js_l['annotations'] + js_u['annotations']}

with open('train.json', 'w') as f:
    json.dump(merged, f)
Thank you a lot, I will try it later
Hi, when I was training, the Cls Cons loss and Reg Cons loss were always 0. They are calculated by the soft focal loss. Does it mean I added an incorrect unlabeled dataset?
You may need to pre-train a RetinaNet and use its weights to initialize the teacher model, so that the teacher can generate meaningful pseudo labels to be used in the soft focal loss.
I think the problem you mention might be that the teacher model does not generate any pseudo labels.
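To sketch the idea (weights are shown as plain dicts for illustration; the real models are PyTorch networks, and the exact update scheme in OXnet may differ):

```python
# Toy sketch: copy pre-trained RetinaNet weights into the teacher, then keep
# the teacher as an exponential moving average (EMA) of the student, as is
# common in mean-teacher-style semi-supervised training.
def init_teacher(pretrained_weights):
    """Initialize the teacher from pre-trained detector weights."""
    return dict(pretrained_weights)

def ema_update(teacher_w, student_w, momentum=0.999):
    """One EMA step: teacher <- m * teacher + (1 - m) * student."""
    return {k: momentum * teacher_w[k] + (1.0 - momentum) * student_w[k]
            for k in teacher_w}

teacher = init_teacher({'w': 1.0})        # from the pre-trained RetinaNet
teacher = ema_update(teacher, {'w': 0.0})  # one student step later
print(teacher['w'])  # 0.999
```

Without that initialization, an untrained teacher produces no confident pseudo labels, so the consistency losses stay at 0.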
Cool. Thank you.