Question about unlabeled data

johnson111788 opened this issue 2 years ago · comments

Hi

Thanks for your excellent work. I am applying your code to my dataset, which has a single category: each image is either positive (with lesions) or negative (without lesions).

I would like to know whether I should treat these negative images as unlabeled data or discard them during training.

Best,
Johnson

Yu-Cheng Chou · Answer 1 · Wed Nov 02 2022 10:07:53 GMT+0800 (China Standard Time)

By the way, there are some bugs in train_retinanet.py during evaluation, including:

- missing argument `pure_retina` for `forward()` in coco_eval.py
- wrong indexing of `coco_eval.stats` by `metric_names[i]`
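
For readers hitting the second bug: in pycocotools, `COCOeval.summarize()` stores the bbox summary in `stats` as a positional array of 12 values, so the values have to be looked up by position rather than by name. The snippet below is only a sketch of one way to pair names with positions. The short metric names and the helper `name_stats` are hypothetical and not taken from the repository, and the numbers are placeholders rather than real results.

```python
# Order of the 12 bbox summary values that pycocotools' COCOeval.summarize()
# stores in `stats`. The short names are illustrative labels (an assumption).
COCO_BBOX_METRICS = [
    "AP", "AP50", "AP75", "AP_small", "AP_medium", "AP_large",
    "AR_1", "AR_10", "AR_100", "AR_small", "AR_medium", "AR_large",
]


def name_stats(stats, metric_names=COCO_BBOX_METRICS):
    """Pair each positional entry of `stats` with its metric name."""
    if len(stats) != len(metric_names):
        raise ValueError("stats and metric_names must have the same length")
    return {name: float(value) for name, value in zip(metric_names, stats)}


# Placeholder values only, standing in for coco_eval.stats after summarize().
example_stats = [0.1 * i for i in range(12)]
results = name_stats(example_stats)
```

With a mapping like this, the evaluation loop can log `results[name]` without depending on how `metric_names` happens to be ordered.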

LLYXC · Answer 2 · Wed Nov 02 2022 10:14:06 GMT+0800 (China Standard Time)

Hi Johnson,

Thanks for your interest. I think you can try the following options:

1. If you discard the negative images, you can use the current version as is. Since my previous work was on multi-label problems, few images had to be discarded, but if you have many negative images, I would suggest trying option 2 or 3.
2. If you treat the negative images as unlabeled data, you might gain more supervision from the consistency loss, which could improve performance. The risk is that you are likely to get more false positives.
3. You can also modify the loss so that the supervised RetinaNet loss still applies to the negative images. In my experience, this gives fewer false positives.
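
The third option (keeping the supervised RetinaNet loss active on negative images) can be illustrated with a minimal sketch. It is not the repository's loss: it assumes a single foreground class, takes per-anchor probabilities as plain Python lists, and uses the commonly cited focal loss defaults `alpha=0.25` and `gamma=2.0`. The function name `focal_loss_single_class` is hypothetical. The key point is that an image without lesions has only background anchors, and the normalizer is clamped to 1 so the loss is still defined.

```python
import math


def focal_loss_single_class(probs, labels, alpha=0.25, gamma=2.0):
    """Focal loss over the anchors of one image, for one foreground class.

    probs  : predicted foreground probability per anchor
    labels : 1 for anchors matched to a lesion box, 0 for background anchors
    An image without lesions simply has labels that are all 0.
    """
    eps = 1e-7
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        if y == 1:
            total += -alpha * (1.0 - p) ** gamma * math.log(p)
        else:
            total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    # Normalize by the number of positive anchors, but never by zero,
    # so that negative images still contribute a classification loss.
    num_pos = sum(labels)
    return total / max(num_pos, 1)


# A negative image: no anchor is matched, so every label is 0.
confident_background = focal_loss_single_class([0.01, 0.02, 0.01], [0, 0, 0])
false_alarm = focal_loss_single_class([0.01, 0.90, 0.01], [0, 0, 0])
```

Here `false_alarm` is larger than `confident_background`, so a confident detection on a lesion-free image is penalized. The box regression term would contribute nothing for such an image, since there are no matched anchors to regress.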

Also, many thanks for pointing out the bugs! I will update the code later.

Best,
Luyang