KD issues
lbin opened this issue · comments
If I use res50 (34.9) as the teacher and res18 (30.2) as the student, and also train for 140 epochs, would I get a better result (>31.0 mAP on COCO)?
My best result with a res50 (34.9) teacher is res18 at 30.6 after 140 epochs, so KD only improved the student by 0.4 mAP.
Maybe you can get a better result.
Sadly, I don't have the time or the GPUs to run this experiment.
Another user on Zhihu told me that he got a 2% mAP improvement with a MobileNet backbone on his custom dataset after using my KD code in CenterNet.
Early on, I noticed this small improvement too, so I also experimented with ImageNet scratch KD and multi-teacher KD. Both performed well compared to the baseline.
By the way, I will explore stronger KD methods in centerX_v2 (which is modified from ttfnet). More KD methods will be released there if my KD experiments succeed.
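
For readers unfamiliar with the setup discussed above, here is a minimal sketch of how a heatmap KD term with one or more teachers could be combined with the normal detection loss. It is not the code from this repo: the function names (`heatmap_kd_loss`, `total_loss`), the flat-list heatmaps, the MSE distance, the plain averaging over teachers, and `kd_weight` are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Map a raw heatmap logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def heatmap_kd_loss(student_logits, teacher_logits_list):
    """Hypothetical KD term: MSE between the student heatmap and the
    average of the teachers' heatmaps (all given as flat lists of logits).

    Passing a single teacher gives ordinary single-teacher KD; passing
    several gives a simple multi-teacher variant (assumed: equal weights).
    """
    n_teachers = len(teacher_logits_list)
    total = 0.0
    for i, s in enumerate(student_logits):
        # Soft target: mean of the teachers' probabilities at this position.
        target = sum(sigmoid(t[i]) for t in teacher_logits_list) / n_teachers
        total += (sigmoid(s) - target) ** 2
    return total / len(student_logits)


def total_loss(task_loss, kd_loss, kd_weight=1.0):
    """Assumed combination: ground-truth detection loss plus weighted KD term."""
    return task_loss + kd_weight * kd_loss


# Example: one student heatmap (res18-like), two teacher heatmaps.
student = [0.2, -1.0, 3.0]
teachers = [[0.5, -2.0, 4.0], [0.1, -1.5, 3.5]]
kd = heatmap_kd_loss(student, teachers)
loss = total_loss(task_loss=1.25, kd_loss=kd, kd_weight=0.5)
```

In this sketch `kd_weight` controls how strongly the student follows the teachers relative to the ground truth. It would need tuning in any real experiment.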