chenyilun95 / tf-cpn

Cascaded Pyramid Network for Multi-Person Pose Estimation (CVPR 2018)

How about training from scratch?

kaleidoscopical opened this issue · comments

Hi! Thanks for sharing such wonderful work.
I wonder whether you have tried a ResNet backbone without ImageNet pre-training?
Is it possible that the pre-trained model is one of the keys to the performance improvement?

I haven't tried it, but I think it would hurt the performance. Google also used a ResNet with dilation in the COCO 2016 competition. Their performance is also compared in the table in our paper, and our improvement is based on their result.

Thanks for your kind reply. Sorry for my curiosity, but I still have some questions.

I think the result from Google may not be strictly comparable here. Their detection part is too weak and does not even use an FPN. Although better detection does not lead to better keypoint estimation, their result is even lower than the FPN baseline. Besides, several other factors make the comparison unfair, e.g. the number of training iterations, the input size, and the training strategy.

Would it be fairer to compare a CPN (ResNet-50) without pre-training against a 2-stage hourglass model? In that case, they have the same FLOPs and no other prior information.

  1. Whether a pre-trained ResNet helps or not, the improvement on top of it is our contribution.
  2. We compare our model with ResNet and the hourglass network under the same training configuration in our paper.
  3. Someone else has tried training from scratch and said the result was almost the same.

That's all my comments. Thank you.