yaoyao-liu / meta-transfer-learning

TensorFlow and PyTorch implementation of "Meta-Transfer Learning for Few-Shot Learning" (CVPR 2019)

Home Page: https://lyy.mpi-inf.mpg.de/mtl/


Question about the PyTorch implementation

bellos1203 opened this issue · comments

Hello, thanks for your marvelous work and the released code for the community.

I have run your PyTorch version of the code, and I got the following results on mini-ImageNet:
5-way 1-shot: 61.22 +- 0.0089
5-way 5-shot: 78.05 +- 0.0059

These are a bit higher than the results you reported in the paper and in this repo.
Is the difference caused by the pre-trained model or the random seed (I have not tried a different seed yet), or is there a bug in the PyTorch version?

Thanks in advance.

Thanks for your interest in our work.

The default network architecture for the PyTorch implementation is a 25-layer ResNet, which is deeper than the network architecture (ResNet-12) used in the original paper and the TensorFlow implementation. So the performance is a little higher.

@yaoyao-liu Hi, I used the ResNet-12 described in the paper as the network architecture, and got the following results on mini-ImageNet:
5-way 1-shot: 56.56 +- 0.87
5-way 5-shot: 72.95 +- 0.64
These are noticeably lower than the results reported in the paper. Would you also provide a ResNet-12 version of the code? Thanks!

Hi @eezywu,

Thanks for your interest in our work. The ResNet-12 implementation is provided in the TensorFlow version (https://github.com/yaoyao-liu/meta-transfer-learning/tree/master/tensorflow).

If you'd like to run ResNet-12 with the PyTorch version, you may use the ResNet-12 provided in MetaOptNet (https://github.com/kjunelee/MetaOptNet).
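
For reference, here is a rough sketch of how that might be wired in. The import path and constructor arguments below are assumptions, not the repo's actual code; check MetaOptNet's `models/ResNet12_embedding.py` for the real interface.

```python
# Rough sketch, not official code: swapping the default 25-layer ResNet
# for MetaOptNet's ResNet-12 as the feature extractor. The import path
# below assumes the MetaOptNet file was copied into this repo.
import torch
import torch.nn as nn

from models.resnet12_embedding import resnet12  # hypothetical path


class Encoder(nn.Module):
    """ResNet-12 feature extractor replacing the default backbone."""

    def __init__(self):
        super().__init__()
        # avg_pool=True collapses the spatial dims to one vector per image;
        # the exact keyword arguments may differ in MetaOptNet.
        self.backbone = resnet12(avg_pool=True)

    def forward(self, x):
        feats = self.backbone(x)
        return feats.view(feats.size(0), -1)  # flatten to (batch, dim)


if __name__ == "__main__":
    # Quick shape check on fake mini-ImageNet-sized input (84x84).
    enc = Encoder()
    print(enc(torch.randn(4, 3, 84, 84)).shape)  # e.g. torch.Size([4, 640])
```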

If you have any further questions, feel free to email me, or leave additional comments on this issue.

Sorry for my late reply.
I didn't notice that the architectures were different!
After I re-implemented the ResNet-12 architecture, I got the following result:
mini-ImageNet, 5-way 5-shot: 75.36 +- 0.61

I think the remaining difference comes from differences between the frameworks (PyTorch vs. TensorFlow) or from some randomness.
By the way, I think the inner loop (lines 166 to 169 in 'trainer/meta.py') should be modified: the loss is accumulated and applied sequentially, task by task, rather than being aggregated across the task batch into a single meta-update. A minimal sketch of what I mean is below.
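
This is a minimal sketch with hypothetical function names, not the repo's actual code. It assumes a MAML-style setup where `inner_loop` adapts the fast weights on the support set and returns the adapted model's loss on the query set; the query losses for all tasks in a meta-batch are then averaged before a single meta-optimizer step.

```python
# Hypothetical sketch of a meta-batch update, NOT the repository's code.
# Losses from every task in the batch are averaged, then one meta-step
# is taken, instead of stepping after each task sequentially.
import torch


def meta_train_step(model, meta_optimizer, task_batch, inner_loop):
    """task_batch: list of (support, query) episodes for one meta-step."""
    meta_optimizer.zero_grad()
    total_loss = 0.0
    for support, query in task_batch:
        # inner_loop is assumed to adapt a copy of the fast weights on
        # `support` and return the adapted model's loss on `query`,
        # keeping the graph so gradients flow back to the slow weights.
        total_loss = total_loss + inner_loop(model, support, query)
    # Average so the gradient scale does not depend on the batch size,
    # then take a single step of the meta-optimizer.
    (total_loss / len(task_batch)).backward()
    meta_optimizer.step()
```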

Thank you for your response again.

Hi @bellos1203,

Thanks for reporting your results.

The implementation of the task batches has not been added to the PyTorch version. I'll fix this when I have time.

If you have any further questions on our work, feel free to add more comments, or create a new issue.