yaoyao-liu / meta-transfer-learning

TensorFlow and PyTorch implementation of "Meta-Transfer Learning for Few-Shot Learning" (CVPR 2019)

Home Page: https://lyy.mpi-inf.mpg.de/mtl/


Question about the PyTorch implementation

bellos1203 opened this issue · comments

Hello, thanks for your marvelous work and the released code for the community.

I have run your PyTorch version of the code, and I got the following results on mini-ImageNet:
5-way 1-shot: 61.22 +- 0.0089
5-way 5-shot: 78.05 +- 0.0059

These are a bit higher than the results you reported in the paper and in this repo.
Is the difference caused by the pre-trained model or the random seed (I have not tried a different seed yet), or is there a bug in the PyTorch version?

Thanks in advance.

Thanks for your interest in our work.

The default network architecture for the PyTorch implementation is a 25-layer ResNet, which is deeper than the network architecture (ResNet-12) used in the original paper and the TensorFlow implementation. So the performance is a little higher.

@yaoyao-liu Hi, I used the ResNet-12 described in the paper as the network architecture, and got the following results on mini-ImageNet:
5-way 1-shot: 56.56 +- 0.87
5-way 5-shot: 72.95 +- 0.64
These are noticeably lower than the results reported in the paper. Would you also provide a ResNet-12 version of the code? Thanks!

Hi @eezywu,

Thanks for your interest in our work. The ResNet-12 implementation is provided in the TensorFlow version (https://github.com/yaoyao-liu/meta-transfer-learning/tree/master/tensorflow).

If you'd like to run ResNet-12 with the PyTorch version, you may use the ResNet-12 provided in MetaOptNet (https://github.com/kjunelee/MetaOptNet).
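
For reference, here is a rough sketch of how that might be wired in. The import path and constructor arguments below are assumptions, not the repo's actual code; check MetaOptNet's `models/ResNet12_embedding.py` for the real interface.

```python
# Rough sketch, not official code: swapping the default 25-layer ResNet
# for MetaOptNet's ResNet-12 as the feature extractor. The import path
# below assumes the MetaOptNet file was copied into this repo.
import torch
import torch.nn as nn

from models.resnet12_embedding import resnet12  # hypothetical path


class Encoder(nn.Module):
    """ResNet-12 feature extractor replacing the default backbone."""

    def __init__(self):
        super().__init__()
        # avg_pool=True collapses the spatial dims to one vector per image;
        # the exact keyword arguments may differ in MetaOptNet.
        self.backbone = resnet12(avg_pool=True)

    def forward(self, x):
        feats = self.backbone(x)
        return feats.view(feats.size(0), -1)  # flatten to (batch, dim)


if __name__ == "__main__":
    # Quick shape check on fake mini-ImageNet-sized input (84x84).
    enc = Encoder()
    print(enc(torch.randn(4, 3, 84, 84)).shape)  # e.g. torch.Size([4, 640])
```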

If you have any further questions, feel free to email me, or leave additional comments on this issue.

Sorry for my late reply.
I didn't notice that the architectures were different!
After I re-implemented the ResNet-12 architecture, I got the following result:
mini-ImageNet, 5-way 5-shot: 75.36 +- 0.61

I think the remaining difference comes from differences between the frameworks (PyTorch vs. TensorFlow) or from some randomness.
By the way, I think the inner loop (lines 166 to 169 in 'trainer/meta.py') should be modified: the loss is accumulated and applied sequentially, task by task, rather than being aggregated across the task batch into a single meta-update. A minimal sketch of what I mean is below.
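
This is a minimal sketch with hypothetical function names, not the repo's actual code. It assumes a MAML-style setup where `inner_loop` adapts the fast weights on the support set and returns the adapted model's loss on the query set; the query losses for all tasks in a meta-batch are then averaged before a single meta-optimizer step.

```python
# Hypothetical sketch of a meta-batch update, NOT the repository's code.
# Losses from every task in the batch are averaged, then one meta-step
# is taken, instead of stepping after each task sequentially.
import torch


def meta_train_step(model, meta_optimizer, task_batch, inner_loop):
    """task_batch: list of (support, query) episodes for one meta-step."""
    meta_optimizer.zero_grad()
    total_loss = 0.0
    for support, query in task_batch:
        # inner_loop is assumed to adapt a copy of the fast weights on
        # `support` and return the adapted model's loss on `query`,
        # keeping the graph so gradients flow back to the slow weights.
        total_loss = total_loss + inner_loop(model, support, query)
    # Average so the gradient scale does not depend on the batch size,
    # then take a single step of the meta-optimizer.
    (total_loss / len(task_batch)).backward()
    meta_optimizer.step()
```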

Thank you for your response again.

Hi @bellos1203,

Thanks for reporting your results.

The implementation of the task batches has not been added to the PyTorch version. I'll fix this when I have time.

If you have any further questions on our work, feel free to add more comments, or create a new issue.