yaoyao-liu / meta-transfer-learning

TensorFlow and PyTorch implementation of "Meta-Transfer Learning for Few-Shot Learning" (CVPR 2019)

Home Page: https://lyy.mpi-inf.mpg.de/mtl/

A question about the task number in the meta-training phase

Sword-keeper opened this issue · comments

Nice code! But I have a question about the number of tasks used in the meta-training phase. In the paper, Sec. 5.1, I noticed that you sample 8k tasks during meta-training, but in your PyTorch code the number is set to 100. So I wonder whether the performance with the code's setting is lower than with the paper's setting.
Besides, I am trying to add something to your model. However, limited by my device, following the paper's setting (8k tasks) is hard for me. Do you have a performance record for the code's setting (100 tasks)? If you have, please send it to me, so I can compare my model correctly. Thank you!

Hi @Sword-keeper,

Thanks for your interest in our work.
The PyTorch implementation is built upon the open-source code of FEAT, so we follow its settings. If you hope to reproduce the experiments in the CVPR paper, please use the TensorFlow version.
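For context on where the task count enters the training loop, here is a minimal, hypothetical sketch of a FEAT-style episodic sampler in PyTorch. The names (`EpisodicSampler`, `n_task`, etc.) are illustrative, not the repo's actual API; `n_task` is the quantity in question, i.e., the number of tasks (episodes) sampled per epoch.

```python
import numpy as np
import torch

class EpisodicSampler:
    """Hypothetical sketch of an episodic (task) sampler; names are
    illustrative, not the repo's actual API. `n_task` is the number
    of tasks sampled per epoch (set to 100 in the PyTorch code)."""

    def __init__(self, labels, n_task, n_way, n_shot, n_query):
        self.n_task = n_task
        self.n_way = n_way
        self.n_per_class = n_shot + n_query
        labels = np.asarray(labels)
        # Pre-compute which dataset indices belong to each class.
        self.class_indices = [np.flatnonzero(labels == c)
                              for c in np.unique(labels)]

    def __len__(self):
        return self.n_task

    def __iter__(self):
        for _ in range(self.n_task):
            # One task: n_way random classes, n_shot + n_query images each.
            classes = np.random.permutation(len(self.class_indices))[:self.n_way]
            batch = []
            for c in classes:
                idx = self.class_indices[c]
                batch.append(np.random.permutation(idx)[:self.n_per_class])
            yield torch.from_numpy(np.concatenate(batch))

# Usage with a DataLoader (dataset assumed to expose integer labels):
# sampler = EpisodicSampler(dataset.labels, n_task=100, n_way=5,
#                           n_shot=1, n_query=15)
# loader = torch.utils.data.DataLoader(dataset, batch_sampler=sampler)
```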

The performance of the PyTorch code on miniImageNet is as follows:

Pre-train, 5-way 1-shot: 0.5777
Meta-train:
  5-way 1-shot: Val Best Epoch 94, Acc 0.6561, Test Acc 0.6183 ± 0.0079
  5-way 5-shot: Val Best Epoch 52, Acc 0.8108, Test Acc 0.7810 ± 0.0062

For more details, you may refer to this issue.

Hi, I ran into another problem while doing my experiments. My GPU has 8 GB of memory, and when I run my model (with my additions to yours), it throws an 'out of memory' error. I found that if I reduce the query shot number from 15 to 10 or lower, it can run on my GPU. My questions are as follows:

  1. I think the query shot number acts just like the 'batch size' in common training (batch size = 5 × 15). Is that right?
  2. Is this setting (query shot = 15) a common setting in FSL? If I set a lower query shot number, is the result still reliable?

Hi @Sword-keeper,

  1. The number of query samples in an episode is not exactly the same as the "batch size", but they share some similar properties; e.g., using more query samples during meta-training improves performance and training stability. For more details, please refer to the original paper. (See the sketch after this list.)

  2. For this setting (query shot = 15), we follow MAML. Many related papers use the same setting.
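To make point 1 concrete, here is a self-contained sketch of how all N × Q query samples are averaged in one episode's loss, which is why the query shot behaves like a batch size. It uses random tensors in place of embeddings and a prototypical-network-style head purely for illustration (MTL itself adapts a classifier instead); all names here are mine.

```python
import torch
import torch.nn.functional as F

# One 5-way episode: N classes, K support shots, Q query shots per class.
N, K, Q, D = 5, 1, 15, 64             # D: embedding size (illustrative)

support = torch.randn(N * K, D)       # stand-in for embedded support images
query = torch.randn(N * Q, D)         # stand-in for embedded query images
query_labels = torch.arange(N).repeat_interleave(Q)

# Class prototypes from the support set (prototypical-network-style head,
# used here only for illustration; MTL adapts a classifier instead).
prototypes = support.reshape(N, K, D).mean(dim=1)

# The episode loss averages over all N * Q query samples, so Q plays a
# role similar to batch size: larger Q gives a less noisy meta-gradient.
logits = -torch.cdist(query, prototypes)
loss = F.cross_entropy(logits, query_labels)
print(f"loss {loss.item():.4f}, averaged over {N * Q} query samples")
```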

My suggestion is as follows:
You may reduce the hyperparameter "query shot" to a relatively small value, as long as it still achieves performance similar to that reported in the paper. Do not set the query shot to a very small value, e.g., 1 or 2; such small values are not reasonable because they make the meta-training loss very noisy.
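As a back-of-the-envelope check of the memory side (my own arithmetic, not from the repo): a single N-way episode forwards N × (K + Q) images at once, so lowering Q shrinks the per-episode batch roughly linearly.

```python
# One 5-way 1-shot episode forwards N * (K + Q) images at once.
N, K = 5, 1
for Q in (15, 10, 5, 2):
    print(f"query shot {Q:2d}: {N * (K + Q):3d} images per episode, "
          f"meta-loss averaged over {N * Q:2d} query samples")
```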

Thank you for your suggestions!