yaoyao-liu / meta-transfer-learning

TensorFlow and PyTorch implementation of "Meta-Transfer Learning for Few-Shot Learning" (CVPR2019)

Home Page: https://lyy.mpi-inf.mpg.de/mtl/


Validation accuracy in the pre-training phase in PyTorch

Sword-keeper opened this issue · comments

I remember you provided your best pre-trained model in an issue, and its validation accuracy is 64%. I want to modify your backbone, but the best validation accuracy in my pre-training phase is only 41%. I also re-ran your pre-training code and found the best validation accuracy was 48%. So, did you use any tricks when pre-training the model?

That model is trained using exactly the same code in the GitHub repository.

Please provide me with more information so that I might give you further suggestions. E.g., how you process the dataset, and what is your PyTorch version.

[screenshot: modified pre-validation forward code]
Firstly, when I used your provided 'max_acc.pth' to run the meta phase, it ran out of memory in the validation phase. When I added `with torch.no_grad()`, it ran smoothly, and the test accuracy matched your result. The same out-of-memory error also occurred in the pre-validation phase when I ran the pre-training phase, so I changed your preval forward code as shown above.

You should not add `with torch.no_grad()`, because we need to compute the gradients with `torch.autograd.grad`.
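To illustrate the point above, here is a minimal sketch (assuming PyTorch is installed; the tensors and names are illustrative, not from the repository): `torch.autograd.grad` needs the computation graph, which `torch.no_grad()` suppresses, so wrapping the inner-loop forward pass in `no_grad` breaks the base-learner update.

```python
import torch

w = torch.randn(3, requires_grad=True)
x = torch.randn(3)

# Normal forward: the graph is recorded, so torch.autograd.grad works.
# This is what a MAML-style base-learner update relies on.
loss = (w * x).sum()
grads = torch.autograd.grad(loss, w)
print(grads[0].shape)  # torch.Size([3])

# Under no_grad, the output is detached and gradient computation fails.
with torch.no_grad():
    loss_ng = (w * x).sum()
print(loss_ng.requires_grad)  # False
try:
    torch.autograd.grad(loss_ng, w)
except RuntimeError:
    print("autograd.grad fails under no_grad")
```

This is why the out-of-memory issue has to be solved some other way (e.g. smaller batches) rather than by disabling gradient tracking during meta validation.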

May I know what GPU you’re using?

My PyTorch version is 1.3.1, and my data preprocessing is the same as yours.

My GPU is an RTX 2080 (8 GB).
I put that part (computing the gradients with `torch.autograd.grad`) in the `optimize_base()` part, before the `with torch.no_grad()`. Did I do something wrong?

In your screenshot, you use a function named self.base. I guess it is a function added by you. Could you please provide me with the details of that function?

Other parts of your code look correct. If you cannot run meta validation during the pre-training phase, you may use normal validation over the 64 training classes instead. You may also try the pre-training code in DeepEMD and FEAT; we use the same pre-training strategy.
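A plain 64-way validation, as suggested above, needs no inner-loop gradients, so `torch.no_grad()` is safe there and the memory problem disappears. A hypothetical sketch (the function name, `model` interface, and loader are assumptions, not the repository's actual code):

```python
import torch

def plain_validation(model, loader, device="cuda"):
    """Standard classification validation for the pre-training phase.

    Assumes `model` maps a batch of images to class logits over the
    64 pre-training classes. Unlike meta (episodic) validation, no
    per-task adaptation happens, so no_grad is safe here.
    """
    model.eval()
    correct = total = 0
    with torch.no_grad():  # fine: no torch.autograd.grad call is needed
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            logits = model(images)
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
    return correct / total
```

This only tracks standard classification accuracy, so the numbers are not directly comparable to meta-validation accuracy, but it is enough to monitor pre-training progress.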

Oh, self.base is the base learner in your code. I will rerun this code once more and try other approaches. Thank you!

It seems your change is correct. I am not sure what makes your pre-training accuracy lower than expected; meta-validation accuracy should be around 60% after pre-training. I'll check the related code to see if there is any bug.

I also suggest running exactly the same code with our configuration (PyTorch 0.4.0) if possible. You may also try the other two methods I mentioned; both provide pre-training code.

When I ran your code with PyTorch 0.4.0 on an RTX 2080, there were some bugs in the base learner.
At `net = F.linear(input_x, fc1_w, fc1_b)` it raised `cublas runtime error: the GPU program failed to execute`.
I tried to fix it in many ways but failed. However, when I ran your code on a GTX 1060, it succeeded. So I updated PyTorch, and it ran again. There may be an incompatibility between the RTX 2080, the CUDA version, and the PyTorch version. If someone else hits this problem, you can tell them to change the GPU, PyTorch version, or CUDA version.
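This is consistent with a known class of failures: RTX 20xx (Turing) cards have compute capability 7.5 and need CUDA 10+, while old PyTorch 0.4.0 wheels were built against earlier CUDA versions, which can surface as cuBLAS runtime errors. A small diagnostic sketch (all calls are standard PyTorch APIs) to check the combination before running:

```python
import torch

# Report the PyTorch build and the CUDA version it was compiled against.
# torch.version.cuda is None for CPU-only builds.
print("torch:", torch.__version__, "| built with CUDA:", torch.version.cuda)

if torch.cuda.is_available():
    # Turing GPUs (e.g. RTX 2080) report compute capability 7.5 and
    # generally need a PyTorch build compiled against CUDA >= 10.
    major, minor = torch.cuda.get_device_capability(0)
    print("GPU:", torch.cuda.get_device_name(0),
          f"| compute capability: {major}.{minor}")
```

If the reported CUDA build predates what the GPU requires, upgrading PyTorch (as described above) is the straightforward fix.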

Thanks for reporting this issue.