yaoyao-liu / meta-transfer-learning

TensorFlow and PyTorch implementation of "Meta-Transfer Learning for Few-Shot Learning" (CVPR 2019)

Home Page: https://lyy.mpi-inf.mpg.de/mtl/

Question about the choice of base-learner

Sebastian-X opened this issue

You used ResNet-12 as the base-learner, and it's also a common choice in recent works. Does this mean that ResNet-12 is a particularly efficient model for few-shot learning? Is there any paper that discusses this? I went through your paper's related citations, but didn't really find any information about it.
Also, I see you deployed a ResNet version of MAML in your experiments whose performance overtook the original one's. Did you just change the base-learner of MAML and keep the other parts the same?

P.S. I like your paper; it's really intriguing.
[screenshot of the result attached]

Thanks for your interest in our work.

Answer to Q1: ResNet-12 is an example of a deeper network compared to 4CONV. It is not the most efficient network architecture; I use ResNet-12 in the paper for a fair comparison with related works. If you'd like to read about network architectures for few-shot learning, I suggest this paper: A Closer Look at Few-shot Classification.
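For context, here is a minimal PyTorch sketch of the two backbones being compared. The layer sizes, channel widths, and module names are illustrative assumptions, not the exact code from this repo:

```python
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    # One layer of the classic 4CONV few-shot backbone:
    # 3x3 conv -> BatchNorm -> ReLU -> 2x2 max-pool
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class FourConv(nn.Module):
    """4CONV: four identical conv blocks (commonly 32 or 64 channels each)."""
    def __init__(self, channels=64):
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(3, channels),
            conv_block(channels, channels),
            conv_block(channels, channels),
            conv_block(channels, channels),
        )

    def forward(self, x):
        return self.encoder(x).flatten(1)

class ResBlock(nn.Module):
    """One of the four residual blocks of ResNet-12: three 3x3 conv layers
    plus a 1x1 shortcut, followed by max-pooling. Stacking four of these
    with growing widths gives the 12-layer network."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.1),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.1),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
        )
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1), nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = F.leaky_relu(self.body(x) + self.shortcut(x), 0.1)
        return F.max_pool2d(out, 2)
```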

Answer to Q2: We provide ablation results for MAML with ResNet-12 in the paper. However, that is not the result in the image you attached: the result in the image is for the "MAML+HT" setting, where HT (hard task) meta-batch is applied. You may find the details in the paper.
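Roughly, HT meta-batch re-samples extra episodes from the classes the learner just failed on. Here is a hedged pseudocode-level sketch of that idea; `sample_task` and `meta_train_step` are hypothetical helpers for illustration, not this repo's API:

```python
import random

def run_ht_meta_batch(sample_task, meta_train_step,
                      num_tasks=100, num_hard_tasks=10, n_way=5):
    """Sketch of hard task (HT) meta-batch.

    sample_task(classes=None) -> an N-way episode      (hypothetical helper)
    meta_train_step(task) -> {class_id: query accuracy} (hypothetical helper)
    """
    hard_classes = []
    # Phase 1: ordinary meta-batch; record the failure class of each task.
    for _ in range(num_tasks):
        task = sample_task()
        per_class_acc = meta_train_step(task)
        # The lowest-accuracy class of this episode counts as "hard".
        hard_classes.append(min(per_class_acc, key=per_class_acc.get))
    # Phase 2: re-sample extra "hard" episodes from the pooled failure classes.
    for _ in range(num_hard_tasks):
        chosen = random.sample(hard_classes, k=n_way)
        meta_train_step(sample_task(classes=chosen))
```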

If you have any further questions, feel free to add comments.

Thanks for your response!
I saw your paper's ablation experiments, but there's still a point I don't understand. If I didn't get it wrong, during the meta-transfer learning phase the parameters of the feature extractor are fixed, while the parameters of the FC and SS layers are updated. However, the last 2 rows of this table show the results of SS[Θ4;θ] and SS[Θ;θ], whose notation seems to indicate that the feature extractor parameters Θ4/Θ are also fine-tuned. I'm a little confused about this, and if I misunderstood the table, could you please tell me the difference between SS[Θ4;θ] and SS[Θ;θ]?
[screenshot of the ablation table attached]

SS[Θ;θ] means that we update the SS weights for all convolutional layers Θ and the last fully-connected layer θ;
SS[Θ4;θ] means that we update the SS weights for the 4th residual block Θ4 of ResNet-12 and the last fully-connected layer θ.
In both cases the pre-trained weights Θ themselves remain frozen; only the scaling and shifting parameters applied on top of them are meta-learned.
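To make the notation concrete, here is a minimal PyTorch sketch of SS applied to a single convolutional layer. This is an illustrative wrapper of my own, not the repo's exact module, and it assumes the wrapped conv layer has a bias:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSConv2d(nn.Module):
    """Scaling and Shifting (SS) on top of a frozen pre-trained conv layer.
    Only the per-channel scale (init 1) and shift (init 0) are meta-learned;
    the pre-trained weight W and bias b are never updated."""
    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.weight = nn.Parameter(conv.weight.detach().clone(),
                                   requires_grad=False)  # frozen W (part of Θ)
        self.bias = nn.Parameter(conv.bias.detach().clone(),
                                 requires_grad=False)    # frozen b
        self.scale = nn.Parameter(torch.ones(conv.out_channels, 1, 1, 1))  # Φ_S1
        self.shift = nn.Parameter(torch.zeros(conv.out_channels))          # Φ_S2
        self.stride, self.padding = conv.stride, conv.padding

    def forward(self, x):
        # X' = (W ⊙ Φ_S1) * X + (b + Φ_S2)
        return F.conv2d(x, self.weight * self.scale, self.bias + self.shift,
                        stride=self.stride, padding=self.padding)
```

Under this view, SS[Θ;θ] wraps every convolutional layer of the backbone this way, while SS[Θ4;θ] wraps only the layers of the 4th residual block.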

The details are available in the extended version: https://arxiv.org/pdf/1910.03648.pdf

I see. Thank you very much!