ZhangYuanhan-AI / NOAH

Searching prompt modules for parameter-efficient transfer learning.


Clarification on Fewshot results in the paper

muqeeth opened this issue · comments

Hi,
I am trying to replicate the few-shot results presented in the paper.
So far I have done the following:

for LR in 0.005 
do 
    for DATASET in food-101 oxford_pets stanford_cars oxford_flowers fgvc_aircraft 
    do 
        for SHOT in 8
        do
            for SEED in 0 1 2
            do
                python supernet_train_prompt.py --data-path=./data/${DATASET} --data-set=${DATASET}-FS --cfg=experiments/NOAH/subnet/few-shot/ViT-B_prompt_${DATASET}_shot${SHOT}-seed0.yaml --resume=${CKPT} --output_dir=saves/few-shot_${DATASET}_shot-${SHOT}_seed-${SEED}_lr-${LR}_wd-${WEIGHT_DECAY}_noah --batch-size=64 --mode=retrain --epochs=100 --lr=${LR} --weight-decay=${WEIGHT_DECAY} --few-shot-shot=${SHOT} --few-shot-seed=${SEED} --launcher="none"
            done
        done
    done
done
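For reference, the "mean (std)" over the three seeds quoted below can be aggregated from the per-run accuracies with a small awk pipeline. This is only a sketch: the three accuracy values are placeholders, not actual results, and it assumes the reported number uses the population standard deviation rather than the sample one.

```shell
# Sketch: combine per-seed top-1 accuracies into "mean (population std)".
# The three numbers are placeholder values, not results from the paper.
printf '%s\n' 66.9 67.3 67.9 | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END { m = sum / n; printf "%.2f (%.2f)\n", m, sqrt(sumsq / n - m * m) }'
# prints 67.37 (0.41)
```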

I used the pretrained ViT-B/16 model as the resume checkpoint and get an average accuracy across seeds of 67.36 (0.49) on the Food-101 dataset. From Figure 4, however, the NOAH score on Food-101 is above 70. I have a few questions:

  1. Is the accuracy in Figure 4 the mean over the different seeds?
  2. Is setting the resume checkpoint to the pretrained ViT-B/16 the right approach?
  3. Or should I train a supernet and use it as the resume checkpoint when retraining the subnet? If so, I assume I need not run the search again, since the right dimensions for each prompt module are already configured inside the experiments directory.

Thank you for the catch.

In Food-101 (shots 4, 8, and 16), we find that the searched subnet performs much better if it inherits weights from its supernet rather than being retrained (71.5 on average for shot 8), so we report that performance in the paper. We do not observe this phenomenon on the other datasets.

Thanks for clarifying @davidzhangyuanhan. Can you help with replicating the 71.5 on Food-101? I am planning to do the following:

  1. Train the supernet on the Food-101 dataset with the following command:
    python supernet_train_prompt.py --data-path=./data/${DATASET} --data-set=${DATASET}-FS --cfg=${CONFIG} --resume=${CKPT} --output_dir=./saves/few-shot_${DATASET}_shot-${SHOT}_seed-${SEED}_lr-${LR}_wd-${WEIGHT_DECAY}_supernoah --batch-size=64 --lr=${LR} --epochs=300 --weight-decay=${WEIGHT_DECAY} --few-shot-seed=${SEED} --few-shot-shot=${SHOT} --launcher="none"

where CKPT = the pretrained ViT-B/16 checkpoint, CONFIG=./experiments/NOAH/supernet/supernet-B_prompt.yaml, LR=5e-4, WEIGHT_DECAY=0.0001, SHOT=8, DATASET=food-101.

  2. Run evaluation with the optimal architecture you already found (written in the config), using the supernet weights trained above as the resume checkpoint.
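For concreteness, the variable settings for the supernet-training command in step 1 might look like the sketch below. The values are taken from this thread, but the CKPT filename is an assumption; substitute the path to your downloaded pretrained ViT-B/16 weights.

```shell
# Sketch of the variables for step 1 (supernet training on Food-101).
# Values taken from this thread; the CKPT path is an assumed placeholder.
DATASET=food-101
SHOT=8
SEED=0
LR=5e-4
WEIGHT_DECAY=0.0001
CONFIG=./experiments/NOAH/supernet/supernet-B_prompt.yaml
CKPT=./ViT-B_16.npz   # assumed filename -- point at your pretrained ViT-B/16 checkpoint

echo "supernet run: dataset=${DATASET} shot=${SHOT} seed=${SEED} lr=${LR} wd=${WEIGHT_DECAY}"
```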

I am guessing I need not run the search, since the optimal architecture configuration is already provided in the config.

Is that right? Or should I retrain the architecture with weights initialized from the supernet trained above?

Please let me know if I am doing anything wrong.

Yes, you are correct.

I'd like to share with you the (seed-0) checkpoint of the supernet and the evaluation log for reference.

Thank you!!