Clarification on Fewshot results in the paper
muqeeth opened this issue · comments
Hi,
I am trying to replicate the few-shot results presented in the paper. So far I have run the following:
```shell
for LR in 0.005; do
  for DATASET in food-101 oxford_pets stanford_cars oxford_flowers fgvc_aircraft; do
    for SHOT in 8; do
      for SEED in 0 1 2; do
        python supernet_train_prompt.py \
          --data-path=./data/${DATASET} \
          --data-set=${DATASET}-FS \
          --cfg=experiments/NOAH/subnet/few-shot/ViT-B_prompt_${DATASET}_shot${SHOT}-seed0.yaml \
          --resume=${CKPT} \
          --output_dir=saves/few-shot_${DATASET}_shot-${SHOT}_seed-${SEED}_lr-${LR}_wd-${WEIGHT_DECAY}_noah \
          --batch-size=64 \
          --mode=retrain \
          --epochs=100 \
          --lr=${LR} \
          --weight-decay=${WEIGHT_DECAY} \
          --few-shot-shot=${SHOT} \
          --few-shot-seed=${SEED} \
          --launcher="none"
      done
    done
  done
done
```
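To summarize the three seed runs above as a mean (std) accuracy, I use a small awk one-liner; the per-seed numbers below are placeholders rather than results from any actual run, and in practice they would be grep'ed out of each run's log:

```shell
# Summarize per-seed top-1 accuracies as "mean (std)".
# Placeholder values -- substitute each seed's actual accuracy.
accs="67.0 66.9 68.2"

echo "$accs" | awk '{
  n = NF
  for (i = 1; i <= n; i++) { sum += $i; sumsq += $i * $i }
  mean = sum / n
  std = sqrt(sumsq / n - mean * mean)
  printf "mean %.2f (std %.2f)\n", mean, std
}'
```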
I used the pretrained ViT-B/16 model as the resume checkpoint (`CKPT`). I get an average accuracy across seeds of 67.36 (0.49) on the food-101 dataset, but in Figure 4 the NOAH score on food-101 is above 70. I have a few questions:
- Is the accuracy in Figure 4 the mean over different seeds?
- Is setting the resume checkpoint to the pretrained ViT-B/16 weights the right approach?
- Or should I train a supernet and use it as the resume checkpoint when retraining the subnet? If so, I believe I would not need to run the search again, since the right dimensions for each prompt module are already configured in the experiments directory.
Thank you for the catch.
On food-101 (shots 4, 8, and 16), we find that the searched subnet performs much better if it inherits weights from its supernet rather than being retrained (71.5 on average at shot 8), so we report that performance in the paper. We do not observe this phenomenon on the other datasets.
Thanks for clarifying @davidzhangyuanhan. Can you help me replicate the 71.5 on food-101? I am planning to do the following:
- Train the supernet on food-101 using the following command:
```shell
python supernet_train_prompt.py \
  --data-path=./data/${DATASET} \
  --data-set=${DATASET}-FS \
  --cfg=${CONFIG} \
  --resume=${CKPT} \
  --output_dir=./saves/few-shot_${DATASET}_shot-${SHOT}_seed-${SEED}_lr-${LR}_wd-${WEIGHT_DECAY}_supernoah \
  --batch-size=64 \
  --lr=${LR} \
  --epochs=300 \
  --weight-decay=${WEIGHT_DECAY} \
  --few-shot-seed=${SEED} \
  --few-shot-shot=${SHOT} \
  --launcher="none"
```
where CKPT is the pretrained ViT-B/16 checkpoint, CONFIG=./experiments/NOAH/supernet/supernet-B_prompt.yaml, LR=5e-4, WEIGHT_DECAY=0.0001, SHOT=8, and DATASET=food-101.
- Run evaluation using the optimal architecture you already found (written in the config), with the supernet weights trained above as the resume checkpoint.
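Concretely, the placeholder variables in the supernet-training command above would be set as follows; the CKPT path is only a stand-in for wherever the pretrained ViT-B/16 weights actually live on disk:

```shell
# Values as listed above; CKPT is a hypothetical local path --
# substitute the real location of the pretrained ViT-B/16 weights.
DATASET=food-101
SHOT=8
SEED=0
LR=5e-4
WEIGHT_DECAY=0.0001
CONFIG=./experiments/NOAH/supernet/supernet-B_prompt.yaml
CKPT=./checkpoints/ViT-B_16.npz   # placeholder path

echo "supernet run: ${DATASET}, shot=${SHOT}, seed=${SEED}, lr=${LR}"
```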
I am guessing I do not need to run the search, since the optimal architecture configuration is already provided in the config. Is that right? Or should I retrain the architecture with weights initialized from the supernet trained above?
Please let me know if I am doing anything wrong.
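For reference, the evaluation step I have in mind would look roughly like the sketch below. This is only my guess at the interface: the `--eval` flag and the checkpoint filename are assumptions on my part, not options I have confirmed in `supernet_train_prompt.py`.

```shell
# Rough sketch of the planned evaluation call (seed 0, shot 8).
# --eval and the checkpoint filename are assumptions, not confirmed options.
python supernet_train_prompt.py \
  --data-path=./data/food-101 \
  --data-set=food-101-FS \
  --cfg=experiments/NOAH/subnet/few-shot/ViT-B_prompt_food-101_shot8-seed0.yaml \
  --resume=saves/few-shot_food-101_shot-8_seed-0_lr-5e-4_wd-0.0001_supernoah/checkpoint.pth \
  --mode=retrain --eval \
  --few-shot-shot=8 --few-shot-seed=0 \
  --launcher="none"
```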
Yes, you are correct.
I'd like to share with you the (seed-0) checkpoint of the supernet and the evaluation log for reference.
Thank you!!