ZhangYuanhan-AI / NOAH

Searching prompt modules for parameter-efficient transfer learning.


About #params of Adapter

JieShibo opened this issue · comments

Hi, thank you for sharing the awesome code.
After reading the paper, I am a little confused about the usage of the Adapter. It seems that the adapters are placed in the FFN blocks only, unlike in Houlsby's paper, where the Attn blocks also have adapters. So I think the #params of the 8-dim Adapter should be 12 * (8 * 768 * 2 + 8 + 768) = 0.157M, but Table 1 reports 0.33M for the Adapter. So I wonder whether you also place adapters in the Attn blocks for the baselines.
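To make the counting explicit, here is the arithmetic I have in mind (a rough sketch only, assuming a dim-8 adapter with biases in the FFN of each of the 12 ViT-B blocks, hidden size 768):

d, layers, dim = 768, 12, 8            # ViT-B/16 hidden size, number of blocks, adapter bottleneck
down = d * dim + dim                   # down-projection weight + bias
up = dim * d + d                       # up-projection weight + bias
print(layers * (down + up))            # 156768, i.e. ~0.157M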

Thanks for the reminder, and sorry for the confusion.
We have revised this in the newest arXiv version.
Specifically, the Adapter is 0.16M, LoRA is 0.29M, VPT is 0.64M, and NOAH is 0.43M.
For a fair comparison, we will also add the VTAB performance of the 4X8-dim Adapter in the newest version. Please stay tuned.
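For reference, the revised Adapter and LoRA numbers can be recovered with a quick back-of-the-envelope count (a sketch only; it assumes the dim-8 adapter sits in the FFN with biases, and that rank-8 LoRA is applied to the query and value projections without biases):

d, layers, dim = 768, 12, 8
adapter = layers * (d * dim + dim + dim * d + d)   # 156768, ~0.16M
lora = layers * 2 * (d * dim + dim * d)            # 294912, ~0.29M
print(adapter, lora)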

Thanks a lot for your reply.
BTW it seems that the lib folder has not been uploaded yet, which causes an error when running the scripts.

Yes, Shibo.

Actually, we are cleaning up the code in the lib folder these days. We will upload the folder this week; please stay tuned.

Hi, thanks again for sharing the code.
I successfully reproduced most of the VTAB-1K results in the paper, except for Retinopathy.
I ran the following commands:

DATASET=diabetic_retinopathy
#adapter
python supernet_train_prompt.py --data-path=../vtab-1k/${DATASET} --data-set=${DATASET} --cfg=./experiments/Adapter/ViT-B_prompt_adapter_8.yaml --resume=../ViT-B_16.npz --output_dir=./saves/${DATASET}_lr-0.001_wd-0.0001_adapter --batch-size=64 --lr=0.001 --epochs=100 --is_adapter --weight-decay=0.0001 --no_aug --mixup=0 --cutmix=0 --direct_resize --smoothing=0 --launcher="none"
#lora
python supernet_train_prompt.py --data-path=../vtab-1k/${DATASET} --data-set=${DATASET} --cfg=./experiments/LoRA/ViT-B_prompt_lora_8.yaml --resume=../ViT-B_16.npz --output_dir=./saves/${DATASET}_lr-0.001_wd-0.0001_lora --batch-size=64 --lr=0.001 --epochs=100 --is_LoRA --weight-decay=0.0001 --no_aug --mixup=0 --cutmix=0 --direct_resize --smoothing=0 --launcher="none"
#noah
python supernet_train_prompt.py --data-path=../vtab-1k/${DATASET} --data-set=${DATASET} --cfg=experiments/NOAH/subnet/VTAB/ViT-B_prompt_${DATASET}.yaml --resume=../ViT-B_16.npz --output_dir=saves/${DATASET}_supernet_lr-0.0005_wd-0.0001/retrain_0.001_wd-0.0001  --batch-size=64 --mode=retrain --epochs=100 --lr=0.001 --weight-decay=0.0001 --no_aug --direct_resize --mixup=0 --cutmix=0 --smoothing=0 --launcher="none"

and got:
{"train_lr": 1.0976769428005575e-05, "train_loss": 0.053501952129105725, "test_loss": 1.5837794360286461, "test_acc1": 71.1038200165646, "test_acc5": 100.0, "epoch": 99, "n_parameters": 160613}
{"train_lr": 1.0976769428005575e-05, "train_loss": 0.02753125037997961, "test_loss": 2.0185347567061465, "test_acc1": 67.27443168466701, "test_acc5": 100.0, "epoch": 99, "n_parameters": 298757}
{"train_lr": 1.0976769428005575e-05, "train_loss": 0.042937366478145125, "test_loss": 1.6862158768191309, "test_acc1": 69.75392547600269, "test_acc5": 100.0, "epoch": 99, "n_parameters": 8462261}

Did I do something wrong?

Hi Shibo,

Is the test_acc1 shown here the best accuracy, or the accuracy of the final checkpoint?

The final checkpoint.

Ok, we report the best accuracy.
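In case it helps with reproduction, here is a minimal sketch for picking out the best test_acc1 from per-epoch JSON records like the ones you posted. The log.txt name and location are assumptions; point it at whatever file your output_dir writes the per-epoch records to:

import json

def best_acc(log_path):
    # Keep only lines that look like per-epoch JSON records, then take the max test_acc1.
    records = [json.loads(line) for line in open(log_path) if line.strip().startswith("{")]
    best = max(records, key=lambda r: r.get("test_acc1", float("-inf")))
    return best["epoch"], best["test_acc1"]

print(best_acc("./saves/diabetic_retinopathy_lr-0.001_wd-0.0001_adapter/log.txt"))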

Thank you.