OpenGVLab / SAM-Med2D

Official implementation of SAM-Med2D

Efficiency of the adapter

ariharasudhanm opened this issue

If I am not wrong, the proposed adapter contains 183M parameters, whereas the ViT-B encoder has approximately 63M. How can you claim that your adapter is more efficient than fine-tuning the whole encoder itself?

First, fine-tuning the entire encoder would degrade the original ViT's capabilities, so we opted for adapter fine-tuning instead. Second, fine-tuning with adapters did not reduce efficiency to an intolerable level; for instance, the FPS remained acceptable. Finally, the adapter layers update their parameters only during the first iteration of each batch and are not updated in subsequent iterations, which keeps training efficient. If you wish to reduce the number of parameters further, you can increase the down-sampling rate, for example to 0.75.
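
For anyone who wants to check how adapter size scales, here is a rough sketch that counts parameters for a generic down-project / up-project bottleneck adapter. It is not the SAM-Med2D adapter: the function name `bottleneck_adapter_params` and the `hidden` widths are made up for illustration, and the mapping from the repo's down-sampling rate to a hidden width is not something this thread specifies. The only fixed values are the standard ViT-B embedding width (768) and depth (12).

```python
def bottleneck_adapter_params(dim: int, hidden: int, bias: bool = True) -> int:
    """Count parameters of a generic adapter: Linear(dim -> hidden) + Linear(hidden -> dim).

    Hypothetical helper for illustration; not taken from the SAM-Med2D code.
    """
    down = dim * hidden + (hidden if bias else 0)  # down-projection weights (+ bias)
    up = hidden * dim + (dim if bias else 0)       # up-projection weights (+ bias)
    return down + up


EMBED_DIM = 768   # ViT-B embedding width
NUM_BLOCKS = 12   # ViT-B depth, assuming one adapter per transformer block

# Illustrative hidden widths only; a narrower bottleneck means fewer parameters.
for hidden in (576, 384, 192):
    per_block = bottleneck_adapter_params(EMBED_DIM, hidden)
    total = per_block * NUM_BLOCKS
    print(f"hidden={hidden:4d}  per block={per_block:,}  all blocks={total:,}")
```

In this simplified model the count grows linearly with the bottleneck width, so halving the hidden width roughly halves the adapter parameters. The real adapter may contain other layers, so treat the numbers as an order-of-magnitude check only.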