Official support for finetune-based methods (e.g., ViT adapter) and multi-GPU training with DataParallel
TimandXiyu opened this issue
The implementation of the gradient update in faa_model.py seems very constrained at best. It does not handle the case where I want to obtain the policy for a finetuning model; it just naively sets every layer's parameters to either require or not require gradients. This is also problematic if the user wants to use DataParallel to enable multi-GPU training. A sketch of what I mean by selective gradient handling is below.
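For illustration, here is a minimal sketch of the kind of selective-gradient handling I have in mind. `ToyBlock`, `set_trainable_params`, and the `"adapter"` naming are hypothetical stand-ins for this example, not code from faa_model.py:

```python
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    """Stand-in for a transformer block with an inserted adapter."""
    def __init__(self, dim=64):
        super().__init__()
        self.attn = nn.Linear(dim, dim)      # "backbone" weight, to be frozen
        self.adapter = nn.Linear(dim, dim)   # adapter weight, to be trained

    def forward(self, x):
        return self.adapter(self.attn(x))

def set_trainable_params(model: nn.Module, trainable_keywords):
    """Freeze everything, then re-enable gradients only for parameters
    whose names contain one of the given keywords."""
    for name, param in model.named_parameters():
        param.requires_grad = any(k in name for k in trainable_keywords)

model = nn.Sequential(ToyBlock(), ToyBlock())
set_trainable_params(model, trainable_keywords=["adapter"])

# Wrap for multi-GPU *after* freezing: DataParallel replicates the module,
# so the requires_grad flags must already be set on the base model.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
if torch.cuda.is_available():
    model = model.cuda()

# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```

The point is that trainability is decided per parameter by name rather than toggled globally for the whole model, which is what adapter-style finetuning needs and what the current all-or-nothing logic cannot express.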
This repo really needs some updates to address these problems, as well as the ones that have been raised in other issue threads.
@TimandXiyu: can you provide more information about your issues? It would be best if you could share detailed information about your setup and how you are implementing this. Thanks.