freeze the non-temporal parameters
nankepan opened this issue
Hi, I have a question. Should I freeze the non-temporal parameters during training? Thank you!
Hello,
We have conducted an experiment on this subject and are happy to share our findings. In summary, we recommend against freezing non-temporal parameters during training.
Initially, we froze the non-temporal parameters and observed that the generated videos were overly static and unsatisfactory. This issue likely stems from the reduced number of trainable parameters, which impairs performance.
We then tried a two-stage schedule: training only the temporal parameters first, then unfreezing and training all parameters. The results were still inferior to training from scratch.
Attempts to freeze only the text-related parameters were also unsuccessful. Overall, we recommend training all parameters for the best results.
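For anyone who wants to reproduce the comparison, here is a minimal PyTorch sketch of toggling the non-temporal parameters on and off. It is not taken from this repository: `ToyBlock`, `set_non_temporal_trainable`, and the assumption that temporal parameters can be identified by the substring `"temporal"` in their names are all hypothetical, so adjust the filter to the real module names.

```python
import torch
from torch import nn


class ToyBlock(nn.Module):
    """Hypothetical stand-in for a video block: one spatial and one temporal layer."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.spatial_attn = nn.Linear(dim, dim)
        self.temporal_attn = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.temporal_attn(self.spatial_attn(x))


def set_non_temporal_trainable(model: nn.Module, trainable: bool, key: str = "temporal") -> None:
    """Enable or disable gradients for every parameter whose name lacks `key` (assumed naming)."""
    for name, param in model.named_parameters():
        if key not in name:
            param.requires_grad_(trainable)


def count_trainable(model: nn.Module) -> int:
    """Number of scalar parameters that will receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


model = ToyBlock()
total = count_trainable(model)            # everything trainable (the recommended setup)

set_non_temporal_trainable(model, False)  # freeze the non-temporal parameters
frozen_count = count_trainable(model)     # only the temporal layer remains trainable

set_non_temporal_trainable(model, True)   # unfreeze again, as in the two-stage schedule
restored = count_trainable(model)
```

If you do freeze anything, build the optimizer only from parameters with `requires_grad=True`, or rebuild it after unfreezing, so that the newly trainable parameters are actually updated.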
I see that in your code you freeze the temporal parameters by disabling their gradients. Won't that stop the gradient from flowing to the other, non-frozen blocks during backpropagation?
The ckpt only provides the DiT weights. Was this ckpt trained with the text encoder and VAE frozen? When will the fully trained weights be released?
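On the gradient-flow question: as far as PyTorch autograd is concerned, setting `requires_grad=False` on a block's parameters only skips computing gradients for those parameters. Gradients still propagate through the block's activations to earlier trainable layers. Below is a minimal sketch with hypothetical toy layers (`early`, `frozen`, `late`), not this repository's model:

```python
import torch
from torch import nn

torch.manual_seed(0)

early = nn.Linear(4, 4)   # trainable block placed before the frozen one
frozen = nn.Linear(4, 4)  # stand-in for a block frozen via requires_grad=False
late = nn.Linear(4, 1)    # trainable block placed after the frozen one

# Freeze only the parameters of the middle block.
for p in frozen.parameters():
    p.requires_grad_(False)

x = torch.randn(2, 4)
loss = late(frozen(early(x))).sum()
loss.backward()

# The frozen block gets no parameter gradients, but the layer before it still does.
early_has_grad = early.weight.grad is not None
frozen_has_grad = frozen.weight.grad is not None
late_has_grad = late.weight.grad is not None
print(early_has_grad, frozen_has_grad, late_has_grad)
```

What would cut off the earlier layers is running the frozen block under `torch.no_grad()` or calling `.detach()` on its output, since both remove the block from the autograd graph.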