training convergence
yangyyt opened this issue
Wall.E commented
Multi-GPU training gives worse results than single-GPU training, and the multi-GPU runs seem to overfit quickly.
Zhikang Niu commented
Maybe the codebook weights differ on each GPU?
Could you try this code?
encodec-pytorch/quantization/core_vq.py
Line 157 in c6b6de9
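To illustrate the idea of keeping the codebook identical on every GPU, here is a minimal sketch. It is not the code at the referenced line. It assumes the codebook is stored as a module buffer, which is an assumption about the layout. `ToyCodebook`, `is_distributed`, and `broadcast_buffers_from_rank0` are hypothetical names made up for this example. Only `torch.distributed.is_available`, `is_initialized`, and `broadcast` are real PyTorch calls.

```python
import torch
import torch.distributed as dist
from torch import nn


def is_distributed() -> bool:
    # True only when torch.distributed is built in and a process group is initialized.
    return dist.is_available() and dist.is_initialized()


def broadcast_buffers_from_rank0(module: nn.Module) -> int:
    """Overwrite every buffer of `module` with rank 0's copy (hypothetical helper).

    Returns the number of buffers broadcast; 0 when not running distributed.
    With the NCCL backend the buffers must already live on the GPU.
    """
    if not is_distributed():
        return 0
    count = 0
    for buf in module.buffers():
        dist.broadcast(buf, src=0)  # in-place: all ranks end up with rank 0's values
        count += 1
    return count


class ToyCodebook(nn.Module):
    # Hypothetical stand-in for a VQ codebook kept as a buffer (not a parameter).
    def __init__(self, codebook_size: int = 8, dim: int = 4):
        super().__init__()
        self.register_buffer("embed", torch.randn(codebook_size, dim))


codebook = ToyCodebook()
n_synced = broadcast_buffers_from_rank0(codebook)  # no-op in a single process
```

In a multi-GPU run you would call the helper after the codebook is initialized (and possibly after each codebook update), then compare `embed` across ranks to check whether they really diverge.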