PSA: Do not train with multiple GPUs
mlinmg opened this issue · comments
Just a heads-up for anyone interested in training this model: do not use multiple GPUs, because it causes a massive performance drop.
For instance, I was getting 1.49 s/it (about 0.67 it/s) with two GPUs at batch size 12, while with one GPU I get 5.53 it/s.
I thought it might help someone in the future.
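
The two figures above are in different units (seconds per iteration vs. iterations per second), so here is a minimal sketch that puts them on the same scale. The helper name `to_it_per_s` is made up for illustration. The comparison is per iteration only: the post does not state the batch size of the single-GPU run, or whether batch size 12 is per GPU or in total, so this says nothing definite about samples per second.

```python
def to_it_per_s(value: float, unit: str) -> float:
    """Convert a progress-bar style rate to iterations per second.

    Hypothetical helper, not part of any library.
    """
    if unit == "it/s":
        return value
    if unit == "s/it":
        return 1.0 / value
    raise ValueError(f"unknown unit: {unit}")


# Numbers reported in the post.
two_gpu = to_it_per_s(1.49, "s/it")  # two GPUs, batch size 12
one_gpu = to_it_per_s(5.53, "it/s")  # one GPU

# How many times more iterations per second the single-GPU run completes.
ratio = one_gpu / two_gpu

print(f"two GPUs: {two_gpu:.2f} it/s")  # two GPUs: 0.67 it/s
print(f"one GPU:  {one_gpu:.2f} it/s")  # one GPU:  5.53 it/s
print(f"ratio:    {ratio:.1f}x")        # ratio:    8.2x
```

So, going only by the reported numbers, the single-GPU run gets through roughly 8 times as many iterations per second as the two-GPU run.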