Number of GPUs / total batch size to reproduce the results in the paper
tomato18463 opened this issue
Hi,
Thanks for sharing your code. If I understand the code correctly, the total batch size depends on the number of GPUs used for training. To reproduce the results in the paper, how many GPUs do I need? How many GPUs were used to obtain the results in the paper?
Thank you!
Hi @tomato18463, I used 32 GPUs (A100-40GB) for the 23-hour and 438-hour experiments and 64 GPUs for the 3448-hour experiments. (#9)
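
For anyone with fewer GPUs, here is a rough sketch of the arithmetic behind the question. It assumes the total batch size is the per-GPU batch size times the number of GPUs, and that gradient accumulation is used to make up the difference; neither the per-GPU batch size nor accumulation support in this repository is confirmed in this thread. The names `accumulation_steps`, `effective_batch_size`, and `PER_GPU_BATCH` are hypothetical and only for illustration.

```python
import math


def accumulation_steps(reference_gpus: int, available_gpus: int) -> int:
    """Accumulation steps so that `available_gpus` process at least as many
    samples per optimizer update as `reference_gpus` did in one step."""
    if reference_gpus <= 0 or available_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    # Round up so the effective batch never falls below the reference.
    return math.ceil(reference_gpus / available_gpus)


def effective_batch_size(per_gpu_batch: int, gpus: int, accum_steps: int = 1) -> int:
    """Samples contributing to one optimizer update."""
    return per_gpu_batch * gpus * accum_steps


# Hypothetical per-GPU batch size; the real value is not stated in this thread.
PER_GPU_BATCH = 8

# Matching the 32-GPU setup on 8 GPUs.
steps = accumulation_steps(reference_gpus=32, available_gpus=8)
print(steps)
print(effective_batch_size(PER_GPU_BATCH, 8, steps))
print(effective_batch_size(PER_GPU_BATCH, 32))
```

When the GPU count does not divide the reference count evenly, rounding up gives a slightly larger effective batch than the original run. Matching the batch size this way is not guaranteed to reproduce the paper's numbers exactly, since training time and other settings may differ.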