GPU training problem
nightmareisme opened this issue · comments
幽朲 commented
Do weights trained on 2 GPUs perform differently on downstream tasks from weights trained on 8 GPUs? I ask because the overall batch size differs between the two setups. I hope to get a reply.
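To make the batch-size concern concrete, here is a minimal sketch (not taken from the DenseCL repository). It assumes the per-GPU batch size is held fixed, so the overall batch size grows with the number of GPUs. The per-GPU batch size of 32 and the base learning rate of 0.03 are hypothetical placeholders, and the linear learning-rate scaling shown is only a commonly used heuristic, not something this thread confirms for DenseCL.

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int) -> int:
    """Overall batch size when every GPU processes per_gpu_batch samples per step."""
    return per_gpu_batch * num_gpus


def linearly_scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale the learning rate in proportion to the overall batch size (heuristic)."""
    return base_lr * new_batch / base_batch


# Hypothetical values for illustration only; check the actual training config.
PER_GPU_BATCH = 32
BASE_LR = 0.03

batch_8 = effective_batch_size(PER_GPU_BATCH, 8)
batch_2 = effective_batch_size(PER_GPU_BATCH, 2)
lr_2 = linearly_scaled_lr(BASE_LR, base_batch=batch_8, new_batch=batch_2)

print(f"8 GPUs: overall batch size {batch_8}, lr {BASE_LR}")
print(f"2 GPUs: overall batch size {batch_2}, linearly scaled lr {lr_2}")
```

Under these assumptions the 2-GPU run sees a quarter of the overall batch size of the 8-GPU run, which is why the two sets of weights may not behave identically unless the per-GPU batch size or the learning rate is adjusted.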
Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021 Oral.