HobbitLong / CMC

[ECCV 2020] "Contrastive Multiview Coding", also contains implementations for MoCo and InstDis

index drawn by AliasMethod is not on the same gpu as the model

dongyaoli10x opened this issue

Not sure if I missed something, but it seems to me that if you train on multiple GPUs with the current implementation, the AliasMethod puts the index on the default GPU. memory_l and memory_ab are on the correct GPU because they are registered with register_buffer. Then torch.index_select(self.memory_l, 0, idx.view(-1)).detach() gives an "arguments are located on different GPUs" error.
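For illustration, one way to sidestep the device mismatch is to move the sampled index onto whichever device holds the memory bank before the lookup. This is only a minimal sketch, not the repository's code: the helper name `gather_negatives` and the tensor shapes are assumptions made for the example, and it runs on CPU so it can be tried without GPUs.

```python
import torch


def gather_negatives(memory: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
    """Look up memory-bank rows for idx (hypothetical helper).

    memory: (n_data, feat_dim) memory bank, e.g. a registered buffer.
    idx:    (batch, K + 1) sampled indices, possibly on another device.
    Returns a (batch, K + 1, feat_dim) tensor on memory's device.
    """
    # Align the index with the memory bank's device before index_select.
    idx = idx.to(memory.device)
    rows = torch.index_select(memory, 0, idx.view(-1)).detach()
    return rows.view(idx.size(0), idx.size(1), memory.size(1))


# Toy sizes, chosen only for the example.
memory = torch.randn(100, 8)
idx = torch.randint(0, 100, (4, 5))
weights = gather_negatives(memory, idx)
```

The copy in `idx.to(memory.device)` is a no-op when both tensors already share a device, so the same call works in the single-GPU case.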

OK, now I figured out why. In the current implementation, only the encoder is wrapped in DataParallel; the contrast module is not. So the loss computation happens on only one GPU, which renders the register_buffer of the memory bank useless. If you put contrast into DataParallel, it won't put AliasMethod on the correct GPU. Probably the right way to go is DDP, like you implemented in PyContrast.
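A possible alternative, sketched here under assumptions: DataParallel replicates a module's parameters and buffers (including those of submodules) to each device, but tensors stored as plain attributes on an ordinary Python object are not moved. If the alias tables were themselves registered buffers of a submodule, they would travel with the memory bank. The class names `AliasSampler` and `Contrast` below are hypothetical stand-ins written for this example, not the repository's classes, and the sketch does not address how memory-bank updates behave across replicas.

```python
import torch
import torch.nn as nn


class AliasSampler(nn.Module):
    """Alias-method sampler whose tables are registered buffers (sketch)."""

    def __init__(self, probs: torch.Tensor):
        super().__init__()
        probs = probs.double() / probs.sum()
        K = probs.numel()
        scaled = (probs * K).tolist()
        prob, alias = [0.0] * K, [0] * K
        small = [i for i, p in enumerate(scaled) if p < 1.0]
        large = [i for i, p in enumerate(scaled) if p >= 1.0]
        # Pair each under-full bin with an over-full one.
        while small and large:
            s, l = small.pop(), large.pop()
            prob[s], alias[s] = scaled[s], l
            scaled[l] -= 1.0 - scaled[s]
            (small if scaled[l] < 1.0 else large).append(l)
        # Leftover bins keep their own index with probability 1.
        for i in small + large:
            prob[i], alias[i] = 1.0, i
        # Clamp guards against tiny floating-point overshoot.
        self.register_buffer("prob", torch.tensor(prob).float().clamp_(0, 1))
        self.register_buffer("alias", torch.tensor(alias, dtype=torch.long))

    def draw(self, n: int) -> torch.Tensor:
        # Sample on the device where the tables live.
        K = self.alias.numel()
        kk = torch.randint(0, K, (n,), device=self.prob.device)
        accept = torch.bernoulli(self.prob[kk]).bool()
        return torch.where(accept, kk, self.alias[kk])


class Contrast(nn.Module):
    """Toy contrast module: memory bank and sampler share a device."""

    def __init__(self, n_data: int, feat_dim: int, K: int):
        super().__init__()
        self.K = K
        self.sampler = AliasSampler(torch.ones(n_data))
        self.register_buffer("memory", torch.randn(n_data, feat_dim))

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        bsz = feat.size(0)
        idx = self.sampler.draw(bsz * (self.K + 1))
        rows = torch.index_select(self.memory, 0, idx).detach()
        return rows.view(bsz, self.K + 1, -1)


contrast = Contrast(n_data=50, feat_dim=8, K=4)
out = contrast(torch.randn(3, 8))
```

Because `prob` and `alias` appear in `named_buffers()`, calls such as `.to(device)` or `.cuda()` on the parent module move them together with `memory`, so no separate device handling is needed for the sampler.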