What would be the best way to do multi-GPU training for SNCA?

Question

pallashadow opened this issue 5 years ago · comments

Should I wrap both "model" and "lemniscate" in nn.parallel.DistributedDataParallel and manually sync each lemniscate memory once per epoch?

I think a better solution might be to use a single GPU for the "lemniscate" memory and its computation, and the other GPUs for the data-parallel "model" part?
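A minimal sketch of that second layout, in case it helps the discussion. It uses nn.DataParallel rather than DistributedDataParallel to stay single-process, and the helper name `place_model_and_memory` and the choice of `cuda:0` for the memory are assumptions for illustration, not part of the project's code. With fewer than two GPUs it falls back to a single device.

```python
import torch
import torch.nn as nn

def place_model_and_memory(model, memory):
    """Hypothetical helper: memory bank on one GPU, model replicated on the rest."""
    n_gpus = torch.cuda.device_count()
    if n_gpus >= 2:
        # Reserve GPU 0 for the memory bank, use the remaining GPUs for the model.
        memory_device = torch.device("cuda:0")
        ids = list(range(1, n_gpus))
        model_device = torch.device(f"cuda:{ids[0]}")
        # DataParallel expects the module's parameters on device_ids[0].
        model = nn.DataParallel(
            model.to(model_device), device_ids=ids, output_device=ids[0]
        )
    else:
        # Fallback: everything on the single available device.
        device = torch.device("cuda:0" if n_gpus == 1 else "cpu")
        memory_device = model_device = device
        model = model.to(device)
    return model, memory.to(memory_device), model_device, memory_device
```

In the training loop, the features returned by the model would be moved with `features.to(memory_device)` before they are compared against the memory. That transfer is differentiable, so gradients still reach the model replicas.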

I am working on a comics image retrieval task and have found this project very useful. Thank you for your help.

yghong · Answer 1 · Sat May 22 2021 23:27:07 GMT+0800 (China Standard Time)

Hi, I am also thinking about the memory issue. How did you deal with it?

Yuhao Lu · Answer 2 · Mon May 24 2021 23:05:30 GMT+0800 (China Standard Time)

Once the memory became large, I moved the memory forward and backward passes to the CPU. It was a little slower but worked well.

model_forward_gpu -> memory_forward_cpu -> loss_cpu -> memory_backward_cpu -> model_backward_gpu
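The pipeline above can be sketched roughly as follows. This is only an illustration under assumptions: the `nn.Linear` backbone, the sizes, the momentum update, and the plain cross-entropy over memory entries are placeholders, not the project's actual model or SNCA loss. The key point is that moving features with `.to("cpu")` stays inside autograd, so the backward pass crosses back to the GPU on its own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Backbone goes on a GPU when one is available; the memory bank stays on CPU.
model_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
memory_device = torch.device("cpu")

num_samples, feat_dim, temperature = 1000, 16, 0.1  # illustrative values

model = nn.Linear(32, feat_dim).to(model_device)  # stand-in for the real backbone
# One L2-normalized feature per training sample, kept in host memory.
memory = F.normalize(torch.randn(num_samples, feat_dim), dim=1).to(memory_device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def train_step(inputs, indices, momentum=0.5):
    # model_forward_gpu
    feats = F.normalize(model(inputs.to(model_device)), dim=1)
    # memory_forward_cpu: the device transfer is differentiable
    feats_cpu = feats.to(memory_device)
    logits = feats_cpu @ memory.t() / temperature
    # loss_cpu: placeholder loss treating each sample index as its own class
    loss = F.cross_entropy(logits, indices)
    optimizer.zero_grad()
    # memory_backward_cpu -> model_backward_gpu
    loss.backward()
    optimizer.step()
    # Refresh the stored features without tracking gradients.
    with torch.no_grad():
        updated = momentum * memory[indices] + (1 - momentum) * feats_cpu.detach()
        memory[indices] = F.normalize(updated, dim=1)
    return loss.item()
```

A call such as `train_step(torch.randn(8, 32), torch.arange(8))` runs one step. The `indices` tensor should stay on the CPU, since it indexes the CPU-resident memory.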

yghong · Answer 3 · Tue May 25 2021 12:14:55 GMT+0800 (China Standard Time)

Thanks for sharing!