CoinCheung / pytorch-loss

label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful


partial_fc_amsoftmax memory leak

tianxingyzxq opened this issue · comments

partial_fc_amsoftmax memory leak @CoinCheung

I get `RuntimeError: CUDA out of memory` after several steps. Have you tested it?

My torch version is 1.8.1 and the optimizer is AdamW; SGD has the same problem.

It is not likely to be a leak, since I only use native PyTorch operators (I did not write any CUDA kernels for it). Besides, I am training my own model with it right now and observe no such problem.

It is also unlikely to be a problem with the optimizer. Could you please provide the detailed configuration of your platform and a description of how to reproduce it?

Have you tried a smaller batch size, and did you observe memory usage increasing on every iteration?
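One way to check this (a minimal sketch, not from the thread) is to log the allocated CUDA memory once per training step; a true leak shows up as monotonic growth after the first few warm-up iterations, while fragmentation or a too-large batch shows up as a stable but high plateau:

```python
import torch

def log_cuda_memory(step):
    # Hypothetical helper: print allocated CUDA memory for this step.
    # torch.cuda.memory_allocated() reports bytes held by tensors;
    # it is a no-op check on CPU-only machines.
    if torch.cuda.is_available():
        mb = torch.cuda.memory_allocated() / 1024 ** 2
        print(f"step {step}: {mb:.1f} MiB allocated")

# Call it inside the training loop, e.g. log_cuda_memory(it)
```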

I tried to reproduce it and found there is no memory leak at all. My original training script was wrong. Sorry to bother you.
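For reference, a common training-script bug that presents exactly like this (a hedged sketch; the thread does not say which bug the reporter had) is accumulating the loss *tensor* across iterations instead of a detached Python float, which keeps every iteration's autograd graph referenced and lets GPU memory grow until OOM:

```python
import torch

# Tiny model and optimizer, just to produce a real autograd graph.
model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

losses_bad, losses_ok = [], []
for _ in range(3):
    x = torch.randn(8, 4)
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses_bad.append(loss)        # keeps a reference to the graph -> memory grows
    losses_ok.append(loss.item())  # detached Python float -> safe

print(losses_bad[0].grad_fn is not None)  # True: graph node still referenced
print(all(isinstance(v, float) for v in losses_ok))  # True
```

The fix is simply to record `loss.item()` (or `loss.detach()`) whenever the value is only needed for logging.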