BatchNormsync with Adam Optimizer
tarun005 opened this issue
Is the bnsync code written specifically for the SGD optimizer? The loss does not converge if I train the model with the Adam optimizer.
@tarun005 Have you tested with the SGD optimizer? Does training converge with it?
Yes, the model converges with SGD, but the same model does not converge if I replace SGD with Adam.
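For reference, the swap in question would look roughly like this (a minimal sketch, assuming a standard PyTorch training setup; the model and learning rates are placeholders, not taken from this repo):

```python
import torch

# Hypothetical model and learning rates, for illustration only.
model = torch.nn.Linear(10, 2)

# Converges, per the report above:
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

# Reportedly does not converge when used with the custom BNsync:
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```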
@tarun005 Although I would expect BN to be irrelevant to the optimization method, when I used the syncbn code by just adding the lib folder to $PATH, I hit a segmentation fault. How are you using it?
I agree that BN shouldn't depend on the optimization method, but I have read that Adam needs consistent global statistics at every iteration, so the BNsync implementation given here could be the issue.
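One way to test this hypothesis (a sketch, assuming PyTorch >= 1.1 with DistributedDataParallel; this is not part of this repo's BNsync code) is to swap the custom layers for the built-in torch.nn.SyncBatchNorm, which synchronizes statistics across all processes, and check whether Adam then converges:

```python
import torch

# Hypothetical model; replace with the actual network under test.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1),
    torch.nn.BatchNorm2d(16),
    torch.nn.ReLU(),
)

# Replace every BatchNorm layer with SyncBatchNorm, which computes
# mean/variance across the whole process group instead of per GPU.
# (Running the converted model requires an initialized distributed setup.)
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

If Adam converges with the built-in SyncBatchNorm but not with the custom BNsync, that would point to this repo's implementation rather than to Adam itself.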