Comparison Against Adam

Question

JinLi711 opened this issue 5 years ago

Could you benchmark your implementation of AdamW against TensorFlow's implementation of Adam on multiple datasets? That would help users decide whether AdamW is the right choice. I am particularly interested in how the time each epoch takes differs between the two.

John Muradeli · Answer 1 · Tue Oct 29 2019 19:33:19 GMT+0800 (China Standard Time)

Already done: see the build logs from the tests, in particular test_control() (example below). Testing isn't exhaustive, but on both sparse and dense tensors, for a tiny model, the AdamW and TF implementations are about equally fast, and in local tests the same held for a medium model.

Edit: admittedly, more benchmarks could be useful; I may implement them in the future.
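
For readers who want to run a quick comparison of their own, here is a minimal sketch of how a per-epoch timing check could be structured. It is not the build log output mentioned above, and it uses neither this repository's AdamW nor TensorFlow's Adam: `adam_step` and `mean_epoch_seconds` are hypothetical helpers written in pure Python, the "model" is just a list of floats minimizing sum(p^2), and the weight decay follows the decoupled form (decay applied to the parameters directly rather than added to the gradient). Treat the numbers it prints as a demonstration of the measurement only, not as a benchmark result.

```python
import time


def adam_step(params, grads, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
              eps=1e-8, weight_decay=0.0):
    """Hypothetical toy update, applied in place.

    weight_decay == 0.0 gives plain Adam; weight_decay > 0.0 adds a
    decoupled decay term, which is the AdamW-style variant.
    """
    for i, g in enumerate(grads):
        m[i] = b1 * m[i] + (1 - b1) * g          # first-moment estimate
        v[i] = b2 * v[i] + (1 - b2) * g * g      # second-moment estimate
        m_hat = m[i] / (1 - b1 ** t)             # bias correction
        v_hat = v[i] / (1 - b2 ** t)
        params[i] -= lr * (m_hat / (v_hat ** 0.5 + eps)
                           + weight_decay * params[i])


def mean_epoch_seconds(weight_decay, n_params=1000, epochs=3,
                       steps_per_epoch=50):
    """Time each epoch with a monotonic clock and return the mean."""
    params = [1.0] * n_params
    m = [0.0] * n_params
    v = [0.0] * n_params
    t = 0
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        for _ in range(steps_per_epoch):
            t += 1
            grads = [2.0 * p for p in params]    # gradient of sum(p^2)
            adam_step(params, grads, m, v, t, weight_decay=weight_decay)
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations), params


adam_time, adam_params = mean_epoch_seconds(weight_decay=0.0)
adamw_time, adamw_params = mean_epoch_seconds(weight_decay=0.01)
print(f"Adam : {adam_time:.4f} s/epoch")
print(f"AdamW: {adamw_time:.4f} s/epoch")
```

To compare real optimizers, the same harness shape should carry over: swap the body of the inner loop for one training step of the model under test, keep everything else identical between the two runs, and average over several epochs so that one-off startup costs do not dominate.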