facebookresearch / fairscale

PyTorch extensions for high performance and large scale training.

Any examples using AdaScale with fairseq?

kedarkolluri opened this issue · comments

Hi, are there any examples of using AdaScale with fairseq?

Good question. I don't think AdaScale will work with fairseq, since fairseq mostly uses Adam and AdaScale only works with SGD at the moment. If you know how to make it work with Adam, I'd be very happy to hear about it. If you meant to ask about fairseq + SGD, please let me know: I can take a look at how to make fairseq work with AdaScale + SGD, which should be straightforward.
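
For context on why the SGD restriction matters, here is a minimal sketch of the gain ratio that AdaScale uses to rescale the SGD learning rate, as I understand it from the AdaScale paper. It is not fairscale's implementation. The function names `gain_ratio` and `scaled_lr` are hypothetical, and the gradient variance and squared gradient norm are assumed to be estimated elsewhere (for example from per-worker gradients).

```python
def gain_ratio(grad_var: float, grad_sqr_norm: float, scale: float) -> float:
    """Gain ratio r = (var + norm^2) / (var / scale + norm^2).

    grad_var:      estimate of the per-batch gradient variance (assumed given)
    grad_sqr_norm: estimate of the squared norm of the true gradient (assumed given)
    scale:         factor by which the batch size was scaled up
    """
    if scale < 1:
        raise ValueError("scale must be >= 1")
    denominator = grad_var / scale + grad_sqr_norm
    if denominator <= 0:
        raise ValueError("need a positive variance or gradient norm estimate")
    return (grad_var + grad_sqr_norm) / denominator


def scaled_lr(base_lr: float, gain: float) -> float:
    """Learning rate used for one SGD step: the base rate times the gain."""
    return base_lr * gain


# No gradient noise: a larger batch does not help, so the gain is 1.
print(gain_ratio(0.0, 1.0, 8))
# Noise dominates: the gain approaches the full scale factor.
print(gain_ratio(1.0, 0.0, 8))
```

The gain always lies between 1 and `scale`, so the rescaled step is never smaller than the base step and never larger than `scale` times it. This reasoning is about plain SGD steps; an adaptive optimizer such as Adam already rescales each update by its own gradient statistics, so the same gain does not carry over directly.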