FlagAI-Open / FlagAI

FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.


[Question]: Why not call optimizer.zero_grad() in train_step_pytorch and train_step_pytorchDDP function?

THU-Kingmin opened this issue · comments

Description

Why don't the train_step_pytorch and train_step_pytorchDDP functions in flagai/trainer.py call optimizer.zero_grad() during fp32 training?

In flagai/trainer.py, lines 704 and 778 are commented out, as follows:

Line 704: # optimizer.zero_grad()

Line 778: # model.zero_grad()

Why isn't optimizer.zero_grad() called after each optimizer.step()? Is this an intentional trick, or a bug?

Thank you very much, and I look forward to your reply!
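For context on why this matters: in PyTorch, `.backward()` accumulates into `.grad` rather than overwriting it, so skipping `zero_grad()` means each step optimizes with the sum of all past gradients. The sketch below is not FlagAI's code; it is a minimal standalone demonstration of that accumulation behavior (the `lr=0.0` choice is just to keep the weights, and hence the per-step gradient, constant):

```python
# Minimal sketch (not FlagAI's trainer) showing why omitting
# optimizer.zero_grad() causes gradients to accumulate across steps.
import torch

model = torch.nn.Linear(2, 1, bias=False)
opt = torch.optim.SGD(model.parameters(), lr=0.0)  # lr=0 keeps weights fixed
x = torch.ones(1, 2)

def train_step(zero_grad: bool) -> torch.Tensor:
    if zero_grad:
        opt.zero_grad()
    loss = model(x).sum()
    loss.backward()   # accumulates into .grad, does not overwrite it
    opt.step()
    return model.weight.grad.clone()

g1 = train_step(zero_grad=False)
g2 = train_step(zero_grad=False)
# Without zero_grad, the second step's gradient is the sum of both steps:
print(torch.allclose(g2, 2 * g1))  # True: gradient doubled instead of reset
```

So unless the trainer clears gradients somewhere else (or accumulation is intended, e.g. for gradient accumulation steps), the commented-out calls would indeed change training behavior.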

Alternatives

No response