Does sat support saving checkpoints in fp16 or bf16?
xxxwuwq opened this issue · comments
Does sat support saving checkpoints in fp16 or bf16?
Saving a checkpoint preserves the original parameter dtype. Do you mean you want to train the model in fp32 but save it in fp16? If you train a model in fp16, it will be saved in fp16 by default.
Yes. If that were supported, I could choose what I need. When I fine-tune CogVLM, it only seems to support saving checkpoints in fp32, which needs 60GB+ of storage per model; fp16 may be enough. The official API doesn't seem to support this.
CogVLM fine-tuning saves the model in bf16, unless you train it in fp32.
Saving the model in a dtype different from the one you train with is memory-consuming, because you need a copy of the whole model to do the conversion.
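To put rough numbers on this, here is a back-of-the-envelope sketch in plain Python. It is not part of sat; the helper names are made up for illustration, and the parameter count of 17 billion is an assumption chosen to roughly match the 60GB+ figure mentioned above. It only counts the weights themselves, not optimizer state or file format overhead.

```python
# Bytes used per parameter for each dtype.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def checkpoint_size_gb(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def conversion_peak_gb(num_params: int, train_dtype: str, save_dtype: str) -> float:
    """Approximate memory held while a full converted copy of the model
    exists alongside the original weights (hypothetical helper)."""
    return checkpoint_size_gb(num_params, train_dtype) + checkpoint_size_gb(
        num_params, save_dtype
    )


if __name__ == "__main__":
    num_params = 17_000_000_000  # assumed parameter count, for illustration only
    for dtype in ("fp32", "fp16", "bf16"):
        print(f"{dtype}: ~{checkpoint_size_gb(num_params, dtype):.0f} GB on disk")
    peak = conversion_peak_gb(num_params, "fp32", "bf16")
    print(f"fp32 weights plus a full bf16 copy: ~{peak:.0f} GB in memory")
```

Under these assumptions, a 16-bit checkpoint is half the size of an fp32 one, but converting at save time temporarily needs about 1.5 times the memory of the fp32 weights if the whole copy is materialized at once.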
thanks a lot