Optimizer state dict for RL fine-tuning

Question

aleSuglia opened this issue 4 years ago

Hi,

When you start the RL fine-tuning for image captioning, do you load the optimiser state that you used during the SL phase, or do you start with a brand-new optimiser that you use just for the RL training phase? In your current codebase, I can see that you have completely disabled optimiser state checkpointing.

Many thanks,

Alessandro

Luowei Zhou · Answer 1 · Fri May 22 2020 09:10:04 GMT+0800 (China Standard Time)

@aleSuglia No, it's a brand-new optimizer.
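
For readers who want to see what "brand-new optimizer" means in practice, here is a minimal PyTorch sketch. It is not taken from the repository: `build_model`, the tiny `nn.Linear` stand-in, the checkpoint key `"model"`, and both learning rates are assumptions made for illustration. The idea is simply that only the model weights carry over from the SL phase, while the RL optimizer starts with empty state.

```python
import io

import torch
import torch.nn as nn


def build_model():
    # Hypothetical stand-in for the captioning model; the real architecture differs.
    torch.manual_seed(0)
    return nn.Linear(4, 2)


# --- SL phase: take one step so Adam accumulates per-parameter state ---
sl_model = build_model()
sl_optimizer = torch.optim.Adam(sl_model.parameters(), lr=1e-3)  # assumed lr
loss = sl_model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
sl_optimizer.step()

# Checkpoint the model weights only (no optimizer state), here into memory.
buffer = io.BytesIO()
torch.save({"model": sl_model.state_dict()}, buffer)
buffer.seek(0)

# --- RL phase: restore the weights, then create a fresh optimizer ---
rl_model = build_model()
checkpoint = torch.load(buffer)
rl_model.load_state_dict(checkpoint["model"])
rl_optimizer = torch.optim.Adam(rl_model.parameters(), lr=5e-5)  # assumed lr

# The SL optimizer holds moment estimates; the RL optimizer holds none yet.
sl_state_entries = len(sl_optimizer.state_dict()["state"])
rl_state_entries = len(rl_optimizer.state_dict()["state"])
print(sl_state_entries, rl_state_entries)
```

With a fresh optimizer, Adam's running moment estimates from the SL phase are discarded, so the first RL updates are driven only by RL gradients.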