Optimizer state dict for RL fine-tuning

aleSuglia opened this issue


When you start RL fine-tuning for image captioning, do you load the optimiser state that you used during the SL phase, or do you start with a brand-new optimiser that is used only for the RL training phase? In your current codebase I can see that optimiser state checkpointing is completely disabled.

Many thanks.


@aleSuglia No, the optimizer state from the SL phase is not loaded. RL fine-tuning starts with a brand-new optimizer.
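
For illustration, the hand-off between the two phases could look roughly like the sketch below. It assumes PyTorch and Adam; the `nn.Linear` model, the learning rates, and the in-memory checkpoint buffer are hypothetical stand-ins rather than code from this repository. Only the model weights cross from SL to RL, so the RL optimizer begins with empty state (no moment estimates carried over).

```python
import io

import torch
import torch.nn as nn

# Hypothetical stand-in for the captioning model (assumption, not the repo's model).
model = nn.Linear(4, 2)

# --- SL phase: train with its own optimizer ---
sl_optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # lr is illustrative
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
sl_optimizer.step()

# Checkpoint only the model weights; the optimizer state is deliberately not saved.
buffer = io.BytesIO()  # in-memory stand-in for a checkpoint file
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# --- RL phase: restore the weights, then build a brand-new optimizer ---
rl_model = nn.Linear(4, 2)
rl_model.load_state_dict(torch.load(buffer))
rl_optimizer = torch.optim.Adam(rl_model.parameters(), lr=5e-5)  # lr is illustrative

# The SL optimizer holds per-parameter state after its step; the RL one holds none yet.
sl_state_entries = len(sl_optimizer.state_dict()["state"])
rl_state_entries = len(rl_optimizer.state_dict()["state"])
```

Resuming the SL optimizer instead would mean also saving `sl_optimizer.state_dict()` and calling `load_state_dict` on the new optimizer, which is the step that is skipped here.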