Optimizer state dict for RL fine-tuning
aleSuglia opened this issue
Alessandro Suglia commented
Hi,
when you start RL fine-tuning for image captioning, do you load the optimiser state from the SL phase, or do you start with a brand-new optimiser that is used only for the RL training phase? In the current codebase I can see that optimiser state checkpointing is completely disabled.
Many thanks,
Alessandro
Luowei Zhou commented
@aleSuglia No, the optimizer state from the SL phase is not loaded. RL fine-tuning starts with a brand-new optimizer.
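
For later readers, here is a minimal PyTorch sketch of what starting RL fine-tuning with a brand-new optimizer can look like: only the model weights are carried over from the SL checkpoint, and the optimizer is constructed from scratch. This is an illustration under assumptions, not the repository's actual code. The `nn.Linear` stand-in model, the choice of Adam, the learning rates, and the checkpoint key `"model"` are all hypothetical.

```python
import io

import torch
from torch import nn

# Hypothetical stand-in for the captioning model (not the repo's architecture).
model = nn.Linear(4, 2)

# SL phase: take one training step so Adam accumulates per-parameter state.
sl_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
sl_optimizer.step()

# Checkpoint only the model weights, with no optimizer state.
# An in-memory buffer is used here instead of a file on disk.
buffer = io.BytesIO()
torch.save({"model": model.state_dict()}, buffer)
buffer.seek(0)

# RL phase: restore the weights, then build a fresh optimizer.
checkpoint = torch.load(buffer)
rl_model = nn.Linear(4, 2)
rl_model.load_state_dict(checkpoint["model"])
rl_optimizer = torch.optim.Adam(rl_model.parameters(), lr=1e-5)  # assumed RL learning rate

# Count how many parameters have accumulated optimizer state in each phase.
sl_state_entries = len(sl_optimizer.state_dict()["state"])
rl_state_entries = len(rl_optimizer.state_dict()["state"])
print(sl_state_entries, rl_state_entries)
```

The fresh optimizer holds no moment estimates until its first `step()`, so the RL phase starts without any momentum carried over from SL training. Resuming the SL optimizer instead would require saving `sl_optimizer.state_dict()` in the checkpoint and calling `load_state_dict` on the new optimizer.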