zengyan-97 / X-VLM

X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)

Hi, could you provide the specific commands for fine-tuning on COCO captioning? Thanks!

yaolinli opened this issue

I am confused about the "lm_domain_pretrain.th" file in "4m_base_finetune/coco_caption/". To reproduce the fine-tuning results on COCO captioning, which pre-trained checkpoint should I load: "lm_domain_pretrain.th" or "4m_base_model_state_step_199999.th"? Could you also provide the specific commands for the two-stage fine-tuning on COCO captioning? Thanks!

Hi,

More examples of running captioning have been added to the README.