bo-miao / SgMg

[ICCV 2023] Spectrum-guided Multi-granularity Referring Video Object Segmentation.


Training time

itruonghai opened this issue

Hi, thanks for your great work. I would like to ask about the training time for both pretraining and finetuning. In ReferFormer, pretraining takes 2 days on 32 V100 GPUs. How about SgMg?

Hi,

Thank you for your interest in our work!

With 2 RTX 3090 GPUs and VideoSwin-T, pretraining on the RefCOCO datasets takes under 11 hours per epoch (under 5.5 days for the whole pretraining), and finetuning on Ref-YouTube-VOS takes under 4.5 hours per epoch (under 27 hours for the whole finetuning).

Therefore, with 2 RTX 3090 GPUs and VideoSwin-T, the whole pretraining plus finetuning takes about 6.5 days.
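
For anyone budgeting their own run, the arithmetic behind these figures can be checked with a short sketch. It uses only the upper bounds quoted in this thread. The epoch counts it prints are inferred from those bounds, not read from the SgMg configs, and all variable and function names are made up for illustration.

```python
# Upper-bound figures quoted in the reply (2x RTX 3090, VideoSwin-T).
NUM_GPUS = 2
PRETRAIN_HOURS_PER_EPOCH = 11.0
PRETRAIN_TOTAL_HOURS = 5.5 * 24      # "<5.5 days" expressed in hours
FINETUNE_HOURS_PER_EPOCH = 4.5
FINETUNE_TOTAL_HOURS = 27.0


def implied_epochs(total_hours: float, hours_per_epoch: float) -> float:
    """Epoch count implied by a total time and a per-epoch time.

    This is an inference from the quoted bounds, not a value taken
    from the repository's training configuration.
    """
    return total_hours / hours_per_epoch


pretrain_epochs = implied_epochs(PRETRAIN_TOTAL_HOURS, PRETRAIN_HOURS_PER_EPOCH)
finetune_epochs = implied_epochs(FINETUNE_TOTAL_HOURS, FINETUNE_HOURS_PER_EPOCH)

# Wall-clock total for pretraining plus finetuning, in days.
total_days = (PRETRAIN_TOTAL_HOURS + FINETUNE_TOTAL_HOURS) / 24

# GPU-days scale the wall-clock time by the number of GPUs used.
total_gpu_days = NUM_GPUS * total_days

print(f"implied pretraining epochs: {pretrain_epochs:.0f}")
print(f"implied finetuning epochs:  {finetune_epochs:.0f}")
print(f"total wall-clock time:      {total_days:.3f} days")
print(f"total GPU-days:             {total_gpu_days:.2f}")
```

The wall-clock total comes out slightly above 6.5 days because both inputs are upper bounds, which is consistent with the "about 6.5 days" estimate. GPU-days on RTX 3090 and V100 cards are not directly comparable, so the last number is only a rough guide when comparing against the ReferFormer figure in the question.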