offline training details
j50888 opened this issue
Hi,
May I ask you for the offline training details?
Are the learning rate and momentum the same as in online training (1e-8, 0.9)?
And which supervision was used?
And which initialization method was used for the variables not in VGG-16 (score-dsn_2, score-dsn_2-up, etc.)?
BTW, how do you choose which supervision to use? (e.g., why is only the main loss used during online training in the provided demo.py?)
Thanks!
Hello,
The offline training is done in the file "osvos_parent_demo.py". Here is a summary of the training details:
- The momentum is the same in online and offline training (0.9).
- Regarding the learning rate and the supervision used to train the parent network, the following schedule is applied (sketched as code right after this list):
  - 10K iterations with supervision 1 and learning rate 1e-8
  - 5K iterations with supervision 1 and learning rate 1e-9
  - 10K iterations with supervision 2 and learning rate 1e-8
  - 5K iterations with supervision 2 and learning rate 1e-9
  - 10K iterations with supervision 3 and learning rate 1e-8
  - 10K iterations with supervision 3 and learning rate 1e-9
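In code, this schedule amounts to consecutive training stages, each resuming from the checkpoint of the previous one. Here is a minimal sketch: the (supervision, learning rate, iterations) triples come from the list above, but the loop body is illustrative (the real entry point is `train_parent` in `osvos.py`; check `osvos_parent_demo.py` for its exact argument list):

```python
# The (supervision, learning_rate, iterations) triples mirror the schedule
# above; everything else in this sketch is illustrative.
schedule = [
    (1, 1e-8, 10000),
    (1, 1e-9, 5000),
    (2, 1e-8, 10000),
    (2, 1e-9, 5000),
    (3, 1e-8, 10000),
    (3, 1e-9, 10000),
]

global_step = 0
for supervision, learning_rate, iterations in schedule:
    global_step += iterations
    # In osvos_parent_demo.py each stage is a call to osvos.train_parent(...)
    # with supervision as the third argument, resuming from the previous
    # checkpoint; here we only print the stage for illustration.
    print('train until step %d: supervision=%d, lr=%.0e'
          % (global_step, supervision, learning_rate))
```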
The supervision is the third argument of both training functions "train_parent" and "train_finetune". It can take the following values:
- 1 for strong supervision. Loss of side outputs multiplied by 1.
- 2 for weak supervision. Loss of side outputs multiplied by 0.5.
- 3 for no side supervision. Loss of side outputs multiplied by 0.
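So the supervision flag only changes the weight applied to the side-output losses before they are added to the main loss. A minimal sketch of that weighting (the function and variable names here are mine, not the repo's):

```python
def total_loss(main_loss, side_losses, supervision):
    """Weight the side-output losses according to the supervision flag.

    1 -> strong supervision  (side weight 1.0)
    2 -> weak supervision    (side weight 0.5)
    3 -> no side supervision (side weight 0.0)
    """
    side_weight = {1: 1.0, 2: 0.5, 3: 0.0}[supervision]
    return main_loss + side_weight * sum(side_losses)

# Example: weak supervision halves the contribution of the side losses,
# i.e. 0.7 + 0.5 * (0.2 + 0.3 + 0.1).
print(total_loss(0.7, [0.2, 0.3, 0.1], supervision=2))
```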
The weights are initialized using "tf.random_normal_initializer(stddev=0.001)" and the biases are initialized to zero. This is defined in the function "osvos_arg_scope" in "osvos.py".
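For reference, that arg_scope pattern looks roughly like the sketch below. It is simplified and written from memory (the actual `osvos_arg_scope` likely sets additional options such as regularizers), so treat everything except the two initializers as an assumption and check `osvos.py` for the real version:

```python
import tensorflow as tf
slim = tf.contrib.slim  # TF1-style slim, as used by the repo

def osvos_arg_scope(weight_decay=0.0002):  # default value is an assumption
    """Simplified sketch of the arg_scope defined in osvos.py."""
    with slim.arg_scope([slim.conv2d, slim.convolution2d_transpose],
                        # Layers not present in VGG-16 (score-dsn_*, the
                        # upsampling layers, etc.) start from small random
                        # weights and zero biases.
                        weights_initializer=tf.random_normal_initializer(stddev=0.001),
                        biases_initializer=tf.zeros_initializer()) as arg_sc:
        return arg_sc
```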
Let me know if this clarifies your doubts!
Thank you!
I understand now.