lmb-freiburg / flownet2

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Home Page: https://lmb.informatik.uni-freiburg.de/Publications/2017/IMKDB17/


How to fine-tune on Sintel

jsczzzk opened this issue · comments

Hi, I'm a little confused about fine-tuning on Sintel.

I have trained FlowNet-s on Chairs for 600k iterations and then fine-tuned on Sintel.

I find that the model cannot converge when I fine-tune on Sintel; the EPE on the validation set looks like this:

[Screenshot: validation EPE curve]
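(For reference, EPE here is the average endpoint error: the mean Euclidean distance between predicted and ground-truth flow vectors. A minimal NumPy version, with array shapes of my own choosing:)

```python
import numpy as np

def average_epe(flow_pred, flow_gt):
    """Mean endpoint error between two flow fields of shape (H, W, 2)."""
    diff = flow_pred - flow_gt                   # per-pixel vector difference
    epe_map = np.sqrt((diff ** 2).sum(axis=-1))  # per-pixel Euclidean distance
    return float(epe_map.mean())                 # average over all pixels
```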

The validation set I chose follows this; it contains 320 pairs of images.
The data augmentations used in the model are listed below (a rough sketch of such a pipeline follows the list):

1. rotation
2. translation
3. horizontal flip
4. no scaling, because I found that adding scaling increases the EPE on the test set; is that right?
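A minimal sketch of what such an augmentation stack does to an image pair and its flow field (NumPy only; the parameter ranges are my guesses, not the ones from the paper, and rotation is omitted because it additionally requires rotating the flow vectors themselves):

```python
import numpy as np

def augment(img1, img2, flow, rng=np.random.default_rng()):
    """Horizontal flip and translation for (img1, img2, flow).

    img1, img2: (H, W, 3) arrays; flow: (H, W, 2) array of (dx, dy).
    """
    # Horizontal flip: mirror images and flow, and negate the x-component.
    if rng.random() < 0.5:
        img1, img2 = img1[:, ::-1], img2[:, ::-1]
        flow = flow[:, ::-1].copy()
        flow[..., 0] *= -1
    # Translate the second image by (tx, ty); the shift adds to the flow.
    # (np.roll wraps around at the borders, a simplification for this sketch.)
    tx, ty = rng.integers(-10, 11, size=2)
    img2 = np.roll(img2, shift=(ty, tx), axis=(0, 1))
    flow = flow + np.array([tx, ty], dtype=flow.dtype)
    return img1, img2, flow
```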
How can I fine-tune on Sintel so that I get the lowest EPE on the test set?
Thanks.

What did you use for fine-tuning? Did you increase the learning rate again as we did in the paper (see the "S_fine" schedule)?

The fine-tuning schedule I used in my experiment is as follows:

  1. I trained FlowNetS for 600k iterations using the S_short schedule, and the final EPE on the validation set was 2.02.

  2. Then I chose the following scenes as the validation set, as you did in this:
    ambush_6, bamboo_2, parts of cave_4, market_6, parts of temple_2.
    For cave_4 and temple_2, I added the last 25 frames (25 to 49) to my validation set,
    so the training set contains 1762 pairs and the validation set contains 320 (160 clean, 160 final); a sketch of this split follows the list.

  3. I used the S_fine schedule for fine-tuning, validating on the validation set every 5000 iterations; the EPE on the validation set does not converge, as you can see above.
    When I test one of the models produced during training on the Sintel training set, it works fine even though validation does not converge; the EPE on the whole training set is listed below:

  • EPE on clean: 2.264

  • EPE on final: 2.537

    This is much better than in the original paper (clean: 3.66, final: 4.44), but it works worse on the test set, as shown below:

  • EPE on test clean: 7.584

  • EPE on test final: 8.136
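The split in step 2 can be written down explicitly. A sketch (the scene names are the ones from this post; the frame-numbering convention is my assumption about the Sintel layout):

```python
# Scenes fully held out for validation, per step 2 above.
FULL_VAL_SCENES = {"ambush_6", "bamboo_2", "market_6"}
# Scenes held out only partially: frames 25-49 go to validation.
PARTIAL_VAL_SCENES = {"cave_4", "temple_2"}

def is_validation(scene, frame_idx):
    """True if the pair starting at (scene, frame_idx) belongs to validation."""
    if scene in FULL_VAL_SCENES:
        return True
    return scene in PARTIAL_VAL_SCENES and 25 <= frame_idx <= 49
```

The same predicate is applied to both the clean and the final pass, which is how the 320 validation pairs split into 160 + 160.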

This is counter-intuitive, as the training EPE is much better.
So how can I choose a model that works better on the Sintel test set?

I am afraid I do not really have an answer. The network can definitely train on Sintel, and I do not have an intuition for what specifically may be going wrong with yours.

So how can I choose a model that works better on the Sintel test set?

I think if anyone had an answer to this question, the optical flow research community would be very interested in hearing it 😉

Thank you for your patient explanations 😄. This is my first step into optical flow, so I'm sorry to have asked you so many questions.
I think it may be caused by the learning rate, though I am not sure. When I tried the learning rate from this, my model does converge; maybe this learning rate fits my model better? The learning rate looks like this:

[Image: the learning-rate schedule I used]

Compared with S_fine:

[Image: the S_fine schedule]
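Both are step schedules of the same shape: hold a base rate, then halve it at fixed iteration boundaries. A generic sketch (the base rate and boundaries below are placeholders, not the published S_fine values, which are in the paper and the released solver files):

```python
def step_lr(iteration, base_lr, boundaries, factor=0.5):
    """Piecewise-constant schedule: multiply the rate by `factor` at each boundary."""
    lr = base_lr
    for b in boundaries:
        if iteration >= b:
            lr *= factor
    return lr

# Example: a fine-tuning schedule that restarts at a (placeholder) low rate
# after the initial training and then decays in steps.
lr = step_lr(750_000, base_lr=1e-5, boundaries=[700_000, 800_000, 900_000])  # -> 5e-6
```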
The EPE on the validation set over the first 150k steps looks like this:

[Screenshot: validation EPE curve]

It looks much better than above, so I tested the model on the Sintel training set.
The EPE on the whole training set is:

clean: 1.31

final: 1.49

It works better than the original FlowNetS:

[Image: results table from the paper]

But it works worse on the test set:

clean: 6.953

final: 7.304

Does this show that my model overfits the Sintel training set?

  • Your learning rate should be OK, even though it's not the same as our schedule.
  • Your validation agrees with this and indicates good convergence.
  • However, your training loss is really extremely good, and this could indeed mean that overfitting is happening. I'm not sure what to make of this in combination with your validation loss, which looks good.
  • You write that your network "works worse on the test set", but I don't see how; note that the "+v" networks include a variational refinement on top of the core network output. You want to compare to "FlowNetS+ft".

If you are training an "S" network (and not "s", as you wrote in your first post), then the test loss is actually better than our old numbers.

Thank you so much; I will try some means of regularizing the model, as sketched below.
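(One common option is L2 weight decay on the network parameters, i.e. adding a penalty term to the training loss. A generic sketch, not tied to the Caffe setup used by this repository; the coefficient is a typical but arbitrary value:)

```python
import numpy as np

def loss_with_weight_decay(data_loss, weights, wd=4e-4):
    """Add an L2 penalty over all parameter arrays to the data term."""
    penalty = sum(float((w ** 2).sum()) for w in weights)
    return data_loss + wd * penalty

# Usage: total = loss_with_weight_decay(epe_loss, [w1, w2, ...])
```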