Label propagation: predictions before context has burned in
vadimkantorov opened this issue · comments
@ajabri Could you please explain how results are filled in for the first n_context = 20 frames? Are they copied from ground truth? The paper suggests that the ground truth is only used for the 1st frame, but I can't find where predictions for the 2nd-20th frames are filled in. Are they filled in as background?
From what I could see, predictions affect lbls only after n_context frames: https://github.com/ajabri/videowalk/blob/0834ff9/code/test.py#L144-L148
    if t > 0:
        lbls[t + n_context] = pred
    else:
        pred = lbls[0]
        lbls[t + n_context] = pred
For DAVIS evaluation, the frames are saved at index t and not t + n_context: https://github.com/ajabri/videowalk/blob/0834ff9/code/test.py#L168
outpath = os.path.join(args.save_path, str(vid_idx) + '_' + str(t))
Are these 2nd-20th frames included in the error metric evaluation? And what predictions are used for these frames?
Thanks, @ajabri !
Hi @vadimkantorov, thanks for the question. Sorry the code is a bit confusing.
lbls should have T + n_context label maps. The first n_context are copies of the first frame's labels. This is just to make the implementation simpler. As we propagate labels, we put the predicted label maps back into lbls to satisfy the recurrence.
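The buffer layout described above can be sketched roughly as follows. This is a toy illustration, not the code in test.py: `propagate_labels` and `predict` are hypothetical names, label maps are stood in for by arbitrary Python objects, and the context is assumed to be a plain sliding window of the previous n_context slots (the repository may select context differently).

```python
def propagate_labels(first_label, num_frames, n_context, predict):
    """Toy sketch of a label buffer padded with copies of the first frame's labels.

    `predict(context, t)` is a hypothetical stand-in for the model: it receives
    the n_context label maps preceding slot t + n_context and returns a label
    map for frame t.
    """
    # Slots 0 .. n_context-1 hold copies of the first frame's ground-truth labels;
    # the remaining num_frames slots are filled in as propagation proceeds.
    lbls = [first_label] * n_context + [None] * num_frames

    for t in range(num_frames):
        if t > 0:
            # Assumed sliding window: the n_context slots just before t + n_context.
            context = lbls[t:t + n_context]
            pred = predict(context, t)
        else:
            # Frame 0 is not predicted; its ground truth is used directly.
            pred = lbls[0]
        # Write the result back so later frames can use it as context.
        lbls[t + n_context] = pred

    return lbls
```

With this layout, frames 1 and later always find a full window of n_context label maps behind them, which is why no special case is needed while the context "burns in".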
Only the last T label maps are dumped to file. So although the file path uses index t, the data that is dumped, named pred, is actually lbls[t + n_context].
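To make the index mapping concrete, here is a small sketch of which buffer slot ends up under which file name. `dump_plan` is a hypothetical helper written for this illustration; only the path expression is taken from the test.py line quoted earlier in this thread, and nothing is actually written to disk.

```python
import os


def dump_plan(lbls, n_context, save_path, vid_idx):
    """Pair each output path (indexed by t) with the label map stored at t + n_context."""
    num_frames = len(lbls) - n_context  # only the last T slots are dumped
    plan = []
    for t in range(num_frames):
        # The file name uses the frame index t ...
        outpath = os.path.join(save_path, str(vid_idx) + '_' + str(t))
        # ... while the data comes from the padded buffer at t + n_context.
        plan.append((outpath, lbls[t + n_context]))
    return plan
```

Under these assumptions, the first n_context padding slots never reach the output directory, so every frame in the video gets exactly one dumped label map.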
Does this make sense?
Ah, I see! So VOSDataset would insert the first frame copied 20 times in the queue, right?
Yes
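For readers skimming the thread, the frame-side padding confirmed here could look something like the sketch below. `pad_with_first_frame` is a hypothetical helper, not the actual VOSDataset code, and frames are represented by arbitrary Python objects.

```python
def pad_with_first_frame(frames, n_context):
    """Prepend n_context copies of the first frame (illustrative only)."""
    # The result has len(frames) + n_context entries, matching the
    # T + n_context label maps discussed in this thread.
    return [frames[0]] * n_context + list(frames)
```

With n_context = 20, a video of T frames would become a queue of T + 20 frames whose first 21 entries are all the first frame.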