ajabri / videowalk

Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Home Page: http://ajabri.github.io/videowalk


Label propagation: predictions before context has burned in

vadimkantorov opened this issue

@ajabri Could you please explain how results are filled in for the first n_context = 20 frames? Are they copied from the ground truth? The paper suggests that the ground truth is only used for the 1st frame, but I can't find where predictions for the 2nd-20th frames are filled in. Are they filled in as background?

From what I could see, predictions affect lbls only after n_context frames (https://github.com/ajabri/videowalk/blob/0834ff9/code/test.py#L144-L148):

if t > 0:
    lbls[t + n_context] = pred   # write the prediction back into the buffer
else:
    pred = lbls[0]               # t == 0: reuse the ground-truth first-frame labels
    lbls[t + n_context] = pred

For DAVIS evaluation, the frames are saved at index t and not t + n_context (https://github.com/ajabri/videowalk/blob/0834ff9/code/test.py#L168):

outpath = os.path.join(args.save_path, str(vid_idx) + '_' + str(t))

Are these 2nd-20th frames included in the error metric evaluation? And what predictions are used for these frames?

Thanks, @ajabri !

Hi @vadimkantorov, thanks for the question. Sorry, the code is a bit confusing.

lbls should hold T + n_context label maps. The first n_context entries are the first frame's labels, copied; this is just to make the implementation simpler. As we propagate labels, we put the predicted label maps back into lbls, to satisfy the recurrence.
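
For concreteness, here is a minimal sketch of that buffer logic (not the repo's exact code; the sizes are illustrative and propagate_labels is a hypothetical stand-in for the model's attention-based propagation step):

import torch

T, n_context, H, W, C = 50, 20, 32, 32, 8      # illustrative sizes

first_frame_lbls = torch.zeros(H, W, C)        # ground-truth labels of frame 0

# The buffer holds T + n_context label maps; the first n_context entries are
# the first frame's labels, copied, purely to keep the indexing simple.
lbls = torch.stack([first_frame_lbls] * (T + n_context))

def propagate_labels(context_lbls, t):
    # Placeholder for the real propagation step (attention over the context).
    return context_lbls.mean(0)

for t in range(T):
    if t > 0:
        pred = propagate_labels(lbls[t:t + n_context], t)
    else:
        pred = lbls[0]                         # at t == 0, reuse the ground truth
    # Write the prediction back so later steps can attend to it (the recurrence).
    lbls[t + n_context] = pred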

Only the last T label maps are dumped to file. So while the file path uses index t, the data that is actually dumped (named pred) is lbls[t + n_context].
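
Continuing the sketch above, the dumping offset would look roughly like this (save_path and vid_idx are illustrative values, and the commented-out save call stands in for the repo's actual writer):

import os

save_path, vid_idx = './results', 0            # illustrative values
os.makedirs(save_path, exist_ok=True)
for t in range(T):
    pred = lbls[t + n_context]                 # the prediction for query frame t
    outpath = os.path.join(save_path, str(vid_idx) + '_' + str(t))
    # torch.save(pred, outpath)                # stand-in for the repo's writer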

Does this make sense?

Ah, I see! So VOSDataset would insert the first frame, copied 20 times, at the front of the queue, right?

Yes
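
In other words, the dataset-side padding amounts to something like the following sketch (assuming per-video lists frames and lbls; VOSDataset's actual fields and code may differ):

def pad_context(frames, lbls, n_context=20):
    # Prepend n_context copies of the first frame and its labels, so that every
    # query frame has a full window of n_context predecessors to attend to.
    frames = [frames[0]] * n_context + list(frames)
    lbls = [lbls[0]] * n_context + list(lbls)
    return frames, lbls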