TengdaHan / DPC

Video Representation Learning by Dense Predictive Coding. Tengda Han, Weidi Xie, Andrew Zisserman.


Not understanding why you take the last sequences and not the last samples of each sequence

wolhandlerdeb opened this issue · comments

in dpc/model_3d.py
feature_inf = feature_inf_all[:, N-self.pred_step::, :].contiguous()
N is supposed to be the number of sequences; don't we aim to predict the last samples of each sequence?

No, more than the last one.
E.g. if the task is 5pred3, the clip contains 8 steps in total, and this line of code stores the features of the last 3 steps (the prediction targets).
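A minimal sketch of what that slice does, using NumPy arrays in place of PyTorch tensors (the shapes here follow the example numbers in this thread; `feature_inf_all` and `pred_step` mirror the names in dpc/model_3d.py):

```python
import numpy as np

# Illustrative shapes from the thread: B=16 videos, N=8 time steps,
# feature dimension flattened to C=256 for brevity.
B, N, C = 16, 8, 256
pred_step = 3  # "5pred3": 5 context steps predict the last 3 steps

feature_inf_all = np.random.randn(B, N, C)

# The line in question slices the TEMPORAL axis (axis 1), not the batch axis.
# Note: in the repo it is written `N - self.pred_step::`, where the trailing
# `::` is equivalent to a plain `:` (slice to the end, default step of 1).
feature_inf = feature_inf_all[:, N - pred_step:, :]

print(feature_inf.shape)  # (16, 3, 256)
```

So each of the 16 videos keeps its own last 3 time steps; no videos in the batch are dropped.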

Yes, I did understand that.
I meant: N is the number of sequences and SL is the length of each sequence, so why are you taking the features of the last self.pred_step sequences rather than the features of the last steps of each sequence?
Did I get it right?

If I understand correctly, this probably helps.
The pipeline is:
input video: [B, N, 3, SL, 128, 128], e.g. [16, 8, 3, 5, 128, 128]
extract feature z for all B*N samples, getting [B, N, C, H, W], e.g. [16, 8, 256, 4, 4]
Up to now, we have 16 videos, each with feature maps for 8 time steps.
Then we do the 5pred3 or 4pred4 task based on these features.
So we use the first part of them to predict the last part of them (along the temporal axis).
Does this clarify your question?
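The shape bookkeeping above can be traced end to end in a few lines. This is a shape-only sketch (the zero arrays stand in for real frames and for the 3D-CNN encoder output; variable names are illustrative, not taken from the repo):

```python
import numpy as np

B, N, SL = 16, 8, 5     # batch, blocks per video, frames per block
C, H, W = 256, 4, 4     # feature channels and spatial size after the encoder
pred_step = 3           # the 5pred3 task

video = np.zeros((B, N, 3, SL, 128, 128))        # input clips

# The encoder processes each of the B*N blocks independently, so the batch
# and block axes are folded together before encoding...
blocks = video.reshape(B * N, 3, SL, 128, 128)
encoded = np.zeros((B * N, C, H, W))             # stand-in for encoder output

# ...and unfolded afterwards, recovering the temporal axis.
features = encoded.reshape(B, N, C, H, W)

# Split along the temporal axis (axis 1): the first N - pred_step steps are
# context for the aggregator, the last pred_step steps are the targets.
context = features[:, :N - pred_step]            # (16, 5, 256, 4, 4)
targets = features[:, N - pred_step:]            # (16, 3, 256, 4, 4)

print(context.shape, targets.shape)
```

This is why the slice runs over axis 1: every video in the batch contributes its own 5 context steps and 3 target steps.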

I think I understand. Thanks a lot