Not understanding why you take the last sequences and not the last samples of the sequences
wolhandlerdeb opened this issue · comments
In dpc/model_3d.py:
feature_inf = feature_inf_all[:, N-self.pred_step::, :].contiguous()
N is supposed to be the number of sequences. Don't we aim to predict the last samples of each sequence?
No, more than just the last one.
For example, if the task is 5pred3, we take 8 steps as input, and this line of code stores the features of the last 3 steps.
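To make the slice concrete, here is a minimal sketch. It uses NumPy arrays in place of the PyTorch tensors (so `.contiguous()` is left out), and the small sizes and the step-index fill values are assumptions chosen only so the result is easy to read. Note that `N-pred_step::` in the original line is the same slice as `N-pred_step:`.

```python
import numpy as np

# Assumed toy sizes; the real feature map in this thread is [16, 8, 256, 4, 4].
B, N, C, H, W = 2, 8, 4, 1, 1
pred_step = 3  # the "3" in 5pred3

# Fill every feature of step n with the value n, so the selection is visible.
step_ids = np.arange(N).reshape(1, N, 1, 1, 1)
feature_inf_all = np.broadcast_to(step_ids, (B, N, C, H, W)).copy()

# Same indexing as in the quoted line: slice axis 1 (the N axis), keep all the rest.
feature_inf = feature_inf_all[:, N - pred_step:, :]

print(feature_inf.shape)           # (2, 3, 4, 1, 1)
print(feature_inf[0, :, 0, 0, 0])  # [5 6 7]
```

The slice acts on axis 1 only, so it keeps steps 5, 6 and 7 for every video in the batch.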
Yes, I understood that.
I meant that N is the number of sequences and SL is the length of each sequence. So why do you take the features of the last self.pred_step sequences, rather than the features of the last steps of every sequence?
Did I get it right?
If I understand your question correctly, this should help:
The pipeline is:
input video: [B, N, 3, SL, 128, 128], e.g. [16,8,3,5,128,128]
extract feature z for all B*N samples and get [B,N,C,H,W], e.g. [16,8,256,4,4]
At this point we have 16 videos, each with a feature map for 8 time steps.
Then we do the 5pred3 or 4pred4 task based on these features.
So we use the first part of the features to predict the last part (along the temporal axis, N).
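The shapes listed above can be traced with a short sketch. This is not the repository's code: `fake_encoder` and `split` are hypothetical helpers, the encoder is a pooling stand-in that only reproduces the shapes, NumPy replaces PyTorch, and B is reduced from 16 to 2 to keep memory small. The point is that SL no longer exists once z is extracted, so N is the only temporal axis left to split.

```python
import numpy as np

B, N, SL, C = 2, 8, 5, 256  # B reduced from 16 (assumption, to keep the example light)


def fake_encoder(blocks):
    """Hypothetical stand-in for the encoder: [M, 3, SL, 128, 128] -> [M, C, 4, 4]."""
    m = blocks.shape[0]
    pooled = blocks.mean(axis=(1, 2))                           # collapse RGB and SL -> [M, 128, 128]
    pooled = pooled.reshape(m, 4, 32, 4, 32).mean(axis=(2, 4))  # spatial pooling -> [M, 4, 4]
    return np.repeat(pooled[:, None], C, axis=1)                # copy to C channels -> [M, C, 4, 4]


def split(z, pred_step):
    """Hypothetical helper: first N - pred_step steps as context, last pred_step as target."""
    n = z.shape[1]
    return z[:, :n - pred_step], z[:, n - pred_step:]


video = np.zeros((B, N, 3, SL, 128, 128), dtype=np.float32)

# Encode all B*N blocks at once, then restore the [B, N, ...] layout.
z = fake_encoder(video.reshape(B * N, 3, SL, 128, 128)).reshape(B, N, C, 4, 4)

ctx53, tgt53 = split(z, 3)  # 5pred3
ctx44, tgt44 = split(z, 4)  # 4pred4

print(z.shape)                   # (2, 8, 256, 4, 4)
print(ctx53.shape, tgt53.shape)  # (2, 5, 256, 4, 4) (2, 3, 256, 4, 4)
print(ctx44.shape, tgt44.shape)  # (2, 4, 256, 4, 4) (2, 4, 256, 4, 4)
```

Each of the N blocks contributes one feature map, so "the last pred_step sequences" and "the last pred_step time steps" refer to the same slice.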
Does this clarify your question?
I think I understand. Thanks a lot