TengdaHan / DPC

Video Representation Learning by Dense Predictive Coding. Tengda Han, Weidi Xie, Andrew Zisserman.

What do you mean by sequential prediction in your paper?

sarimmehdi opened this issue

Hello. I read in your paper that you did an ablation study where you replaced sequential prediction with parallel prediction, predicting each of the three time steps with its own fully connected layer. I understand parallel prediction, but I don't understand what you really mean by sequential prediction.

By sequential prediction do you mean that at every time step you use the same fully connected layer to do the prediction or that you use one fully connected layer at the final time step only?

commented

Hi,

I'll use Figure 2 in the paper to explain this.

As you mentioned, parallel prediction means predicting ẑ_{t+1}, ẑ_{t+2}, ... with separate FC layers.

Sequential prediction simply means an autoregressive model, which is why we use the ConvGRU. In detail, the context vector c_t, computed from z_t and the hidden state from the previous step (not shown in the figure), is used to predict ẑ_{t+1}.

Then the predicted ẑ_{t+1} and the hidden state are fed back in to compute c_{t+1}. Ideally, c_{t+1} now includes all the information from step t+1, and it can be used to predict ẑ_{t+2}.
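
To make the difference concrete, here is a minimal sketch in plain Python. It is not the code from the DPC repository. It rests on several assumptions: a toy tanh recurrent cell stands in for the ConvGRU, features are flat vectors rather than feature maps, and the recurrent state doubles as the context c_t. All names (`rnn_step`, `aggregate`, `sequential_predict`, `parallel_predict`, the weight matrices) are made up for illustration.

```python
import math
import random


def linear(W, x):
    """Matrix-vector product; W is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_step(W_in, W_h, z, h):
    """Toy stand-in for the ConvGRU: next state from input z and previous state h."""
    return [math.tanh(a + b) for a, b in zip(linear(W_in, z), linear(W_h, h))]


def aggregate(W_in, W_h, zs):
    """Run the recurrent cell over the observed features to get the context c_t."""
    h = [0.0] * len(W_h)
    for z in zs:
        h = rnn_step(W_in, W_h, z, h)
    return h


def sequential_predict(W_in, W_h, W_pred, zs, steps):
    """Autoregressive rollout: one shared predictor, each prediction is fed back."""
    c = aggregate(W_in, W_h, zs)
    preds = []
    for _ in range(steps):
        z_hat = linear(W_pred, c)          # c_t -> predicted z_{t+1}
        preds.append(z_hat)
        c = rnn_step(W_in, W_h, z_hat, c)  # predicted z_{t+1} + state -> c_{t+1}
    return preds


def parallel_predict(W_in, W_h, W_preds, zs):
    """Ablation variant: a separate predictor per future step, all reading the same c_t."""
    c = aggregate(W_in, W_h, zs)
    return [linear(W, c) for W in W_preds]


rng = random.Random(0)
D, H = 4, 3  # feature and state sizes (arbitrary)
W_in, W_h = rand_matrix(H, D, rng), rand_matrix(H, H, rng)
W_pred = rand_matrix(D, H, rng)
zs = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(5)]

seq = sequential_predict(W_in, W_h, W_pred, zs, steps=3)
par = parallel_predict(W_in, W_h, [rand_matrix(D, H, rng) for _ in range(3)], zs)
```

In this toy setup, if the parallel variant is given the same predictor weights for every step, its first prediction matches the sequential one and all later predictions are identical to the first, because the context never changes. Only the sequential rollout updates the context with its own prediction before predicting the next step.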