exiawsh / StreamPETR

[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

Can anyone simply explain about the sliding window and streaming video?

sean-wade opened this issue · comments

I observe that in sliding-window mode, say T=8 with self.num_frame_head_grads = self.num_frame_losses = 2, the for loop calls forward_pts_train without computing any loss for the first 6 iterations.

Doesn't this incur unnecessary computation cost?

```python
def obtain_history_memory(self, **data):
    ...
    for i in range(T):
        requires_grad = False
        return_losses = False
        # slice out the i-th frame of every tensor in the window
        data_t = dict()
        for key in data:
            data_t[key] = data[key][:, i]

        # only the last num_frame_head_grads frames run with gradients,
        # and only the last num_frame_losses frames return a loss
        if i >= num_nograd_frames:
            requires_grad = True
        if i >= num_grad_losses:
            return_losses = True
        loss = self.forward_pts_train(...)
```
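To make the flag schedule concrete, here is a small standalone sketch (the variable values are the ones from the question, T=8 and num_frame_head_grads = num_frame_losses = 2; it only reproduces the flag logic, not the actual forward pass):

```python
# Sketch of the sliding-window flag schedule: with T=8 and
# num_frame_head_grads = num_frame_losses = 2, the first 6 frames run
# without gradients or losses, and only the last 2 are supervised.
T = 8
num_frame_head_grads = 2
num_frame_losses = 2

num_nograd_frames = T - num_frame_head_grads  # 6
num_grad_losses = T - num_frame_losses        # 6

schedule = []
for i in range(T):
    requires_grad = i >= num_nograd_frames
    return_losses = i >= num_grad_losses
    schedule.append((i, requires_grad, return_losses))

for i, grad, loss in schedule:
    print(f"frame {i}: grad={grad}, loss={loss}")
```

Frames 0-5 come out with grad=False, loss=False (memory warm-up only), and frames 6-7 with grad=True, loss=True (supervised).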

After reading the code, I think I understand the reason:
the first 6 loops are used to let the head accumulate memory, and only the last 2 loops are supervised.
Is my understanding right?

@sean-wade Sorry for the late response. You are right: only the last 2 frames are supervised in sliding-window training. Temporal modeling in StreamPETR works in a recurrent manner, so it is important to mitigate error propagation, and a long window size is therefore necessary.
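The recurrent point can be illustrated with a toy sketch (this is not StreamPETR's actual memory implementation; the queue size and update rule here are made up for illustration): the memory is a running state updated every frame, so even the unsupervised warm-up frames shape the state that the supervised frames finally see.

```python
# Toy recurrent memory: a bounded queue of the most recent per-frame
# features. Warm-up frames still update it, which is why they are run
# at all even though they produce no loss.
def update_memory(memory, frame_feat, keep=3):
    # prepend the newest frame's features, keep only the `keep` latest
    return ([frame_feat] + memory)[:keep]

memory = []
frames = [f"feat_{i}" for i in range(8)]  # an 8-frame window, as in T=8
for feat in frames:
    memory = update_memory(memory, feat)

print(memory)  # the state the final (supervised) frames would consume
```

By the time the last 2 frames are processed, the memory already reflects the 6 warm-up frames, so supervision happens against a realistic streaming state.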