Running Detection Model every k frames

Question


HarshPathakhp opened this issue 5 years ago

Hey!
Thanks for your work on the repo.
My question is as follows. In the original SORT repo, the author runs the detection model on every frame. The Hungarian algorithm is then used to match the detections in the current frame against the tracker predictions carried over from the previous frame (the predictions come from a Kalman filter).
However, if detection is run only every k frames and a new object is detected in the current frame (so a new tracker is created for it), there is high uncertainty about how that bounding box will move, since the tracker has only just been created, and the outputs will presumably be noisy for the subsequent k - 1 frames.
Can you please share your insight on this?

Also, I have another question. Do you think the Kalman filter is really necessary? What if we run the detection model on every frame and run the Hungarian algorithm directly between the detections at frame i and the detections at frame i+1 to find the associations? Since no frames are skipped and objects should not move suddenly between frame i and frame i+1, this approach should work.

Thanks once again for your effort on this.

linzai · Answer 1 · Sat Jul 06 2019 10:32:31 GMT+0800 (China Standard Time)

Hi, @HarshPathakhp! Thank you for your question!

In my opinion, under normal circumstances most face movement is fairly uniform, regular and slow, and the acceleration and direction change little over a short period of time, so the Kalman filter's prediction is roughly consistent with the actual face movement in most cases. However, k should not be too large. Otherwise the predicted face box and the actual position of the face will no longer overlap, and tracking fails. A suitable value of k is between 1 and 5.
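To illustrate why k should stay small, here is a minimal Python sketch (not code from this repo). It assumes a square face box moving at constant speed along x and compares the predicted box with the true box at the next detection, k frames later. The helper names (`iou`, `shift`, `overlap_at_next_detection`), the 40 px box size and the 6 px/frame speed are all hypothetical. A freshly created tracker is modelled with `est_speed=0`, which is the case raised in the question.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def shift(box, dx, dy):
    """Translate a box by (dx, dy) pixels."""
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


def overlap_at_next_detection(k, true_speed, est_speed, size=40.0):
    """IoU between the predicted and the true box k frames after a detection.

    true_speed: real motion in px/frame (assumed constant, along x only).
    est_speed:  the tracker's velocity estimate; 0.0 models a new tracker.
    """
    start = (0.0, 0.0, size, size)
    true_box = shift(start, true_speed * k, 0.0)
    pred_box = shift(start, est_speed * k, 0.0)
    return iou(pred_box, true_box)


if __name__ == "__main__":
    for k in (1, 3, 5, 10):
        fresh = overlap_at_next_detection(k, true_speed=6.0, est_speed=0.0)
        warm = overlap_at_next_detection(k, true_speed=6.0, est_speed=6.0)
        print(f"k={k:2d}  new tracker IoU={fresh:.2f}  settled tracker IoU={warm:.2f}")
```

With these assumed numbers, the overlap for a new tracker drops from about 0.74 at k = 1 to about 0.14 at k = 5 and to 0 at k = 10, while a tracker whose velocity estimate is already correct stays at 1.0. Once the overlap falls below the IoU threshold used for matching, the next detection is no longer associated with the tracker.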

For the second question: first, face detection is not reliable on every frame. While a face is moving, detections can be missed because of a low score or occlusion, and that interrupts the track. Since face movement is regular, the Kalman filter can bridge the gap between the last matched frame and the next one by prediction, which keeps the track continuous. Second, most face detection algorithms are time-consuming, so alternating detection with Kalman filter prediction saves computing resources.
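The first point can be shown with a small Python sketch, under several simplifying assumptions: there is only one face (so no Hungarian algorithm is needed), and a plain constant-velocity extrapolation stands in for the Kalman filter. The function `assign_ids`, the thresholds `iou_min=0.3` and `max_age=3`, and the test sequence are hypothetical and are not taken from this repo.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def shift(box, dx, dy):
    """Translate a box by (dx, dy) pixels."""
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


def assign_ids(detections, use_prediction, iou_min=0.3, max_age=3):
    """Give each detection of a single face a track ID.

    detections: one box per frame, or None where the detector missed.
    use_prediction=False: match only against the detection of the previous frame.
    use_prediction=True:  extrapolate the last matched box at constant velocity
                          (a simplified stand-in for the Kalman filter) and
                          tolerate up to max_age missed frames.
    """
    ids, next_id, track = [], 0, None
    for frame, det in enumerate(detections):
        if det is None:
            ids.append(None)
            continue
        matched = False
        if track is not None:
            gap = frame - track["frame"]
            if use_prediction and gap <= max_age + 1:
                vx, vy = track["vel"]
                predicted = shift(track["box"], vx * gap, vy * gap)
                matched = iou(predicted, det) >= iou_min
            elif not use_prediction and gap == 1:
                matched = iou(track["box"], det) >= iou_min
        if matched:
            # Update the velocity estimate from the last two matched boxes.
            vel = ((det[0] - track["box"][0]) / gap,
                   (det[1] - track["box"][1]) / gap)
            track = {"id": track["id"], "box": det, "frame": frame, "vel": vel}
        else:
            # No match: start a new track with unknown (zero) velocity.
            track = {"id": next_id, "box": det, "frame": frame, "vel": (0.0, 0.0)}
            next_id += 1
        ids.append(track["id"])
    return ids


if __name__ == "__main__":
    # A 40 px face moving 15 px/frame; the detector misses frames 3 and 4.
    dets = [(15.0 * f, 0.0, 15.0 * f + 40.0, 40.0) for f in range(7)]
    dets[3] = dets[4] = None
    print("frame-to-frame only:", assign_ids(dets, use_prediction=False))
    print("with prediction:    ", assign_ids(dets, use_prediction=True))
```

In this toy sequence, frame-to-frame matching starts a new ID once the face reappears after the two missed frames, whereas the version with prediction keeps the original ID across the gap.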

Harsh Pathak · Answer 2 · Fri Jul 12 2019 14:31:40 GMT+0800 (China Standard Time)

This answers my question. Thank you for your time!