facebookresearch / co-tracker

CoTracker is a model for tracking any point (pixel) on a video.

Home Page: https://co-tracker.github.io/

Is it possible to track a specified point in video from a webcam?

newforrestgump001 opened this issue

Is it possible to track the point without passing all the frames in as a single batch? Thanks a lot.

Hi @newforrestgump001, sorry for the late response. Yes, you can check out the online demo.

Is this what you are looking for?

@nikitakaraevv Sincere thanks for your response. Suppose a video has 100 frames. First, 5 frames are given to the network as input and it produces an output; the output is then updated as more frames are fed to the network one by one. In other words, does it support updating the tracked result frame by frame?

Hi @newforrestgump001, yes, that's how this demo works, except that the result is currently updated every four frames, not every frame.

@nikitakaraevv Thanks a lot, I got it.

Hi @nikitakaraevv Could we achieve real-time inference by conducting the inference process every four frames? In other words, would it be feasible to perform inference on frames one to four, and then repeat the process on frames five to eight, and so forth? Additionally, would it be necessary to reinitialize the model for each set of four frames? Lastly, could you please share any examples of C++ deployment code that utilizes TensorRT? Your response would be greatly appreciated.

Hi @lutianye, yes, that's exactly how the online demo works: it initialises the model, waits until it has access to the first 8 frames (the sliding window size is 8 frames, but the step is 4 frames), and then processes 4 new frames at a time while always taking only 8 frames as input at every step.
The model is not reinitialised after every step.
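
The windowing described above can be sketched in plain Python. This is only an illustration of the scheduling (window of 8 frames, step of 4), not the demo's actual code: `sliding_windows` and the constants are hypothetical names, and the tracker call itself is deliberately left out.

```python
from collections import deque
from typing import Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")

WINDOW = 8  # sliding window size mentioned in the reply above
STEP = 4    # number of new frames consumed per step, as mentioned above


def sliding_windows(
    frames: Iterable[Frame], window: int = WINDOW, step: int = STEP
) -> Iterator[List[Frame]]:
    """Yield the latest `window` frames each time `step` new frames have arrived.

    Hypothetical helper: nothing is yielded until the first `window` frames
    are available, which mirrors the "waits for the first 8 frames" behaviour.
    """
    buffer = deque(maxlen=window)  # old frames fall out automatically
    new_frames = 0                 # frames received since the last yield
    for frame in frames:
        buffer.append(frame)
        new_frames += 1
        if len(buffer) == window and new_frames >= step:
            new_frames = 0
            yield list(buffer)


if __name__ == "__main__":
    # Integers stand in for frames grabbed one at a time from a camera.
    for chunk in sliding_windows(range(16)):
        print(chunk[0], "...", chunk[-1])
```

In a webcam setting the iterable would be a generator that yields frames as they are captured, and the model, created once outside the loop, would be called on each yielded window.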

Unfortunately, we don't yet have any examples of C++ deployment, but if you let me know what your use case is, we might be able to help you. You can leave your email address or contact me via nikita@robots.ox.ac.uk, and we can discuss it in more detail.

Hi @nikitakaraevv, thank you for your prompt response. I understand what you mean. Our objective is to capture images from a camera while simultaneously performing tracking, so we cannot feed all frames to the model at once. Instead, we need to capture and input frames in real time. I am considering developing a C++ demo leveraging TensorRT; however, I am concerned that TensorRT might not support certain operators. Please forgive the delay in my response. My email address is ganzhijie@gmail.com.

Hi, is there any update on TensorRT support? Thanks.