Confused about the statement of realtime?

tteresi7 opened this issue a year ago

If I try to run this script, I have to run the video once to obtain annotations and then run it again. That is offline. Which script automatically detects and runs everything from a video?

alaamaalouf · Answer 1 · Tue Aug 22 2023 21:20:53 GMT+0800 (China Standard Time)

You have two options.

To detect automatically, you first have to run:
"python annotate_features.py --desired_height 240 --desired_width 320 --queries_dir --path_to_images "

where path_to_images is a path to a few images you can annotate. You click on a few objects and assign each one a label. The more often you click on similar objects with the same label in diverse scenarios, and the more labels (objects) you annotate, the better the results in the next step.
This process does not train any network; it only computes feature descriptors for the objects you clicked on.
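
To illustrate what "computing descriptors without training" can look like, here is a minimal sketch. It is not the repository's code: the function name `build_query_descriptors`, the nested-list feature map, and the choice to average the clicked features per label are all assumptions made for illustration.

```python
from collections import defaultdict


def build_query_descriptors(feature_map, clicks):
    """Average the feature vectors at the clicked pixels, one descriptor per label.

    feature_map[row][col] is a list of floats (a hypothetical per-pixel feature).
    clicks is a list of (row, col, label) tuples.
    """
    sums = {}
    counts = defaultdict(int)
    for row, col, label in clicks:
        vec = feature_map[row][col]
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[label][i] += value
        counts[label] += 1
    # No learning happens here: each descriptor is just a mean of clicked features.
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Toy 2x2 "image" with 2-dimensional features per pixel (made-up values).
feature_map = [
    [[1.0, 0.0], [0.8, 0.2]],
    [[0.0, 1.0], [0.1, 0.9]],
]
clicks = [(0, 0, "car"), (0, 1, "car"), (1, 0, "person")]
descriptors = build_query_descriptors(feature_map, clicks)
```

In this sketch, more clicks on the same label simply add more samples to that label's average, which is one way to read the advice about clicking in diverse scenarios.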

After that, you can run "python follow_anything.py --desired_height 240 --desired_width 320 --path_to_video --save_images_to outputs/ --detect dino --use_sam --tracker aot --queries_dir --desired_feature <desired_label> --plot_visualizations"

where --queries_dir is the directory in which the previous command stored the annotations, and --desired_feature is the object you wish to detect. You can pass --desired_feature more than once to detect multiple objects.

Note the --class_threshold flag: similarity scores below this threshold are treated as not belonging to the desired class.
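
A rough sketch of how such a threshold can behave, for readers who want to reason about tuning it. The use of cosine similarity and the names `cosine_similarity` and `classify` are assumptions for illustration; the thread does not say which similarity measure the script uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(feature, descriptors, class_threshold):
    """Return the best-matching label, or None if its score is below the threshold."""
    best_label, best_score = None, -1.0
    for label, descriptor in descriptors.items():
        score = cosine_similarity(feature, descriptor)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < class_threshold:
        return None  # treated as "not the desired class"
    return best_label


# Made-up descriptors for two labels.
example_descriptors = {"car": [1.0, 0.0], "person": [0.0, 1.0]}
confident = classify([0.9, 0.1], example_descriptors, class_threshold=0.5)
ambiguous = classify([1.0, 1.0], example_descriptors, class_threshold=0.9)
```

Under these assumptions, raising the threshold rejects more ambiguous features, and lowering it accepts more at the cost of false matches.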

To detect via a bounding box or clicks on an online video stream, you can run:
"python follow_anything.py --desired_height 240 --desired_width 320 --path_to_video --save_images_to outputs/ --detect <choose one from click/box> --use_sam --tracker aot --plot_visualizations"

Thomas · Answer 2 · Tue Aug 22 2023 21:24:40 GMT+0800 (China Standard Time)

@alaamaalouf yes I understand this, but I don't understand the claim of real time. Both of these processes are offline and/or require manual intervention. If someone has to stop and look for an object and click on it, or annotate the video before tracking it, it's lost all its online capabilities. Unless I'm missing something entirely.

alaamaalouf · Answer 3 · Tue Aug 22 2023 21:32:50 GMT+0800 (China Standard Time)

Thanks for your interest.

In the auto-detection mode, you do not annotate the video.

You annotate other images by clicking on objects in them to provide the click queries. These are not frames from the video you want to run the online following on.

They come from a directory containing a set of "m" images, taken from another video, another directory, or the internet, on which you provide a few clicks.
The video you run the tracking on can then be any other video with a similar object in it, or an online stream, e.g., --path_to_video rtsp://192.168.144.10:8554/H264Video

In the bounding box or click mode, no one stops the video. The video is received as an online stream, and you click or draw a bounding box on the stream as it arrives.
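
The "online" point can be pictured as a loop that handles each frame as it arrives and never needs the whole video up front. This is only a sketch under assumptions: `follow_stream`, `fake_detect`, and `fake_track` are hypothetical names, and the real script's control flow may differ.

```python
def follow_stream(frames, detect, track):
    """Yield one target estimate per frame, processing frames as they arrive.

    frames can be any iterator (for example, one fed by a live stream).
    detect(frame) returns a target or None; track(frame, target) updates it.
    """
    target = None
    for frame in frames:
        if target is None:
            # No target yet (or it was lost): try to find it in this frame.
            target = detect(frame)
        else:
            # Target known: update it from the new frame only.
            target = track(frame, target)
        yield target


def fake_detect(frame):
    # Stand-in detector: pretend the object first becomes visible in frame 2.
    return ("box", frame) if frame >= 2 else None


def fake_track(frame, previous):
    # Stand-in tracker: pretend the box follows the object into the new frame.
    return ("box", frame)


# Integers stand in for frames; a real stream would yield images instead.
results = list(follow_stream(iter(range(5)), fake_detect, fake_track))
```

Because the loop only ever looks at the current frame and the previous target, the same code would work on a recorded file or on a stream that never ends.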

Thomas · Answer 4 · Tue Aug 22 2023 21:58:47 GMT+0800 (China Standard Time)

@alaamaalouf thanks so much, that's much clearer.