This project uses the RAFT (Recurrent All-Pairs Field Transforms for Optical Flow) optical flow algorithm and the YOLOv4 object detection algorithm to visualize each object's motion, then uses DPT (Dense Prediction Transformers) to predict each object's depth.
The optical flow color coding:
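As a rough illustration of how such a color coding works, the sketch below maps a single flow vector to a color: the hue encodes the direction of motion and the saturation encodes the speed, so a static pixel comes out white. This is a simplified HSV scheme, not necessarily the exact color wheel used for the figures here; the function name `flow_to_rgb` and the `max_magnitude` parameter are assumptions made for the example.

```python
import colorsys
import math


def flow_to_rgb(u, v, max_magnitude=1.0):
    """Map one flow vector (u, v) to an RGB triple with channels in 0..255.

    max_magnitude (hypothetical parameter, must be > 0) is the speed that
    maps to full saturation; faster vectors are clipped to it.
    """
    # Hue encodes direction: wrap the vector angle into [0, 2*pi), then scale to [0, 1].
    angle = math.atan2(v, u)
    hue = (angle % (2 * math.pi)) / (2 * math.pi)

    # Saturation encodes speed, clipped at max_magnitude.
    magnitude = math.hypot(u, v)
    saturation = min(magnitude / max_magnitude, 1.0)

    # Value is fixed at 1.0, so zero motion renders as white.
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return tuple(round(c * 255) for c in (r, g, b))


if __name__ == "__main__":
    print(flow_to_rgb(0.0, 0.0))  # no motion
    print(flow_to_rgb(1.0, 0.0))  # motion along +x at full saturation
```

Applying the same mapping to every pixel of a flow field yields a color image in which regions moving in the same direction share a hue.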
Here is the RAFT result on video1:
Here is the result with each object's depth and motion on video1 (green arrows show the motion of the objects):
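One plausible way to derive a per-object arrow and depth value is to aggregate the dense flow and depth maps inside each detection box. The sketch below is an assumption about how this could be done, not a description of this project's actual code: it takes the mean flow in the box as the arrow and the median depth in the box as the object's depth. The function name `object_motion_and_depth` and the box format `(x1, y1, x2, y2)` are hypothetical.

```python
import numpy as np


def object_motion_and_depth(flow, depth, box):
    """Summarize one detection from dense flow and depth maps.

    flow:  (H, W, 2) array of per-pixel displacements (dx, dy)
    depth: (H, W) array of per-pixel depth predictions
    box:   (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive (assumed format)

    Returns (center, tip, median_depth), or None if the box lies outside the image.
    """
    h, w = depth.shape
    x1, y1, x2, y2 = box

    # Clamp the box to the image bounds.
    x1, x2 = max(0, x1), min(w, x2)
    y1, y2 = max(0, y1), min(h, y2)
    if x2 <= x1 or y2 <= y1:
        return None

    # Mean displacement inside the box gives the arrow direction and length.
    mean_flow = flow[y1:y2, x1:x2].reshape(-1, 2).mean(axis=0)

    # The median is less sensitive than the mean to background pixels in the box.
    median_depth = float(np.median(depth[y1:y2, x1:x2]))

    # The arrow starts at the box center and ends at center + mean flow.
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    tip = (center[0] + float(mean_flow[0]), center[1] + float(mean_flow[1]))
    return center, tip, median_depth


if __name__ == "__main__":
    flow = np.zeros((4, 6, 2))
    flow[..., 0] = 2.0   # every pixel moves 2 px to the right
    flow[..., 1] = -1.0  # and 1 px up
    depth = np.arange(24, dtype=float).reshape(4, 6)
    print(object_motion_and_depth(flow, depth, (1, 1, 3, 3)))
```

The returned `center` and `tip` can then be passed to a drawing routine to render the arrow, and `median_depth` can be printed next to the box. The flow vectors are usually short at the pixel scale, so scaling them before drawing may make the arrows easier to see.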
Here is the RAFT result on video2:
Here is the result with each object's depth and motion on video2 (green arrows show the motion of the objects):
The DPT architecture:
Reference papers: