VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation
Xiaoyu Shi, Zhaoyang Huang, Weikang Bian, Dasong Li, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li
ICCV 2023
Requirements
conda create --name videoflow
conda activate videoflow
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 matplotlib tensorboard scipy opencv-python -c pytorch
pip install yacs loguru einops timm==0.4.12 imageio
Models
We provide pretrained models. The default path of the models for evaluation is:
├── VideoFlow_ckpt
    ├── MOF_sintel.pth
    ├── BOF_sintel.pth
    ├── MOF_things.pth
    ├── BOF_things.pth
    ├── MOF_kitti.pth
    ├── BOF_kitti.pth
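Before running inference or evaluation, it can help to confirm that the checkpoints are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoints` is hypothetical, and it only assumes the directory and file names listed above.

```python
from pathlib import Path

# File names as listed above; "VideoFlow_ckpt" is the documented default directory.
EXPECTED_CHECKPOINTS = [
    "MOF_sintel.pth", "BOF_sintel.pth",
    "MOF_things.pth", "BOF_things.pth",
    "MOF_kitti.pth", "BOF_kitti.pth",
]


def missing_checkpoints(ckpt_dir="VideoFlow_ckpt"):
    """Return the expected checkpoint file names that are absent from ckpt_dir.

    Hypothetical helper for illustration; it is not provided by VideoFlow.
    """
    root = Path(ckpt_dir)
    return [name for name in EXPECTED_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All pretrained checkpoints found.")
```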
Inference & Visualization
Download VideoFlow_ckpt and put it in the root dir. Run the following command:
python -u inference.py --mode MOF --seq_dir demo_input_images --vis_dir demo_flow_vis
If your input contains only three frames, we recommend using the BOF model:
python -u inference.py --mode BOF --seq_dir demo_input_images_three_frames --vis_dir demo_flow_vis_three_frames
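The choice between the two modes can be automated by counting the frames in the input directory. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the set of image suffixes is a guess at what the inference script accepts, and rejecting fewer than three frames is an assumption rather than documented behavior. Only the `inference.py` flags are taken from the commands above.

```python
from pathlib import Path

# Assumption: frames are stored as individual image files with these suffixes.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_frames(seq_dir):
    """Count image files directly inside seq_dir."""
    return sum(1 for p in Path(seq_dir).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def build_inference_command(seq_dir, vis_dir):
    """Build the inference command, using BOF for exactly three frames and MOF otherwise.

    Hypothetical helper; the minimum of three frames is an assumption.
    """
    n_frames = count_frames(seq_dir)
    if n_frames < 3:
        raise ValueError(f"expected at least 3 frames in {seq_dir}, found {n_frames}")
    mode = "BOF" if n_frames == 3 else "MOF"
    return [
        "python", "-u", "inference.py",
        "--mode", mode,
        "--seq_dir", str(seq_dir),
        "--vis_dir", str(vis_dir),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.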
Data Preparation
To evaluate/train VideoFlow, you will need to download the required datasets.
- FlyingChairs
- FlyingThings3D
- Sintel
- KITTI (multi-view extension, 20 frames per scene, 14 GB)
- HD1K
By default, datasets.py will search for the datasets in these locations. You can create symbolic links in the datasets folder that point to wherever the datasets were downloaded.
├── datasets
    ├── Sintel
        ├── test
        ├── training
    ├── KITTI
        ├── testing
        ├── training
        ├── devkit
    ├── FlyingChairs_release
        ├── data
    ├── FlyingThings3D
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── optical_flow
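The symbolic links mentioned above can be created by hand with `ln -s`, or with a short script such as the sketch below. The helper name `link_datasets` and the example download paths are hypothetical; only the top-level folder names come from the layout above.

```python
import os
from pathlib import Path

# Top-level folder names taken from the directory layout above.
EXPECTED_DATASET_DIRS = ["Sintel", "KITTI", "FlyingChairs_release", "FlyingThings3D"]


def link_datasets(download_locations, datasets_root="datasets"):
    """Create datasets_root/<name> symlinks pointing at the downloaded copies.

    download_locations maps an expected folder name to the path where that
    dataset actually lives. Existing entries are left untouched.
    Returns the names for which a new link was created.
    """
    root = Path(datasets_root)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for name, target in download_locations.items():
        if name not in EXPECTED_DATASET_DIRS:
            raise ValueError(f"unexpected dataset folder name: {name}")
        link = root / name
        if link.exists() or link.is_symlink():
            continue  # do not overwrite an existing folder or link
        os.symlink(Path(target).resolve(), link, target_is_directory=True)
        created.append(name)
    return created


if __name__ == "__main__":
    # Example paths are placeholders; replace them with your own download locations.
    print(link_datasets({}))
```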
Training
The training script loads the config that matches the training stage and saves the trained model in a directory under logs and checkpoints. For example, the following scripts load the config configs/***.py and save the trained model as logs/xxxx/final.
# Train MOF model
python -u train_MOFNet.py --name MOF-things --stage things --validation sintel
python -u train_MOFNet.py --name MOF-sintel --stage sintel --validation sintel
python -u train_MOFNet.py --name MOF-kitti --stage kitti --validation sintel
# Train BOF model
python -u train_BOFNet.py --name BOF-things --stage things --validation sintel
python -u train_BOFNet.py --name BOF-sintel --stage sintel --validation sintel
python -u train_BOFNet.py --name BOF-kitti --stage kitti --validation sintel
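The training commands above follow one pattern per model, so they can be generated rather than typed. The sketch below only assembles and prints the commands; the helper name is hypothetical, and whether a later stage picks up the checkpoint of an earlier one depends on the config files, which this sketch does not touch.

```python
import shlex

# Stage names and script names are taken from the commands above.
STAGES = ("things", "sintel", "kitti")
TRAIN_SCRIPTS = {"MOF": "train_MOFNet.py", "BOF": "train_BOFNet.py"}


def training_commands(model, validation="sintel"):
    """Return one argument list per training stage for the given model ("MOF" or "BOF")."""
    script = TRAIN_SCRIPTS[model]  # raises KeyError for an unknown model name
    return [
        ["python", "-u", script,
         "--name", f"{model}-{stage}",
         "--stage", stage,
         "--validation", validation]
        for stage in STAGES
    ]


if __name__ == "__main__":
    for cmd in training_commands("MOF"):
        print(shlex.join(cmd))
```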
Evaluation
The script will load the config configs/multiframes_sintel_submission.py or configs/sintel_submission.py. Please change _CN.model in the config file to load the corresponding checkpoint.
# Evaluate MOF_things.pth after C stage
python -u evaluate_MOFNet.py --dataset=sintel
python -u evaluate_MOFNet.py --dataset=things
python -u evaluate_MOFNet.py --dataset=kitti
# To evaluate MOF_sintel.pth, create a submission to the Sintel benchmark after C+S
python -u evaluate_MOFNet.py --dataset=sintel_submission_stride1
# To evaluate MOF_kitti.pth, create a submission to the KITTI benchmark after C+S+K
python -u evaluate_MOFNet.py --dataset=kitti_submission
Similarly, to evaluate BOF models:
# Evaluate BOF_things.pth after C stage
python -u evaluate_BOFNet.py --dataset=sintel
python -u evaluate_BOFNet.py --dataset=things
python -u evaluate_BOFNet.py --dataset=kitti
# To evaluate BOF_sintel.pth, create a submission to the Sintel benchmark after C+S
python -u evaluate_BOFNet.py --dataset=sintel_submission
# To evaluate BOF_kitti.pth, create a submission to the KITTI benchmark after C+S+K
python -u evaluate_BOFNet.py --dataset=kitti_submission
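Since _CN.model has to be edited to match the evaluation being run, the pairing between `--dataset` values and checkpoints in the comments above can be written down as a small lookup. This is a sketch with a hypothetical helper name; it assumes that _CN.model holds a checkpoint path and that the checkpoints sit in the default VideoFlow_ckpt directory.

```python
def checkpoint_for(mode, dataset, ckpt_dir="VideoFlow_ckpt"):
    """Return the checkpoint path that the comments above pair with a --dataset value.

    mode is "MOF" or "BOF"; dataset is a value passed to --dataset.
    Hypothetical helper; the result is the value one would assign to _CN.model,
    assuming that field stores a checkpoint path.
    """
    if mode not in ("MOF", "BOF"):
        raise ValueError(f"unknown mode: {mode}")
    if dataset in ("sintel", "things", "kitti"):
        stage = "things"  # validation runs use the *_things.pth checkpoint
    elif dataset.startswith("sintel_submission"):
        stage = "sintel"  # covers sintel_submission and sintel_submission_stride1
    elif dataset == "kitti_submission":
        stage = "kitti"
    else:
        raise ValueError(f"unknown dataset: {dataset}")
    return f"{ckpt_dir}/{mode}_{stage}.pth"


if __name__ == "__main__":
    print(checkpoint_for("MOF", "sintel_submission_stride1"))
```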
License
VideoFlow is released under the Apache License.
Citation
@article{shi2023videoflow,
title={Videoflow: Exploiting temporal cues for multi-frame optical flow estimation},
author={Shi, Xiaoyu and Huang, Zhaoyang and Bian, Weikang and Li, Dasong and Zhang, Manyuan and Cheung, Ka Chun and See, Simon and Qin, Hongwei and Dai, Jifeng and Li, Hongsheng},
journal={arXiv preprint arXiv:2303.08340},
year={2023}
}
Acknowledgement
In this project, we use parts of the code from: