Jee-King / CVPR2022_STNet

CVPR2022_STNet

Spiking Transformers for Event-based Single Object Tracking (CVPR 2022)

Jiqing Zhang, Bo Dong, Haiwei Zhang, Jianchuan Ding, Felix Heide, Baocai Yin, Xin Yang

[project] [paper]

The code is based on SiamFC++ and tested on Ubuntu 20.04 with PyTorch 1.8.0.

Test on FE240hz Dataset

  1. Download our preprocessed test dataset of FE240hz. (The whole FE240hz dataset can be downloaded here).

  2. Download the pretrained model and put it into ./snapshots/stnet.

  3. Change the dataset path at line 32 of videoanalyst/engine/tester/tester_impl/eventdata.py: data_root="/your_data_path/img_120_split"

  4. Run python main/test.py --config experiments/test/fe240/fe240.yaml. The predicted bounding boxes are saved in logs/EVENT-Benchmark/.

    • The predicted bounding box format: an N×4 matrix in which each row gives the object location [xmin, ymin, width, height] in one event frame.
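As a minimal sketch of working with this output, the snippet below parses N×4 rows of [xmin, ymin, width, height] and converts each row to corner form [xmin, ymin, xmax, ymax]. The helper names parse_boxes and to_corners are hypothetical and not part of the repository. The delimiter (comma or whitespace) is also an assumption, since the exact text layout of the saved files is not stated here.

```python
import re
from typing import List, Tuple

Box = Tuple[float, float, float, float]


def parse_boxes(text: str) -> List[Box]:
    """Parse rows of 'xmin ymin width height'.

    Assumption: values are separated by commas and/or whitespace,
    one box per line. Blank lines are skipped.
    """
    boxes: List[Box] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore empty lines
        fields = [f for f in re.split(r"[,\s]+", line) if f]
        if len(fields) != 4:
            raise ValueError(
                f"line {line_no}: expected 4 values, got {len(fields)}"
            )
        x, y, w, h = (float(f) for f in fields)
        boxes.append((x, y, w, h))
    return boxes


def to_corners(box: Box) -> Box:
    """Convert [xmin, ymin, width, height] to [xmin, ymin, xmax, ymax]."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


if __name__ == "__main__":
    # Hypothetical sample rows, not real tracker output.
    sample = "10,20,30,40\n12.5 21 30 40\n"
    for box in parse_boxes(sample):
        print(to_corners(box))
```

To use it on a real result file, read the file contents with open(path).read() and pass the string to parse_boxes. The file names under logs/EVENT-Benchmark/ depend on the run and are not assumed here.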

Test on VisEvent Dataset

  1. Download our preprocessed test dataset of VisEvent. (The whole VisEvent dataset can be downloaded here).

  2. Download the pretrained model and put it into ./snapshots/stnet.

  3. Change the dataset path at line 32 of videoanalyst/engine/tester/tester_impl/eventdata.py: data_root="/your_data_path/img_120_split"

  4. Change the model path at line 25 of experiments/test/fe240/fe240.yaml: pretrain_model_path: "snapshots/stnet/fe240.pkl"

  5. Run python main/test.py --config experiments/test/fe240/fe240.yaml. The predicted bounding boxes are saved in logs/EVENT-Benchmark/.

    • The predicted bounding box format: an N×4 matrix in which each row gives the object location [xmin, ymin, width, height] in one event frame.
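Before launching the test command, it can help to confirm that the paths named in the steps above actually exist. The following is a minimal sketch using only the standard library. The function missing_paths is a hypothetical helper, not part of the repository, and its default arguments simply reuse the model and config paths quoted in the steps; adjust them if your files are named differently.

```python
from pathlib import Path
from typing import List


def missing_paths(
    repo_root: str,
    data_root: str,
    model_rel: str = "snapshots/stnet/fe240.pkl",
    config_rel: str = "experiments/test/fe240/fe240.yaml",
) -> List[str]:
    """Return the required paths that do not exist.

    repo_root: directory where the repository was cloned.
    data_root: the value you set for data_root in eventdata.py.
    model_rel, config_rel: paths relative to repo_root (assumed from
    the steps above).
    """
    root = Path(repo_root)
    candidates = [root / model_rel, root / config_rel, Path(data_root)]
    return [str(p) for p in candidates if not p.exists()]


if __name__ == "__main__":
    # Placeholder locations; replace with your own.
    problems = missing_paths(".", "/your_data_path/img_120_split")
    if problems:
        print("Missing before test run:")
        for p in problems:
            print("  " + p)
    else:
        print("All required paths found.")
```

An empty list means the pretrained model, the config file, and the dataset directory were all found. It does not verify that their contents are valid.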

Citation

Please cite our paper if you find the work useful:

@inproceedings{zhang2022stnet,
  title={Spiking Transformers for Event-based Single Object Tracking},
  author={Zhang, Jiqing and Dong, Bo and Zhang, Haiwei and Ding, Jianchuan and Heide, Felix and Yin, Baocai and Yang, Xin},
  booktitle={Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition},
  year={2022}
}

About

License: MIT License