matrixgame2018 / TarDAL

CVPR 2022 | Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection.

TarDAL

Jinyuan Liu, Xin Fan*, Zhanbo Huang, Guanyao Wu, Risheng Liu, Wei Zhong, Zhongxuan Luo, “Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection”, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral)


Abstract


M3FD Dataset

Preview

The preview of our dataset is as follows.


[Animated preview of the dataset]


Details

  • Sensor: A synchronized system containing one binocular optical camera and one binocular infrared sensor. More details are available in the paper.

  • Main scenes:

    • Campus of Dalian University of Technology.
    • State Tourism Holiday Resort at the Golden Stone Beach in Dalian, China.
    • Main roads in Jinzhou District, Dalian, China.
  • Total number of images:

    • 8400 (for fusion, detection and fusion-based detection)
    • 600 (independent scene for fusion)
  • Total number of image pairs:

    • 4200 (for fusion, detection and fusion-based detection)
    • 300 (independent scene for fusion)
  • Image size: 1024 x 768 pixels (mostly)

  • Registration: All image pairs are registered. The visible images are calibrated using the intrinsic parameters of our synchronized system, and the infrared images are warped with a homography matrix.

  • Labeling: 34,407 labels were annotated manually, covering 6 target classes: {People, Car, Bus, Motorcycle, Lamp, Truck}. (Because of limited manpower, some targets may be mislabeled or missed. We would appreciate it if you pointed out wrong or missing labels to help us improve the dataset.)
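
To check the labels in a local copy, the per-class counts can be tallied from the annotation files. The sketch below is a minimal example, not part of the repository. It assumes the `.xml` annotations follow a Pascal VOC-style layout (`<annotation>` containing `<object>` elements with a `<name>` child), which this README does not confirm; `count_labels` and `unknown_classes` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The six target classes listed in the dataset details.
CLASSES = ("People", "Car", "Bus", "Motorcycle", "Lamp", "Truck")


def count_labels(xml_text: str) -> Counter:
    """Count labeled objects per class in one annotation document.

    Assumption: Pascal VOC-style XML, i.e. <annotation><object><name>...
    """
    root = ET.fromstring(xml_text)
    counts = Counter()
    for obj in root.iter("object"):
        name = (obj.findtext("name") or "").strip()
        counts[name] += 1
    return counts


def unknown_classes(counts: Counter) -> list:
    """Return class names that are not among the six documented classes."""
    return sorted(set(counts) - set(CLASSES))
```

Summing the counters over all annotation files gives a total that can be compared with the label count stated above, and `unknown_classes` flags names outside the six documented classes, which may help when reporting wrong or missing labels.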

Download

File structure

  M3FD
  ├── Challenge
  |   ├── Beach
  |   |   ├──Annotation
  |   |   |  ├── 01863.xml
  |   |   |  └── ...
  |   |   ├──Ir
  |   |   |  ├── 01863.png
  |   |   |  └── ...
  |   |   ├──Vis
  |   |   |  ├── 01863.png
  |   |   |  └── ...
  |   ├── Crossroads
  |   └── ...
  ├── Daytime
  |   ├── Alley
  |   └── ...
  ├── Night
  |   ├── Basement
  |   └── ...
  └── Overcast
      ├── Atrium
      └── ...
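
Given the layout above, each sample consists of three files that share a stem (for example `01863`): an infrared image in `Ir`, a visible image in `Vis`, and a label file in `Annotation`. The following sketch collects the complete triplets. It assumes every scene folder follows the `Annotation`/`Ir`/`Vis` pattern shown for `Beach`; `find_samples` and `find_all_samples` are hypothetical helpers, not functions from this repository.

```python
from pathlib import Path


def find_samples(scene_dir):
    """Yield (stem, ir_path, vis_path, annotation_path) for one scene folder.

    Only stems present in all three subfolders are yielded.
    """
    scene = Path(scene_dir)
    ir = {p.stem: p for p in (scene / "Ir").glob("*.png")}
    vis = {p.stem: p for p in (scene / "Vis").glob("*.png")}
    ann = {p.stem: p for p in (scene / "Annotation").glob("*.xml")}
    for stem in sorted(ir.keys() & vis.keys() & ann.keys()):
        yield stem, ir[stem], vis[stem], ann[stem]


def find_all_samples(root):
    """Walk <root>/<condition>/<scene>/ and collect all complete samples."""
    samples = []
    for scene in sorted(Path(root).glob("*/*")):
        if scene.is_dir():
            samples.extend(find_samples(scene))
    return samples
```

Calling `find_all_samples("M3FD")` returns one tuple per complete sample; files that lack a counterpart in one of the three folders are skipped rather than reported.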

If you have any questions or suggestions about the dataset, please email Guanyao Wu or Jinyuan Liu.

TarDAL Fusion

Baselines (sorted alphabetically)

Fuse Quick Start Examples

You can try our method online for free in Colab.

Install

We recommend using conda to manage the environment.

conda create -n tardal python=3.8
conda activate tardal
pip install -r requirements.txt

Fuse or Eval

We offer three pre-trained models.

Name      Description
TarDAL    Optimized for human vision. (Default)
TarDAL+   Optimized for object detection.
TarDAL++  Optimal solution for joint human vision and detection accuracy.

python fuse.py --src data/sample/s1 --dst runs/sample/tardal --weights weights/tardal.pt --color
python fuse.py --src data/sample/s1 --dst runs/sample/tardal+ --weights weights/tardal+.pt --color --eval
python fuse.py --src data/sample/s1 --dst runs/sample/tardal++ --weights weights/tardal++.pt --color --eval

--color colorizes the fused images using the color space of the corresponding visible images.
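
One common way to colorize a grayscale fusion result is to keep the fused image as luminance and borrow the chrominance of the visible image. The per-pixel sketch below illustrates that idea under stated assumptions: it uses the full-range BT.601 YCbCr transform, which may differ from what `fuse.py` actually does, and `rgb_to_ycbcr`, `ycbcr_to_rgb` and `colorize_pixel` are hypothetical names.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (floats)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 + (b - y) / 1.772
    cr = 128.0 + (r - y) / 1.402
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Convert full-range BT.601 YCbCr back to a clamped 8-bit RGB pixel."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return tuple(min(255, max(0, round(c))) for c in (r, g, b))


def colorize_pixel(fused_y, visible_rgb):
    """Combine fused luminance with the chrominance of the visible pixel.

    Assumption: the fused value plays the role of the Y channel.
    """
    _, cb, cr = rgb_to_ycbcr(*visible_rgb)
    return ycbcr_to_rgb(fused_y, cb, cr)
```

Applied to every pixel, this keeps the structure produced by the fusion network while the hue and saturation come from the visible image. A real implementation would vectorize the transform over whole images instead of looping over pixels.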

If you have any questions about the code, please email Zhanbo Huang.

Citation

@inproceedings{liu2022target,
  title={Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection},
  author={Liu, Jinyuan and Fan, Xin and Huang, Zhanbo and Wu, Guanyao and Liu, Risheng and Zhong, Wei and Luo, Zhongxuan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5802--5811},
  year={2022}
}
