CalayZhou / Multispectral-Pedestrian-Detection-Resource

A list of resources for multispectral pedestrian detection, including the datasets, methods, annotations and tools.


Multispectral Pedestrian Detection Resource

A list of resources for multispectral pedestrian detection, including the datasets, methods, annotations, evaluation and tools.


Datasets

  • KAIST dataset: The KAIST Multispectral Pedestrian Dataset consists of 95k color-thermal pairs (640x480, 20Hz) taken from a vehicle. All the pairs are manually annotated (person, people, cyclist) for a total of 103,128 dense annotations and 1,182 unique pedestrians. The annotation includes temporal correspondence between bounding boxes, as in the Caltech Pedestrian Dataset.

  • CVC-14 dataset: The CVC-14 dataset is composed of two sets of sequences. The sequences are named day and night, according to the time of day at which they were acquired, and Visible and FIR, according to the camera used to record them. The training set has 3695 daytime images and 3390 nighttime images, with around 1500 mandatory pedestrians annotated in each sequence. The test set has around 700 images for each sequence, with around 2000 pedestrians during the day and around 1500 pedestrians at night.

  • FLIR dataset: Synced annotated thermal imagery and non-annotated RGB imagery for reference. It should be noted that the infrared and RGB images are not aligned. The FLIR dataset has 10,228 total frames, of which 9,214 have bounding boxes (28151 Person, 46692 Car, 4457 Bicycle, 240 Dog, 2228 Other Vehicle).

    In the original FLIR dataset, the thermal and visible images are not aligned, so Heng Zhang et al. manually aligned the visible-thermal image pairs and ended up with 4128 pairs for training and 1013 pairs for validation. The aligned version of the dataset can be downloaded here: https://drive.google.com/file/d/1xHDMGl6HJZwtarNWkEV3T4O9X4ZQYz2Y/view (This aligned dataset was first mentioned in Multispectral Fusion for Object Detection with Cyclic Fuse-and-Refine Blocks, ICIP 2020, Heng Zhang et al.)

  • LLVIP dataset: This dataset contains 30976 images, or 15488 pairs, most of which were taken in very dark scenes, and all of the images are strictly aligned in time and space. Pedestrians in the dataset are labeled. The authors compare the dataset with other visible-infrared datasets and evaluate the performance of some popular visual algorithms, including image fusion, pedestrian detection and image-to-image translation, on the dataset.

  • Autonomous Vehicles dataset: A multispectral dataset generated for autonomous vehicles, consisting of RGB, NIR, MIR, and FIR images, with 7,512 images in total (3,740 taken at daytime and 3,772 taken at nighttime).
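
The dataset figures quoted above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: every number is copied from the descriptions in this list, and the variable names are illustrative rather than taken from any dataset toolkit.

```python
# Sanity-check the dataset statistics quoted in this list.
# All numbers come from the descriptions above; names are illustrative only.

# LLVIP: each pair is one visible image plus one infrared image.
llvip_images = 30976
llvip_pairs = llvip_images // 2
assert llvip_pairs == 15488  # matches the stated number of pairs

# Aligned FLIR subset (Heng Zhang et al.): training and validation pairs.
flir_aligned = {"train": 4128, "val": 1013}
flir_aligned_total = sum(flir_aligned.values())

# Original FLIR dataset: bounding boxes per class and share of annotated frames.
flir_boxes = {
    "Person": 28151,
    "Car": 46692,
    "Bicycle": 4457,
    "Dog": 240,
    "Other Vehicle": 2228,
}
flir_box_total = sum(flir_boxes.values())
flir_annotated_share = 9214 / 10228

# Autonomous Vehicles dataset: daytime plus nighttime images.
av_images = {"day": 3740, "night": 3772}
av_total = sum(av_images.values())
assert av_total == 7512  # matches the stated total

print(f"LLVIP pairs: {llvip_pairs}")
print(f"Aligned FLIR pairs: {flir_aligned_total}")
print(f"FLIR boxes: {flir_box_total} ({flir_annotated_share:.1%} of frames annotated)")
print(f"Autonomous Vehicles images: {av_total}")
```

The two assertions only confirm that the totals stated in the descriptions agree with their parts; they say nothing about the contents of the downloaded files.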


Methods

before 2018

  • Multispectral Pedestrian Detection Benchmark Dataset and Baseline, 2015, Soonmin Hwang et al. [PDF] [Code]

  • Multispectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks, 2016, Jörg Wagner et al. [PDF]

  • Multispectral Deep Neural Networks for Pedestrian Detection, 2016, Jingjing Liu et al. [PDF] [Code]

  • Multi-spectral Pedestrian Detection Based on Accumulated Object Proposal with Fully Convolutional Networks, 2016, Hangil Choi et al. [PDF]

  • Fully Convolutional Region Proposal Networks for Multispectral Person Detection, 2017, Daniel König et al. [PDF]

  • Unified Multi-spectral Pedestrian Detection Based on Probabilistic Fusion Networks, 2017, Kihong Park et al. [PDF]

2018

  • Fusion of Multispectral Data Through Illumination-aware Deep Neural Networks for Pedestrian Detection, 2018, Dayan Guan et al. [PDF] [Code]

  • Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection, BMVC 2018, Chengyang Li et al. [PDF] [Code]

  • Pedestrian detection at night by using Faster R-CNN infrared images, 2018, Michelle Galarza Bravo et al. [PDF]

  • Real-Time Multispectral Pedestrian Detection with a Single-Pass Deep Neural Network, 2018, Maarten Vandersteegen et al. [PDF]

  • Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation, BMVC 2018, Chengyang Li et al. [PDF] [Code] [Project Link]

2019

  • Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection, 2019, Yanpeng Cao et al. [PDF] [Code]

  • Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection, ICCV 2019, Lu Zhang et al. [PDF] [Code]

  • The Cross-Modality Disparity Problem in Multispectral Pedestrian Detection, 2019, Lu Zhang et al. [PDF]

  • Cross-modality interactive attention network for multispectral pedestrian detection, 2019, Lu Zhang et al. [PDF] [Code]

  • GFD-SSD Gated Fusion Double SSD for Multispectral Pedestrian Detection, 2019, Yang Zheng et al. [PDF]

  • Unsupervised Domain Adaptation for Multispectral Pedestrian Detection, 2019, Dayan Guan et al. [PDF] [Code]

  • Generalization ability of region proposal networks for multispectral person detection, 2019, Kevin Fritz et al. [PDF]

  • Borrow from Anywhere: Pseudo Multi-modal Object Detection in Thermal Imagery, 2019, Chaitanya Devaguptapu et al. [PDF]

2020

  • Multispectral Fusion for Object Detection with Cyclic Fuse-and-Refine Blocks, ICIP 2020, Heng Zhang et al. [PDF]

  • Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems, ECCV 2020, Kailai Zhou et al. [PDF] [Code]

  • Task-conditioned Domain Adaptation for Pedestrian Detection in Thermal Imagery, ECCV 2020, My Kieu et al. [PDF]

  • Anchor-free Small-scale Multispectral Pedestrian Detection, BMVC 2020, Alexander Wolpert et al. [PDF] [Code]

  • Robust pedestrian detection in thermal imagery using synthesized images, ICPR 2020, My Kieu et al. [PDF]

2021

  • Pixel Invisibility: Detecting Objects Invisible in Color Image, 2021, Yongxin Wang et al. [PDF]

  • Guided Attentive Feature Fusion for Multispectral Pedestrian Detection, WACV 2021, Heng Zhang et al. [PDF]

  • Deep Active Learning from Multispectral Data Through Cross-Modality Prediction Inconsistency, ICIP 2021, Heng Zhang et al. [PDF]

  • Spatio-Contextual Deep Network Based Multimodal Pedestrian Detection For Autonomous Driving, Kinjal Dasgupta et al. [PDF]

  • Uncertainty-Guided Cross-Modal Learning for Robust Multispectral Pedestrian Detection, IEEE Transactions on Circuits and Systems for Video Technology 2021, Jung Uk Kim et al. [PDF]

  • Cross-Modality Fusion Transformer for Multispectral Object Detection, 2021, Qingyun Fang et al. [PDF]

  • Weakly Aligned Feature Fusion for Multimodal Object Detection, 2021, Lu Zhang et al. [PDF]

  • Attention Fusion for One-Stage Multispectral Pedestrian Detection, 2021, Zhiwei Cao et al. [PDF]

  • Multi-Modal Pedestrian Detection with Large Misalignment Based on Modal-Wise Regression and Multi-Modal IoU, 2021, Napat Wanchaitanawong et al. [PDF]

  • MLPD: Multi-Label Pedestrian Detector in Multispectral Domain, 2021, Jiwon Kim et al. [PDF]

  • [survey] From handcrafted to deep features for pedestrian detection: a survey, IEEE Transactions on Pattern Analysis and Machine Intelligence 2021, Jiale Cao et al. [PDF]

2022

  • Low-Cost Multispectral Scene Analysis With Modality Distillation, WACV 2022, Heng Zhang et al. [PDF]

  • Confidence-aware Fusion using Dempster-Shafer Theory for Multispectral Pedestrian Detection, IEEE Transactions on Multimedia 2022, Qing Li et al. [PDF]

  • PIAFusion: A progressive infrared and visible image fusion network based on illumination aware, Information Fusion, Linfeng Tang et al. [PDF] [Code]

  • Improving RGB-Infrared Object Detection by Reducing Cross-Modality Redundancy, Remote Sensing, Qingwang Wang et al. [PDF]

  • Spatio-contextual deep network-based multimodal pedestrian detection for autonomous driving, IEEE Transactions on Intelligent Transportation Systems, Kinjal Dasgupta et al. [PDF]

  • Robust Thermal Infrared Pedestrian Detection By Associating Visible Pedestrian Knowledge, ICASSP 2022, Sungjune Park et al. [PDF]

  • Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning, IEEE Transactions on Circuits and Systems for Video Technology, Yiming Sun et al. [PDF] [Code]

  • Towards Versatile Pedestrian Detector with Multisensory-Matching and Multispectral Recalling Memory, AAAI 2022, Jung Uk Kim et al. [PDF]

  • Bispectral Pedestrian Detection Augmented with Saliency Maps using Transformer, VISIGRAPP 2022, Mohamed Amine Marnissi et al. [PDF]

  • Attention-Guided Multi-modal and Multi-scale Fusion for Multispectral Pedestrian Detection, PRCV 2022, Wei Bao et al. [PDF]

  • Improving RGB-Infrared Pedestrian Detection by Reducing Cross-Modality Redundancy, ICIP 2022, Qingwang Wang et al. [PDF]

  • Attention-Based Cross-Modality Feature Complementation for Multispectral Pedestrian Detection, IEEE Access, Qunyan Jiang et al. [PDF]

  • DMFFNet: Dual-Mode Multi-Scale Feature Fusion-Based Pedestrian Detection Method, IEEE Access, Ruizhe Hu et al. [PDF]

  • LGADet: Light-weight Anchor-free Multispectral Pedestrian Detection with Mixed Local and Global Attention, Neural Processing Letters, Xin Zuo et al. [PDF]

  • Locality guided cross-modal feature aggregation and pixel-level fusion for multispectral pedestrian detection, Information Fusion, Yanpeng Cao et al. [PDF]

  • BAANet: Learning Bi-directional Adaptive Attention Gates for Multispectral Pedestrian Detection, ICRA 2022, Xiaoxiao Yang et al. [PDF]

  • RGB-Thermal based Pedestrian Detection with Single-Modal Augmentation and ROI Pooling Multiscale Fusion, IGARSS 2022, Jiajun Xiang et al. [PDF]

  • MPDFF: Multi-source Pedestrian detection based on Feature Fusion, IGARSS 2022, Lingxuan Meng et al. [PDF]

  • Modality-Independent Regression and Training for Improving Multispectral Pedestrian Detection, ICIVC 2022, Han Ni et al. [PDF]

  • Learning a Dynamic Cross-Modal Network for Multispectral Pedestrian Detection, ACM MM 2022, Jin Xie et al. [PDF]

  • Multimodal Object Detection via Probabilistic Ensembling, ECCV 2022 (oral), Yi-Ting Chen et al. [PDF] [Code]

  • Translation, Scale and Rotation Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection, ECCV 2022, Yuan Maoxun et al. [PDF]

2023

  • [survey] RGB-T image analysis technology and application: A survey, Engineering Applications of Artificial Intelligence, Kechen Song et al. [PDF]
  • [survey] Visible-infrared cross-modal pedestrian detection: a summary, Qian Bie et al. [PDF]
  • HAFNet: Hierarchical Attentive Fusion Network for Multispectral Pedestrian Detection, Remote Sensing, Peiran Peng et al. [PDF]
  • Local Adaptive Illumination-Driven Input-Level Fusion for Infrared and Visible Object Detection, Remote Sensing, Jiawen Wu et al. [PDF]
  • Multiscale Cross-modal Homogeneity Enhancement and Confidence-aware Fusion for Multispectral Pedestrian Detection, IEEE Transactions on Multimedia, Ruimin Li et al. [PDF]
  • Transformer fusion and histogram layer multispectral pedestrian detection network, Signal, Image and Video Processing, Ying Zang et al. [PDF]
  • DaCFN: divide-and-conquer fusion network for RGB-T object detection, International Journal of Machine Learning and Cybernetics, Bofan Wang et al. [PDF]
  • Cross-modality complementary information fusion for multispectral pedestrian detection, Neural Computing and Applications, Chaoqi Yan et al. [PDF]
  • IGT: Illumination-guided RGB-T object detection with transformers, Knowledge-Based Systems, Keyu Chen et al. [PDF]
  • Learning to measure infrared properties of street views from visible images, Measurement, Lei Wang et al. [PDF]
  • Multispectral Pedestrian Detection via Reference Box Constrained CrossAttention and Modality Balanced Optimization, Yinghui Xing et al. [PDF]
  • Cascaded information enhancement and cross-modal attention feature fusion for multispectral pedestrian detection, Yang Yang et al. [PDF]
  • Cross-Modality Attention and Multimodal Fusion Transformer for Pedestrian Detection, ECCV 2022 Workshops, Wei-Yu Lee et al. [PDF]
  • Revisiting Modality Imbalance in Multimodal Pedestrian Detection, Arindam Das et al. [PDF]
  • Illumination-Guided RGBT Object Detection With Inter- and Intra-Modality Fusion, IEEE Transactions on Instrumentation and Measurement, Yan Zhang et al. [PDF]
  • MCANet: Multiscale Cross-Modality Attention Network for Multispectral Pedestrian Detection, MultiMedia Modeling, Xiaotian Wang et al. [PDF]
  • Multi-modal pedestrian detection with misalignment based on modal-wise regression and multi-modal IoU, Journal of Electronic Imaging, Napat Wanchaitanawong et al. [PDF]

2024

  • Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection, CVPR 2024, Taeheon Kim et al. [PDF]

Improved KAIST Annotations

  • Training Annotations: The KAIST Multispectral Pedestrian Dataset has three kinds of annotations for training. First, the original annotations were provided by Hwang et al. [1]. Second, the sanitized annotations were provided by Li et al. [2]. Lastly, the paired annotations were provided by Zhang et al. [3].

  • Test Annotations: The sanitized annotations [2] are used in this challenge for evaluation. The sanitized annotations eliminate the annotation errors, including imprecise localization, misclassification and misaligned regions. The annotations are mostly used in recent works for evaluation, and therefore we also adopt the annotations to conduct a fair comparison.

    [1] - S. Hwang, J. Park, N. Kim, Y. Choi, and I. Kweon, “Multispectral pedestrian detection: Benchmark dataset and baseline,” in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2015, pp. 1037–1045.

    [2] - C. Li, D. Song, R. Tong, and M. Tang, “Multispectral pedestrian detection via simultaneous detection and segmentation,” in Proc. Brit. Mach. Vision Conf., 2018, pp. 225.1–225.12.

    [3] - L. Zhang, X. Zhu, X. Chen, X. Yang, Z. Lei, and Z. Liu, “Weakly aligned cross-modal learning for multispectral pedestrian detection,” in Proc. IEEE Int. Conf. Comput. Vision, 2019, pp. 5126–5136.


Tools
