Awesome RGB-T Feature Fusion

A collection of RGB-T feature fusion methods (mainly deep learning methods), code, and datasets.
The main directions covered are multispectral pedestrian detection, RGB-T vehicle detection, RGB-T crowd counting, RGB-T salient object detection, and RGB-T fusion tracking.
Feel free to star and fork! We will continue to update this repository!

Multispectral Pedestrian
RGB-T Vehicle Detection
RGB-T Crowd Counting
RGB-T Salient Object Detection
RGB-T Fusion Tracking

Multispectral-Pedestrian

Datasets and Annotations

KAIST dataset, CVC-14 dataset, FLIR dataset, LLVIP dataset, M³FD dataset

Improved KAIST Testing Annotations provided by Liu et al. Link to download
Sanitized KAIST Training Annotations provided by Li et al. Link to download
Improved KAIST Training Annotations provided by Zhang et al. Link to download

Tools

Evaluation code. Link to download
Annotation: vbb format -> xml format. Link to download

Papers

Fusion Architecture

DetFusion: A Detection-driven Infrared and Visible Image Fusion Network, ACM Multimedia 2022, Yiming Sun et al. [PDF]
Multimodal Object Detection via Probabilistic Ensembling, ECCV 2022, Yi-Ting Chen et al. [PDF]
Learning a Dynamic Cross-Modal Network for Multispectral Pedestrian Detection, ACM Multimedia 2022, Jin Xie et al. [PDF]
Confidence-aware Fusion using Dempster-Shafer Theory for Multispectral Pedestrian Detection, TMM 2022, Qing Li et al. [PDF]
Attention-Guided Multi-modal and Multi-scale Fusion for Multispectral Pedestrian Detection, PRCV 2022, Wei Bao et al. [PDF]
Improving RGB-Infrared Pedestrian Detection by Reducing Cross-Modality Redundancy, ICIP 2022, Qingwang Wang et al. [PDF]
Spatio-contextual deep network-based multimodal pedestrian detection for autonomous driving, IEEE Transactions on Intelligent Transportation Systems, Kinjal Dasgupta et al. [PDF]
Adopting the YOLOv4 Architecture for Low-Latency Multispectral Pedestrian Detection in Autonomous Driving, Sensors 2022, Kamil Roszyk et al. [PDF]
Deep Active Learning from Multispectral Data Through Cross-Modality Prediction Inconsistency, ICIP 2021, Heng Zhang et al. [PDF]
Attention Fusion for One-Stage Multispectral Pedestrian Detection, Sensors 2021, Zhiwei Cao et al. [PDF]
Uncertainty-Guided Cross-Modal Learning for Robust Multispectral Pedestrian Detection, IEEE Transactions on Circuits and Systems for Video Technology 2021, Jung Uk Kim et al. [PDF]
Deep Cross-modal Representation Learning and Distillation for Illumination-invariant Pedestrian Detection, IEEE Transactions on Circuits and Systems for Video Technology 2021, T. Liu et al. [PDF]
Guided Attentive Feature Fusion for Multispectral Pedestrian Detection, WACV 2021, Heng Zhang et al. [PDF]
Anchor-free Small-scale Multispectral Pedestrian Detection, BMVC 2020, Alexander Wolpert et al. [PDF][Code]
Multispectral Fusion for Object Detection with Cyclic Fuse-and-Refine Blocks, ICIP 2020, Heng Zhang et al. [PDF]
Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems, ECCV 2020, Kailai Zhou et al. [PDF][Code]
Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection, ICCV 2019, Lu Zhang et al. [PDF][Code]
Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection, ISPRS Journal of Photogrammetry and Remote Sensing 2019, Yanpeng Cao et al. [PDF][Code]
Cross-modality interactive attention network for multispectral pedestrian detection, Information Fusion 2019, Lu Zhang et al. [PDF][Code]
Pedestrian detection with unsupervised multispectral feature learning using deep neural networks, Information Fusion 2019, Yanpeng Cao et al. [PDF]
Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation, BMVC 2018, Chengyang Li et al. [PDF][Code][Project Link]
Unified Multi-spectral Pedestrian Detection Based on Probabilistic Fusion Networks, Pattern Recognition 2018, Kihong Park et al. [PDF]
Multispectral Deep Neural Networks for Pedestrian Detection, BMVC 2016, Jingjing Liu et al. [PDF][Code]
Multispectral Pedestrian Detection Benchmark Dataset and Baseline, 2015, Soonmin Hwang et al. [PDF][Code]

Illumination Aware

Task-conditioned Domain Adaptation for Pedestrian Detection in Thermal Imagery, ECCV 2020, My Kieu et al. [PDF][Code]
Fusion of Multispectral Data Through Illumination-aware Deep Neural Networks for Pedestrian Detection, Information Fusion 2019, Dayan Guan et al. [PDF][Code]
Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection, Pattern Recognition 2018, Chengyang Li et al. [PDF][Code]

Feature Alignment

Towards Versatile Pedestrian Detector with Multisensory-Matching and Multispectral Recalling Memory, AAAI 2022, Jung Uk Kim et al. [PDF]
MLPD: Multi-Label Pedestrian Detector in Multispectral Domain, IEEE Robotics and Automation Letters 2021, Jiwon Kim et al. [PDF]
Weakly Aligned Feature Fusion for Multimodal Object Detection, IEEE TNNLS 2021, Lu Zhang et al. [PDF]
Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems, ECCV 2020, Kailai Zhou et al. [PDF][Code]
Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection, ICCV 2019, Lu Zhang et al. [PDF] [Code]
Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation, BMVC 2018, Chengyang Li et al. [PDF] [Code]

Single Modality

Towards Versatile Pedestrian Detector with Multisensory-Matching and Multispectral Recalling Memory, AAAI 2022, Kim et al. [PDF]
Robust Thermal Infrared Pedestrian Detection By Associating Visible Pedestrian Knowledge, ICASSP 2022, Sungjune Park et al. [PDF]
Low-cost Multispectral Scene Analysis with Modality Distillation, Heng Zhang et al. [PDF]
Task-conditioned Domain Adaptation for Pedestrian Detection in Thermal Imagery, ECCV 2020, My Kieu et al. [PDF][Code]
Deep Cross-modal Representation Learning and Distillation for Illumination-invariant Pedestrian Detection, IEEE Transactions on Circuits and Systems for Video Technology 2021, T. Liu et al. [PDF]

Unsupervised Domain Adaptation

Unsupervised Domain Adaptation for Multispectral Pedestrian Detection, CVPR 2019 Workshop, Dayan Guan et al. [PDF] [Code]
Pedestrian detection with unsupervised multispectral feature learning using deep neural networks, Information Fusion 2019, Y. Cao et al. [PDF] [Code]
Learning crossmodal deep representations for robust pedestrian detection, CVPR 2017, D. Xu et al. [PDF][Code]

RGB-T Vehicle Detection

Datasets

DroneVehicle [link], Multispectral Datasets for Detection and Segmentation [link]

Papers

GF-Detection: Fusion with GAN of Infrared and Visible Images for Vehicle Detection at Nighttime, Remote Sensing 2022, Peng Gao et al. [PDF]
Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery, Pattern Recognition, Qingyun Fang et al. [PDF]
Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection, ECCV 2022, Maoxun Yuan et al. [PDF]
Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning, TCSVT 2022, Yiming Sun et al. [PDF]
Improving RGB-Infrared Object Detection by Reducing Cross-Modality Redundancy, Remote Sensing 2022, Qingwang Wang et al. [PDF]

RGB-T Crowd Counting

Datasets

RGBT-CC [link], DroneCrowd [link]

Papers

Domain Adaptation

RGB-T Crowd Counting from Drone: A Benchmark and MMCCN Network, ACCV 2020, Tao Peng et al. [PDF][Code]

Fusion Architecture

MAFNet: A Multi-Attention Fusion Network for RGB-T Crowd Counting, arXiv 2022, Pengyu Chen et al. [PDF]
Multimodal Crowd Counting with Mutual Attention Transformers, ICME 2022, Zhengtao Wu et al. [PDF]
Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting, CVPR 2021, Lingbo Liu et al. [PDF][Code]

RGB-T Salient Object Detection

Datasets

VT821 Dataset [PDF][link], VT1000 Dataset [PDF][link], VT5000 Dataset [PDF][link[9yqv]]

Papers

Domain Adaptation

Multi-Spectral Salient Object Detection by Adversarial Domain Adaptation, AAAI 2020, Shaoyue Song et al. [PDF]
Deep Domain Adaptation Based Multi-spectral Salient Object Detection, TMM 2020, Shaoyue Song et al. [PDF]

Fusion Architecture

Multi-Interactive Dual-Decoder for RGB-Thermal Salient Object Detection, TIP 2021, Zhengzheng Tu et al. [PDF]

RGB-T Fusion Tracking

Papers

Visual Prompt Multi-Modal Tracking, CVPR 2023, Jiawen Zhu et al. [PDF][Code]
Prompting for Multi-Modal Tracking, ACM Multimedia 2022, Jinyu Yang et al. [PDF]
