Deep Learning for Visual Intelligence: Trends and Challenges

Course information

Instructor: WANG Lin (linwang@ust.hk)
TAs: SU Ying (ysuay@connect.ust.hk) and ZHU Qingyan (qzhuai@connect.ust.hk)
Class time: Tuesday & Thursday 16:30-17:50
Office hours: By appointment only.

Course description

This is a task-oriented, interaction-based course that examines recent trends and challenges of deep learning in visual intelligence tasks (learning methods and high- and low-level vision problems). The course follows a flipped-classroom format: the lecturer teaches the basics, while students engage in active discussions, presentations (lecturing), and hands-on research projects under the lecturer's guidance throughout the semester. By the end of the course, students should be able to critically challenge existing methodologies and techniques and, ideally, make breakthroughs in new research directions.

Grading policy

Paper summary (10%)
Paper presentation and discussion (30%)
Group project and paper submission (50%)
Attendance and Participation (10%)

Tentative schedule

Dates	Topics	Active Learning
2/8	Course introduction
2/10	Course introduction	Overview of computer vision
2/15	Deep learning basics	TAs’ lectures on CNN basics, algorithm basics, and PyTorch tutorial
2/17	Deep learning basics	TAs’ lectures on CNN basics, algorithm basics, and PyTorch tutorial
2/22	DNN models in computer vision (VAE, GAN, RNN)
2/24	DNN models in computer vision (VAE, GAN, RNN)	(1) Presentation (2) Review due 2/27 (3) Project meetings
3/1	Learning methods in computer vision (Transfer learning, domain adaptation, self/semi-supervised learning)
3/3	Learning methods in computer vision (Transfer learning, domain adaptation, self/semi-supervised learning)	(1) Presentation (2) Review due 3/6
3/8	Deep learning for image restoration and enhancement (I) deblurring, deraining, dehazing
3/10	Deep learning for image restoration and enhancement (I) deblurring, deraining, dehazing	(1) Presentation (2) Review due 3/13 (3) Project proposal kick-off (one page)
3/15	Deep learning for image restoration and enhancement (II) Super-resolution, HDR imaging
3/17	Deep learning for image restoration and enhancement (II) Super-resolution, HDR imaging	(1) Presentation (2) Review due 3/20
3/22	Deep learning for scene understanding (I) Object detection & tracking
3/24	Deep learning for scene understanding (I) Object detection & tracking	Project mid-term presentation
3/29	Deep learning for scene understanding (II) Semantic segmentation
3/31	Deep learning for scene understanding (II) Semantic segmentation	(1) Presentation (2) Review due 4/3
4/5	Computer vision with novel cameras (I) Event camera-based vision
4/7	Computer vision with novel cameras (I) Event camera-based vision	(1) Presentation (2) Review due 4/10
4/12	Computer vision with novel cameras (II) Thermal/360 camera-based vision
4/14	Computer vision with novel cameras (II) Thermal/360 camera-based vision	(1) Presentation (2) Review due 4/17 (3) Project meetings
4/19	Depth and Motion Estimation in Vision
4/21	Depth and Motion Estimation in Vision	(1) Presentation (2) Review due 4/24
4/26	Adversarial robustness in computer vision (Adversarial attack and defense)
4/28	Adversarial robustness in computer vision (Adversarial attack and defense)	(1) Presentation (2) Review due 5/1 (3) Project meetings
5/3	Potential and Challenges in computer vision (data, computation, learning, sensor) (self-driving and robotics)
5/5	Potential and Challenges in computer vision (data, computation, learning, sensor) (self-driving and robotics)	(1) TA/Student lectures (2) final project Q/A
5/10	Project presentation and final paper submission
5/12	Project presentation and final paper submission	Submission due 5/26

Reading list

DNN models in computer vision (VAEs, GANs, RNNs)

VAEs

[Kingma and Welling 14] Auto-Encoding Variational Bayes, ICLR 2014.
[Kingma et al. 15] Variational Dropout and the Local Reparameterization Trick, NIPS 2015.
[Blundell et al. 15] Weight Uncertainty in Neural Networks, ICML 2015.
[Gal and Ghahramani 16] Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016.

GANs

[Goodfellow et al. 14] Generative Adversarial Nets, NIPS 2014.
[Radford et al. 15] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016.
[Chen et al. 16] InfoGAN: Interpreting Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016.
[Arjovsky et al. 17] Wasserstein Generative Adversarial Networks, ICML 2017.
[Zhu et al. 17] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017.
[Liu et al. 17] UNIT: Unsupervised Image-to-Image Translation Networks, NeurIPS 2017.
[Choi et al. 18] StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation, CVPR 2018.
[Isola et al. 17] Image-to-Image Translation with Conditional Adversarial Networks, CVPR, 2017.
[Huang et al. 17] Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization, ICCV, 2017.
[Huang et al. 18] Multimodal Unsupervised Image-to-Image Translation, ECCV, 2018.

[Brock et al. 19] Large Scale GAN Training for High-Fidelity Natural Image Synthesis, ICLR 2019.
[Karras et al. 19] A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR 2019.
[Karras et al. 20] Analyzing and Improving the Image Quality of StyleGAN, CVPR 2020.
[Park et al. 20] Contrastive Learning for Unpaired Image-to-Image Translation, ECCV 2020.
[Karras et al. 20] Training Generative Adversarial Networks with Limited Data, NeurIPS 2020.
[Xie et al. 20] Self-Supervised CycleGAN for Object-Preserving Image-to-Image Domain Adaptation, ECCV 2020.
[Mustafa et al. 20] Transformation Consistency Regularization – A Semi-Supervised Paradigm for Image-to-Image Translation, ECCV 2020.
[Li et al. 20] Semantic Relation Preserving Knowledge Distillation for Image-to-Image Translation, ECCV, 2020.
[Xu et al. 21] Linear Semantics in Generative Adversarial Networks, CVPR, 2021.
[Cao et al. 21] ReMix: Towards Image-to-Image Translation with Limited Data, CVPR 2021.
[Liu et al. 21] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network, CVPR 2021.
[Pizzati et al. 21] CoMoGAN: continuous model-guided image-to-image translation, CVPR 2021.
[Jin et al. 21] Teachers Do More Than Teach: Compressing Image-to-Image Models, CVPR 2021.
[Baek et al. 21] Rethinking the Truly Unsupervised Image-to-Image Translation, ICCV, 2021.
[Wang et al. 21] TransferI2I: Transfer Learning for Image-to-Image Translation from Small Datasets, ICCV, 2021.
[Yang et al. 21] Global and Local Alignment Networks for Unpaired Image-to-Image Translation, Arxiv 2021.
[Jiang et al. 21] Focal Frequency Loss for Image Reconstruction and Synthesis, ICCV, 2021.

Learning methods in computer vision

Knowledge transfer

[Wang et al. 21] Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks, TPAMI, 2021.
[Hinton et al. 15] Distilling the Knowledge in a Neural Network, NIPS Workshop, 2015.
[Romero et al. 15] FitNets: Hints for Thin Deep Nets, ICLR, 2015.
[Gupta et al. 16] Cross Modal Distillation for Supervision Transfer, CVPR, 2016.
[Zagoruyko et al. 16] Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, ICLR, 2017.
[Furlanello et al. 18] Born Again Neural Networks, ICML, 2018.
[Zhang et al. 18] Deep Mutual Learning, CVPR, 2018.
[Tarvainen et al. 17] Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, NIPS, 2017.
[Zhang et al. 19] Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation, ICCV, 2019.
[Heo et al. 19] A Comprehensive Overhaul of Feature Distillation, ICCV, 2019.
[Tung et al. 19] Similarity-Preserving Knowledge Distillation, ICCV, 2019.

[Chen et al. 19] DAFL: Data-Free Learning of Student Networks, ICCV, 2019.
[Ahn et al. 19] Variational Information Distillation for Knowledge Transfer, CVPR, 2019.
[Tian et al. 20] Contrastive Representation Distillation, ICLR, 2020.
[Fang et al. 20] Data-Free Adversarial Distillation, CVPR, 2020.
[Yang et al. 20] MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution, ECCV, 2020.
[Yao et al. 20] Knowledge Transfer via Dense Cross-layer Mutual-distillation, ECCV, 2020.
[Guo et al. 20] Reducing the Teacher-Student Gap via Spherical Knowledge Distillation, Arxiv, 2020.
[Ji et al. 21] Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation, CVPR, 2021.
[Liu et al. 21] Source-Free Domain Adaptation for Semantic Segmentation, CVPR, 2021.
[Chen et al. 21] Learning Student Networks in the Wild, CVPR, 2021.
[Xue et al. 21] Multimodal Knowledge Expansion, ICCV, 2021.
[Zhu et al. 21] Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher, ICCV, 2021.
[Kim et al. 21] Self-Knowledge Distillation with Progressive Refinement of Targets, ICCV, 2021.
[Son et al. 21] Densely Guided Knowledge Distillation using Multiple Teacher Assistants, ICCV, 2021.

Domain Adaptation

[Long et al. 15] Learning Transferable Features with Deep Adaptation Networks, ICML, 2015.
[Tzeng et al. 17] Adversarial Discriminative Domain Adaptation, CVPR, 2017.
[Huang et al. 18] Domain Transfer Through Deep Activation Matching, ECCV, 2018.
[Bermúdez-Chacón et al. 20] Domain Adaptive Multibranch Networks, ICLR, 2020.
[Carlucci et al. 17] AutoDIAL: Automatic DomaIn Alignment Layers, ICCV, 2017.
[Chang et al. 19] Domain-Specific Batch Normalization for Unsupervised Domain Adaptation, CVPR, 2019.
[Cui et al. 20] Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations, CVPR 2020.
[Roy et al. 19] Unsupervised Domain Adaptation using Feature-Whitening and Consensus Loss, CVPR, 2019.
[Csurka et al. 17] Discrepancy-based networks for unsupervised domain adaptation: a comparative study, CVPRW, 2017.
[Murez et al. 18] Image to Image Translation for Domain Adaptation, CVPR, 2018.
[Liu et al. 16] Coupled Generative Adversarial Networks, NIPS, 2016.
[Hoffman et al. 18] CyCADA: Cycle-Consistent Adversarial Domain Adaptation, ICLR, 2018.
[Lee et al. 18] Diverse Image-to-Image Translation via Disentangled Representations, ECCV, 2018.
[Chen et al. 12] Marginalized Denoising Autoencoders for Domain Adaptation, ICML, 2012.
[Zhuang et al. 15] Supervised Representation Learning: Transfer Learning with Deep Autoencoders, IJCAI, 2015.
[Ghifary et al. 16] Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation, ECCV, 2016.
[Bousmalis et al. 16] Domain Separation Networks, NIPS, 2016.
[French et al. 18] Self-ensembling for Visual Domain Adaptation, ICLR, 2018.
[Shu et al. 18] A DIRT-T Approach to Unsupervised Domain Adaptation, ICLR, 2018.
[Deng et al. 19] Cluster Alignment with a Teacher for Unsupervised Domain Adaptation, ICCV, 2019.
[Chen et al. 19] Progressive Feature Alignment for Unsupervised Domain Adaptation, CVPR 2019.
[Zhang et al. 18] Progressive Feature Alignment for Unsupervised Domain Adaptation, CVPR 2018.
[Kang et al. 19] Contrastive Adaptation Network for Unsupervised Domain Adaptation, CVPR 2019.

[Guizilini et al. 21] Geometric Unsupervised Domain Adaptation for Semantic Segmentation, ICCV, 2021.
[Wang et al. 20] Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation, ECCV, 2020.
[Peng et al. 20] Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation, ECCV, 2020.
[Liu et al. 21] Source-Free Domain Adaptation for Semantic Segmentation, CVPR, 2021.
[Na et al. 21] FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation, CVPR, 2021.
[Sharma et al. 21] Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation, CVPR, 2021.
[Ahmed et al. 21] Unsupervised Multi-source Domain Adaptation Without Access to Source Data, CVPR, 2021.
[He et al. 21] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation, CVPR, 2021.
[Wu et al. 21] DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation, CVPR, 2021.
[Lengyel et al. 21] Zero-Shot Day-Night Domain Adaptation with a Physics Prior, ICCV, 2021.
[Li et al. 21] Semantic Concentration for Domain Adaptation, ICCV, 2021.
[Awais et al. 21] Adversarial Robustness for Unsupervised Domain Adaptation, ICCV, 2021.

Semi-supervised learning

[Sajjadi et al. 16] Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning, NIPS, 2016.
[Laine et al. 17] Temporal Ensembling for Semi-Supervised Learning, ICLR, 2017.
[Tarvainen et al. 17] Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, NIPS, 2017.
[Miyato et al. 18] Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning, TPAMI, 2018.
[Verma et al. 19] Interpolation Consistency Training for Semi-Supervised Learning, IJCAI, 2019.
[Lee et al. 13] Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks, ICML, 2013.
[Iscen et al. 19] Label Propagation for Deep Semi-supervised Learning, CVPR, 2019.
[Xie et al. 20] Self-training with Noisy Student improves ImageNet classification, CVPR, 2020.
[Berthelot et al. 19] MixMatch: A Holistic Approach to Semi-Supervised Learning, NIPS, 2019.
[Berthelot et al. 20] ReMixMatch: Semi-supervised learning with distribution alignment and augmentation anchoring, ICLR, 2020.
[Li et al. 20] DivideMix: Learning with Noisy Labels as Semi-supervised Learning, ICLR, 2020.
[Sohn et al. 20] FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence, NIPS, 2020.
[Ouali et al. 20] An Overview of Deep Semi-Supervised Learning, 2020.

[Ke et al. 19] Dual Student: Breaking the Limits of the Teacher in Semi-supervised Learning, ICCV, 2019.
[Luo et al. 20] Semi-supervised Semantic Segmentation via Strong-weak Dual-branch Network, ECCV, 2020.
[Gao et al. 20] Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling Cost, ECCV, 2020.
[Liu et al. 20] Generative View-Correlation Adaptation for Semi-Supervised Multi-View Learning, ECCV, 2020.
[Kuo et al. 20] FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning, ECCV, 2020.
[Mustafa et al. 20] Transformation Consistency Regularization – A Semi-Supervised Paradigm for Image-to-Image Translation, ECCV, 2020.
[Chen et al. 21] Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision, CVPR, 2021.
[Lai et al. 21] Adaptive Consistency Regularization for Semi-Supervised Transfer Learning, CVPR, 2021.
[Hu et al. 21] SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification, CVPR, 2021.
[Zhou et al. 21] Pixel Contrastive-Consistent Semi-Supervised Semantic Segmentation, ICCV, 2021.
[Xiong et al. 21] Multiview Pseudo-Labeling for Semi-supervised Learning from Video, ICCV, 2021.

Image restoration and enhancement

Image Deblurring

[Xu et al. 14] Deep Convolutional Neural Network for Image Deconvolution, NIPS, 2014.
[Zhang et al. 22] Deep Image Deblurring: A Survey, Arxiv, 2022.
[Dong et al. 20] Deep Wiener Deconvolution: Wiener Meets Deep Learning for Image Deblurring, NIPS, 2020.
[Nimisha et al. 17] Blur-Invariant Deep Learning for Blind-Deblurring, ICCV, 2017.
[Nah et al. 17] Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring, CVPR, 2017.
[Kupyn et al. 19] DeblurGAN-v2: Deblurring (Orders-of-Magnitude) Faster and Better, ICCV, 2019.
[Zhang et al. 20] Deblurring by Realistic Blurring, CVPR, 2020.
[Zhou et al. 19] Spatio-Temporal Filter Adaptive Network for Video Deblurring, ICCV, 2019.
[Nah et al. 19] Recurrent Neural Networks with Intra-Frame Iterations for Video Deblurring, CVPR, 2019.
[Purohit et al. 20] Region-Adaptive Dense Network for Efficient Motion Deblurring, AAAI, 2020. (SoTA for single-image deblurring on the GoPro dataset)
[Shen et al. 19] Human-Aware Motion Deblurring, ICCV, 2019.

[Rim et al. 20] Real-World Blur Dataset for Learning and Benchmarking Deblurring Algorithms, ECCV, 2020.
[Lin et al. 20] Learning Event-Driven Video Deblurring and Interpolation, ECCV, 2020.
[Zhong et al. 20] Efficient Spatio-Temporal Recurrent Neural Network for Video Deblurring, ECCV, 2020.
[Abuolaim et al. 20] Defocus Deblurring Using Dual-Pixel Data, ECCV, 2020.
[Cun et al. 20] Defocus Blur Detection via Depth Distillation, ECCV, 2020.
[Chen et al. 21] Learning a Non-blind Deblurring Network for Night Blurry Images, CVPR, 2021.
[Rozumnyi et al. 21] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects, CVPR, 2021.
[Xu et al. 21] Motion Deblurring with Real Events, ICCV, 2021.
[Cho et al. 21] Rethinking Coarse-to-Fine Approach in Single Image Deblurring, ICCV, 2021.
[Shang et al. 21] Bringing Events into Video Deblurring with Non-consecutively Blurry Frames, ICCV, 2021.
[Deng et al. 21] Multi-Scale Separable Network for Ultra-High-Definition Video Deblurring, ICCV, 2021.
[Hu et al. 21] Pyramid Architecture Search for Real-Time Image Deblurring, ICCV, 2021.

Image deraining

[Li et al. 19] Single Image Deraining: A Comprehensive Benchmark Analysis, CVPR, 2019.
[Li et al. 21] A Comprehensive Benchmark Analysis of Single Image Deraining: Current Challenges and Future Perspectives, IJCV, 2021.
[Yang et al. 17] Deep Joint Rain Detection and Removal from a Single Image, CVPR, 2017.
[Zhang et al. 18] Density-aware Single Image De-raining using a Multi-stream Dense Network, CVPR, 2018.
[Hu et al. 19] Depth-attentional features for single-image rain removal, CVPR, 2019.
[Qian et al. 18] Attentive Generative Adversarial Network for Raindrop Removal from A Single Image, CVPR, 2018.
[Zhang et al. 19] Image de-raining using a conditional generative adversarial network, IEEE transactions on circuits and systems for video technology, 2019.
[Wei et al. 19] Semi-supervised Transfer Learning for Image Rain Removal, CVPR, 2019.

[Yasarla et al. 20] Syn2Real Transfer Learning for Image Deraining using Gaussian Processes, CVPR, 2020.
[Liu et al. 21] Unpaired Learning for Deep Image Deraining with Rain Direction Regularizer, ICCV, 2021.
[Zhou et al. 21] Image De-raining via Continual Learning, CVPR, 2021.
[Wang et al. 21] Multi-Decoding Deraining Network and Quasi-Sparsity Based Training, CVPR, 2021.
[Chen et al. 21] Robust Representation Learning with Feedback for Single Image Deraining, CVPR, 2021.
[Yue et al. 21] Semi-Supervised Video Deraining with Dynamical Rain Generator, CVPR, 2021.
[Yi et al. 21] Structure-Preserving Deraining with Residue Channel Prior Guidance, ICCV, 2021.
[Huang et al. 21] Memory Oriented Transfer Learning for Semi-Supervised Image Deraining, CVPR, 2021.
[Chen et al. 21] Pre-Trained Image Processing Transformer, CVPR, 2021.
[Jiang et al. 20] Multi-Scale Progressive Fusion Network for Single Image Deraining, CVPR, 2020.
[Fu et al. 20] Lightweight Pyramid Networks for Image Deraining, IEEE Transactions on Neural Networks and Learning Systems, 2020.

Image dehazing

[Gui et al. 21] A Comprehensive Survey on Image Dehazing Based on Deep Learning, IJCAI, 2021.
[Cai et al. 16] DehazeNet: An End-to-End System for Single Image Haze Removal, IEEE TIP, 2016.
[Ren et al. 20] Single Image Dehazing via Multi-scale Convolutional Neural Networks with Holistic Edges, IJCV, 2020. (Extension of the 2016 conference version)
[Li et al. 17] AOD-Net: All-in-One Dehazing Network, ICCV, 2017.
[Qin et al. 20] FFA-Net: Feature Fusion Attention Network for Single Image Dehazing, AAAI, 2020.
[Zhang et al. 18] Densely Connected Pyramid Dehazing Network, CVPR, 2018.
[Ren et al. 18] Gated Fusion Network for Single Image Dehazing, CVPR, 2018.
[Qu et al. 19] Enhanced Pix2pix Dehazing Network, CVPR, 2019.
[Hong et al. 20] Distilling Image Dehazing With Heterogeneous Task Imitation, CVPR, 2020.
[Shao et al. 20] Domain Adaptation for Image Dehazing, CVPR, 2020.
[Engin et al. 18] Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing, ECCVW, 2018.
[Li et al. 20] Zero-Shot Image Dehazing, IEEE TIP, 2020.

[Wu et al. 21] Contrastive Learning for Compact Single Image Dehazing, CVPR, 2021.
[Shyam et al. 21] Towards Domain Invariant Single Image Dehazing, AAAI, 2021.
[Zheng et al. 21] Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning, CVPR, 2021.
[Chen et al. 21] PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors, CVPR, 2021.
[Zhao et al. 21] BidNet: Binocular Image Dehazing without Explicit Disparity Estimation, CVPR, 2021.
[Kar et al. 21] Transmission Map and Atmospheric Light Guided Iterative Updater Network for Single Image Dehazing, CVPR, 2021.
[Li et al. 20] Semi-Supervised Image Dehazing, IEEE TIP, 2020.
[Yi et al. 21] Two-Step Image Dehazing with Intra-domain and Inter-domain Adaptation, Arxiv, 2021.

Image/Video Super-Resolution

[Dong et al. 16] Image Super-Resolution Using Deep Convolutional Networks, IEEE TPAMI, 2016. (First deep learning-based method)
[Lim et al. 17] Enhanced Deep Residual Networks for Single Image Super-Resolution, CVPRW, 2017.
[Wang et al. 19] Deep Learning for Image Super-resolution: A Survey, IEEE TPAMI, 2021.
[Kim et al. 16] Accurate Image Super-Resolution Using Very Deep Convolutional Networks, CVPR, 2016.
[Tai et al. 17] MemNet: A Persistent Memory Network for Image Restoration, CVPR, 2017.
[Li et al. 18] Multi-scale Residual Network for Image Super-Resolution, ECCV, 2018.
[Zhang et al. 18] Image Super-Resolution Using Very Deep Residual Channel Attention Networks, ECCV, 2018.
[Zhang et al. 19] Residual Non-local Attention Networks for Image Restoration, ICLR, 2019.
[Dai et al. 19] Second-order Attention Network for Single Image Super-Resolution, CVPR, 2019.
[Han et al. 18] Image Super-Resolution via Dual-State Recurrent Networks, CVPR, 2018.
[Ren et al. 17] Image Super Resolution Based on Fusing Multiple Convolution Neural Networks, CVPRW, 2017.
[Ahn et al. 18] Fast, accurate, and lightweight super-resolution with cascading residual network, ECCV, 2018.
[Zhang et al. 19] DCSR: Dilated Convolutions for Single Image Super-Resolution, IEEE TIP, 2019.
[Zhang et al. 18] Residual Dense Network for Image Super-Resolution, CVPR, 2018.
[Hu et al. 19] Meta-SR: A Magnification-Arbitrary Network for Super-Resolution, CVPR, 2019.
[Chen et al. 21] Learning Continuous Image Representation with Local Implicit Image Function, CVPR, 2021.
[Lee et al. 20] Learning with Privileged Information for Efficient Image Super-Resolution, ECCV, 2020.
[Hu et al. 21] Towards Compact Single Image Super-Resolution via Contrastive Self-distillation, IJCAI, 2021.
[Cai et al. 19] Toward Real-World Single Image Super-Resolution: A New Benchmark and A New Model, ICCV, 2019.
[Wei et al. 20] Component Divide-and-Conquer for Real-World Image Super-Resolution, ECCV, 2020.
[Wang et al. 21] Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective, ICCV, 2021.
[Maeda et al. 20] Unpaired Image Super-Resolution using Pseudo-Supervision, CVPR, 2020.
[Shocher et al. 18] “Zero-Shot” Super-Resolution using Deep Internal Learning, CVPR, 2018.

[Zhang et al. 21] Unsupervised Real-world Image Super Resolution via Domain-distance Aware Training, CVPR, 2021.
[Bell-Kligler et al. 19] Blind Super-Resolution Kernel Estimation using an Internal-GAN, NIPS, 2019.
[Cheng et al. 20] Zero-Shot Image Super-Resolution with Depth Guided Internal Degradation Learning, ECCV, 2020.
[Sun et al. 21] Learning Scene Structure Guidance via Cross-Task Knowledge Transfer for Single Depth Super-Resolution, CVPR, 2021.
[Wang et al. 21] Unsupervised Degradation Representation Learning for Blind Super-Resolution, CVPR, 2021.
[Son et al. 21] SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation, CVPR, 2021.
[Jo et al. 21] Tackling the Ill-Posedness of Super-Resolution through Adaptive Target Generation, CVPR, 2021.
[Mei et al. 21] Image Super-Resolution with Non-Local Sparse Attention, CVPR, 2021.
[Wang et al. 21] Learning a Single Network for Scale-Arbitrary Super-Resolution, ICCV, 2021.
[Wang et al. 21] Dual-Camera Super-Resolution with Aligned Attention Modules, CVPR, 2021.
[Chan et al. 21] BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond, CVPR, 2021.
[Yi et al. 21] Omniscient Video Super-Resolution, ICCV, 2021.
[Tian et al. 20] TDAN: Temporally Deformable Alignment Network for Video Super-Resolution, CVPR, 2020.
[Wang et al. 19] EDVR: Video Restoration With Enhanced Deformable Convolutional Networks, CVPRW, 2019.

Deep HDR imaging

[Wang et al. 21] Deep Learning for HDR Imaging: State-of-the-Art and Future Trends, IEEE TPAMI, 2021.
[Kalantari et al. 17] Deep High Dynamic Range Imaging of Dynamic Scenes, Siggraph, 2017.
[Prabhakar et al. 19] A Fast, Scalable, and Reliable Deghosting Method for Extreme Exposure Fusion, ICCP, 2019.
[Wu et al. 18] Deep High Dynamic Range Imaging with Large Foreground Motions, ECCV, 2018.
[Yan et al. 20] Towards accurate HDR imaging with learning generator constraints, Neurocomputing, 2020.
[Yan et al. 19] Attention-guided Network for Ghost-free High Dynamic Range Imaging, CVPR, 2019.
[Rosh et al. 19] Deep Multi-Stage Learning for HDR With Large Object Motions, ICCP, 2019.
[Xu et al. 20] MEF-GAN: Multi-Exposure Image Fusion via Generative Adversarial Networks, TIP, 2020.
[Eilertsen et al. 17] HDR image reconstruction from a single exposure using deep CNNs, Siggraph, 2017.
[Santos et al. 20] Single Image HDR Reconstruction Using a CNN with Masked Features and Perceptual Loss, Siggraph, 2020.
[Endo et al. 17] Deep Reverse Tone Mapping, Siggraph, 2017.
[Liu et al. 20] Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline, CVPR, 2020.

[Metzler et al. 20] Deep Optics for Single-shot High-dynamic-range Imaging, CVPR, 2020.
[Kim et al. 18] A Multi-purpose Convolutional Neural Network for Simultaneous Super-Resolution and High Dynamic Range Image Reconstruction, ACCV, 2018.
[Kim et al. 19] Deep SR-ITM: Joint Learning of Super-Resolution and Inverse Tone-Mapping for 4K UHD HDR Applications, ICCV, 2019.
[Kim et al. 20] JSI-GAN: GAN-Based Joint Super-Resolution and Inverse Tone-Mapping with Pixel-Wise Task-Specific Filters for UHD HDR Video, AAAI, 2020.
[Kim et al. 20] End-to-End Differentiable Learning to HDR Image Synthesis for Multi-exposure Images, AAAI, 2020.
[Chen et al. 21] HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset, ICCV, 2021.
[Jiang et al. 21] HDR Video Reconstruction with Tri-Exposure Quad-Bayer Sensors, Arxiv, 2021.

Object detection

[Wu et al. 20] Recent advances in deep learning for object detection, Neurocomputing, 2020.
[Girshick 15] Fast R-CNN, ICCV, 2015.
[Ghodrati et al. 15] DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers, ICCV, 2015.
[Ren et al. 15] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS, 2015.
[Kong et al. 16] HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection, CVPR, 2016.
[He et al. 14] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, ECCV, 2014.
[Cai et al. 18] Cascade R-CNN: Delving into High Quality Object Detection, CVPR, 2018.
[Redmon et al. 16] You Only Look Once: Unified, Real-Time Object Detection, CVPR, 2016.
[Liu et al. 16] SSD: Single Shot MultiBox Detector, ECCV, 2016.
[Lin et al. 17] Focal Loss for Dense Object Detection (RetinaNet), ICCV, 2017.
[Redmon et al. 16] YOLO9000: Better, Faster, Stronger, Arxiv, 2017.
[Law et al. 19] CornerNet: Detecting Objects as Paired Keypoints, IJCV, 2019.
[He et al. 15] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, IEEE TPAMI, 2015.
[Dai et al. 16] R-FCN: Object Detection via Region-based Fully Convolutional Networks, NIPS, 2016.
[Lin et al. 17] Feature Pyramid Networks for Object Detection, CVPR, 2017.
[He et al. 17] Mask R-CNN, ICCV, 2017.
[Chen et al. 19] Towards Accurate One-Stage Object Detection with AP-Loss, CVPR, 2019.

Generic detection

[Redmon et al. 18] YOLOv3: An Incremental Improvement, Arxiv, 2018.
[Chen et al. 17] Learning Efficient Object Detection Models with Knowledge Distillation, NIPS, 2017.
[Kang et al. 21] Instance-Conditional Knowledge Distillation for Object Detection, NIPS, 2021.
[Fang et al. 21] You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection, NIPS, 2021.
[Ge et al. 21] YOLOX: Exceeding YOLO Series in 2021, Arxiv, 2021.
[Pramanik et al. 22] Granulated RCNN and Multi-Class Deep SORT for Multi-Object Detection and Tracking, IEEE TETCI, 2022.
[Wang et al. 21] You Only Learn One Representation: Unified Network for Multiple Tasks, Arxiv, 2021.
[Wang et al. 19] Towards Universal Object Detection by Domain Attention, CVPR, 2019.
[Huang et al. 19] Mask Scoring R-CNN, CVPR, 2019.
[Guo et al. 21] Distilling Object Detectors via Decoupled Features, CVPR, 2021.
[Chen et al. 18] Domain Adaptive Faster R-CNN for Object Detection in the Wild, CVPR, 2018.
[Wang et al. 21] Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection, CVPR, 2021.
[Zhou et al. 21] Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework, CVPR, 2021.
[Yang et al. 21] Interactive Self-Training with Mean Teachers for Semi-supervised Object Detection, CVPR, 2021.

Face detection

[Luo et al. 16] Understanding the Effective Receptive Field in Deep Convolutional Neural Networks, NIPS, 2016.
[Tang et al. 18] PyramidBox: A Context-assisted Single Shot Face Detector, ECCV, 2018.
[Liu et al. 19] High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection, CVPR, 2019.
[Li et al. 19] DSFD: Dual Shot Face Detector, CVPR, 2019.
[Wang et al. 20] Hierarchical Pyramid Diverse Attention Networks for Face Recognition, CVPR, 2020.
[Huang et al. 21] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework, CVPR, 2021.
[Tong et al. 21] FACESEC: A Fine-grained Robustness Evaluation Framework for Face Recognition Systems, CVPR, 2021.
[Qiu et al. 21] SynFace: Face Recognition with Synthetic Data, ICCV, 2021.
[Song et al. 21] Occlusion Robust Face Recognition Based on Mask Learning With Pairwise Differential Siamese Network, ICCV, 2021.
[Fabbri et al. 21] MOTSynth: How Can Synthetic Data Help Pedestrian Detection and Tracking?, ICCV, 2021.

Pedestrian detection

[Wang et al. 18] Repulsion Loss: Detecting Pedestrians in a Crowd, CVPR, 2018.
[Zhang et al. 18] Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd, ECCV, 2018.
[Liu et al. 19] Adaptive NMS: Refining Pedestrian Detection in a Crowd, CVPR, 2019.
[Zhou et al. 20] Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems, ECCV, 2020.
[Wu et al. 20] Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians, CVPR, 2020.
[Wu et al. 20] Where, What, Whether: Multi-modal Learning Meets Pedestrian Detection, CVPR, 2020.
[Huang et al. 20] NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing, CVPR, 2020.
[Wang et al. 20] Learning Human-Object Interaction Detection using Interaction Points, CVPR, 2020.
[Sundararaman et al. 21] Tracking Pedestrian Heads in Dense Crowd, CVPR, 2021.
[Yan et al. 20] Anchor-Free Person Search, CVPR, 2020.
[Gu et al. 18] Learning Region Features for Object Detection, ECCV, 2018.

Image Segmentation

[Long et al. 15] Fully Convolutional Networks for Semantic Segmentation, CVPR, 2015.
[Noh et al. 15] Learning Deconvolution Network for Semantic Segmentation, ICCV, 2015.
[Badrinarayanan et al. 17] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, IEEE TPAMI, 2017.
[Sun et al. 19] High-Resolution Representations for Labeling Pixels and Regions, CVPR, 2019.
[Zhao et al. 17] Pyramid Scene Parsing Network, CVPR, 2017.
[Chen et al. 18] Rethinking Atrous Convolution for Semantic Image Segmentation (Deeplabv3), CVPR, 2018.
[Visin et al. 16] ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation, CVPR, 2016.
[Visin et al. 15] ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks, NIPS, 2015.
[Chen et al. 16] Attention to Scale: Scale-aware Semantic Image Segmentation, CVPR, 2016.
[Ghiasi et al. 16] Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation, ECCV, 2016.
[Li et al. 18] Pyramid Attention Network for Semantic Segmentation, BMVC, 2018.
[Fu et al. 19] Dual Attention Network for Scene Segmentation, CVPR, 2019.
[Wang et al. 20] Deep High-Resolution Representation Learning for Visual Recognition, CVPR, 2020.
[He et al. 17] Mask R-CNN, ICCV, 2017.
[Yuan et al. 18] OCNet: Object Context Network for Scene Parsing, CVPR, 2019.
[Wang et al. 20] Dual Super-Resolution Learning for Semantic Segmentation, CVPR, 2020.
[Liu et al. 19] Structured Knowledge Distillation for Semantic Segmentation, CVPR, 2019.
[Wang et al. 20] Intra-class Feature Variation Distillation for Semantic Segmentation, ECCV, 2020.
[Xu et al. 18] Deep Affinity Net: Instance Segmentation via Affinity, ECCV, 2018.
[Ouali et al. 20] Semi-Supervised Semantic Segmentation with Cross-Consistency Training, CVPR, 2020.
[Zhao et al. 19] Multi-source Domain Adaptation for Semantic Segmentation, NIPS, 2019.
[Chen et al. 19] CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency, CVPR, 2019.
[Choi et al. 19] Self-Ensembling with GAN-based Data Augmentation for Domain Adaptation in Semantic Segmentation, ICCV, 2019.
[Xu et al. 19] Self-Ensembling Attention Networks: Addressing Domain Shift for Semantic Segmentation, AAAI, 2019.
[Csurka et al. 21] Unsupervised Domain Adaptation for Semantic Image Segmentation: a Comprehensive Survey, arXiv, 2021.
[Araslanov et al. 21] Self-supervised Augmentation Consistency for Adapting Semantic Segmentation, CVPR, 2021.
[Chan et al. 20] A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains, IJCV, 2020.
[He et al. 21] Deep Learning based 3D Segmentation: A Survey, arXiv, 2021.
[Minaee et al. 20] Image Segmentation Using Deep Learning: A Survey, arXiv, 2020.

[Huang et al. 19] CCNet: Criss-Cross Attention for Semantic Segmentation, ICCV, 2019.
[Zhu et al. 19] Asymmetric Non-local Neural Networks for Semantic Segmentation, ICCV, 2019.
[Du et al. 19] SSF-DAN: Separated Semantic Feature based Domain Adaptation Network for Semantic Segmentation, ICCV, 2019.
[Ibrahim et al. 20] Semi-Supervised Semantic Image Segmentation with Self-correcting Networks, CVPR, 2020.
[He et al. 21] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation, CVPR, 2021.
[Liu et al. 21] Source-Free Domain Adaptation for Semantic Segmentation, CVPR, 2021.
[Liu et al. 21] Domain Adaptation for Semantic Segmentation via Patch-Wise Contrastive Learning, ICCV, 2021.
[Chen et al. 18] ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes, CVPR, 2018.
[Wang et al. 21] Consistency Regularization with High-dimensional Non-adversarial Source-guided Perturbation for Unsupervised Domain Adaptation in Segmentation, AAAI, 2021.
[Kundu et al. 21] Generalize then Adapt: Source-Free Domain Adaptive Semantic Segmentation, ICCV, 2021.
[Wang et al. 20] Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation, CVPR, 2020.
[Sun et al. 21] ECS-Net: Improving Weakly Supervised Semantic Segmentation by Using Connections Between Class Activation Maps, ICCV, 2021.
[Chang et al. 20] Weakly-Supervised Semantic Segmentation via Sub-category Exploration, CVPR, 2020.

Computer vision with novel camera sensors (I) - Event-based vision

[Zhang et al. 20] Learning to See in the Dark with Events, ECCV, 2020.
[Rebecq et al. 19] High Speed and High Dynamic Range Video with an Event Camera, IEEE TPAMI (CVPR), 2019.
[Wang et al. 20] Event Enhanced High-Quality Image Recovery, ECCV, 2020.
[Wang et al. 19] Event-based High Dynamic Range Image and Very High Frame Rate Video Generation using Conditional Generative Adversarial Networks, CVPR, 2019.
[Wang et al. 20] EventSR: From Asynchronous Events to Image Reconstruction, Restoration, and Super-Resolution via End-to-End Adversarial Learning, CVPR, 2020.
[Kim et al. 22] Event-guided Deblurring of Unknown Exposure Time Videos, arXiv, 2022.
[Mostafavi et al. 21] Event-Intensity Stereo: Estimating Depth by the Best of Both Worlds, ICCV, 2021.
[Wang et al. 21] Dual Transfer Learning for Event-based End-task Prediction via Pluggable Event to Image Translation, ICCV, 2021.
[Han et al. 21] EvIntSR-Net: Event Guided Multiple Latent Frames Reconstruction and Super-resolution, ICCV, 2021.
[Gehrig et al. 21] Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction, ICRA, 2021.
[Alonso et al. 19] EV-SegNet: Semantic Segmentation for Event-based Cameras, CVPR, 2019.
[Xu et al. 21] Motion Deblurring with Real Events, ICCV, 2021.

[Lin et al. 20] Learning Event-Driven Video Deblurring and Interpolation, ECCV, 2020.
[Paredes-Vallés et al. 21] Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric Constancy, CVPR, 2021.
[Jing et al. 21] Turning Frequency to Resolution: Video Super-resolution via Event Cameras, CVPR, 2021.
[Zou et al. 21] Learning to Reconstruct High Speed and High Dynamic Range Videos from Events, CVPR, 2021.
[Chen et al. 21] Indoor Lighting Estimation using an Event Camera, CVPR, 2021.
[Zhang et al. 21] Event-based Synthetic Aperture Imaging with a Hybrid Network, CVPR, 2021.
[Tulyakov et al. 21] Time Lens: Event-based Video Frame Interpolation, CVPR, 2021.
[Shang et al. 21] Bringing Events into Video Deblurring with Non-consecutively Blurry Frames, ICCV, 2021.
[Yu et al. 21] Training Weakly Supervised Video Frame Interpolation with Events, ICCV, 2021.
[Li et al. 21] Event Stream Super-Resolution via Spatiotemporal Constraint Learning, ICCV, 2021.
[Weng et al. 21] Event-based Video Reconstruction Using Transformer, ICCV, 2021.
[Zou et al. 21] EventHPE: Event-based 3D Human Pose and Shape Estimation, ICCV, 2021.
[Zhang et al. 21] Object Tracking by Jointly Exploiting Frame and Event Domain, ICCV, 2021.

Computer vision with novel camera sensors (II)

[Kuang et al. 19] Thermal Infrared Colorization via Conditional Generative Adversarial Network, ICCP, 2019.
[Kniaz et al. 18] ThermalGAN: Multimodal Color-to-Thermal Image Translation for Person Re-Identification in Multispectral Dataset, ECCV, 2018.
[Li et al. 19] Segmenting Objects in Day and Night: Edge-Conditioned CNN for Thermal Image Semantic Segmentation, IEEE TNNLS, 2019.
[Wang et al. 20] Cross-Modality Paired-Images Generation for RGB-Infrared Person Re-Identification, AAAI, 2020.
[Deng et al. 21] FEANet: Feature-Enhanced Attention Network for RGB-Thermal Real-time Semantic Segmentation, ICRA, 2021.
[Sun et al. 20] FuseSeg: Semantic Segmentation of Urban Scenes Based on RGB and Thermal Data Fusion, IEEE TASE, 2020.
[Zhang et al. 21] ABMDRNet: Adaptive-weighted Bi-directional Modality Difference Reduction Network for RGB-T Semantic Segmentation, CVPR, 2021.

360 vision

[Wang et al. 20] BiFuse: Monocular 360° Depth Estimation via Bi-Projection Fusion, CVPR, 2020.
[Deng et al. 21] LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-resolution, CVPR, 2021.
[Lee et al. 19] SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360° Images, CVPR, 2019.
[Cohen et al. 18] Spherical CNNs, ICLR, 2018.
[Chen et al. 18] Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos, CVPR, 2018.
[Jeon et al. 18] Deep Upright Adjustment of 360 Panoramas Using Multiple Roll Estimations, ACCV, 2018.
[Davidson et al. 20] 360° Camera Alignment via Segmentation, ECCV, 2020.
[Su et al. 18] Learning Spherical Convolution for Fast Features from 360° Imagery, NIPS, 2018.
[Tateno et al. 18] Distortion-Aware Convolutional Filters for Dense Prediction in Panoramic Images, ECCV, 2018.

Thermal camera-based vision (reading list)

[Ghose et al. 19] Pedestrian Detection in Thermal Images using Saliency Maps, CVPR, 2019.
[Kieu et al. 20] Task-conditioned Domain Adaptation for Pedestrian Detection in Thermal Imagery, ECCV, 2020.
[Li et al. 20] Full-Time Monocular Road Detection Using Zero-Distribution Prior of Angle of Polarization, ECCV, 2020.
[Choi et al. 20] Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification, CVPR, 2020.
[Wu et al. 21] Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification, CVPR, 2021.
[Chen et al. 21] Neural Feature Search for RGB-Infrared Person Re-Identification, CVPR, 2021.
[Ye et al. 21] Channel Augmented Joint Learning for Visible-Infrared Recognition, ICCV, 2021.
[Fu et al. 21] CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification, ICCV, 2021.
[Wei et al. 21] Syncretic Modality Collaborative Learning for Visible Infrared Person Re-Identification, ICCV, 2021.
[Park et al. 21] Visible-Infrared Person Re-identification using Cross-Modal Correspondences, ICCV, 2021.
[Ye et al. 20] Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification, ECCV, 2020.
[Wu et al. 20] Infrared-Visible Cross-Modal Person Re-Identification with an X Modality, AAAI, 2020.
[Wang et al. 19] RGB-Infrared Cross-Modality Person Re-Identification via Joint Pixel and Feature Alignment, ICCV, 2019.
[Feng et al. 20] Learning Modality-Specific Representations for Visible-Infrared Person Re-Identification, IEEE TIP, 2020.
[Ye et al. 20] Cross-Modality Person Re-Identification via Modality-aware Collaborative Ensemble Learning, IEEE TIP, 2020.
[Wang et al. 19] Learning to Reduce Dual-level Discrepancy for Infrared-Visible Person Re-identification, CVPR, 2019.

360 vision (reading list)

[Jin et al. 20] Geometric Structure Based and Regularized Depth Estimation From 360° Indoor Imagery, CVPR, 2020.
[Deng et al. 21] LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-resolution, CVPR, 2021.
[Sun et al. 21] HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features, CVPR, 2021.
[Yang et al. 21] Capturing Omni-Range Context for Omnidirectional Segmentation, CVPR, 2021.
[Yang et al. 21] Is Context-Aware CNN Ready for the Surroundings? Panoramic Semantic Segmentation in the Wild, IEEE TIP, 2021.
[Li et al. 21] Looking here or there? Gaze Following in 360-Degree Images, ICCV, 2021.
[Djilali et al. 21] Rethinking 360° Image Visual Attention Modelling with Unsupervised Learning, ICCV, 2021.
[Tran et al. 21] SSLayout360: Semi-Supervised Indoor Layout Estimation from 360° Panorama, CVPR, 2021.

Depth and Motion Estimation in Vision

Depth Estimation (Lecture notes)

[Ming et al. 21] Deep learning for monocular depth estimation: A review, Neurocomputing, 2021.
[Eigen et al. 14] Depth Map Prediction from a Single Image using a Multi-Scale Deep Network, NeurIPS, 2014.
[Laina et al. 16] Deeper Depth Prediction with Fully Convolutional Residual Networks, 3DV, 2016.
[Fu et al. 18] Deep Ordinal Regression Network for Monocular Depth Estimation, CVPR, 2018.
[Chang et al. 18] Pyramid Stereo Matching Network, CVPR, 2018.
[Jung et al. 17] Depth prediction from a single image with conditional adversarial networks, ICIP, 2017.

Motion Estimation (Optical Flow) (Lecture notes)

[Dosovitskiy et al. 15] FlowNet: Learning Optical Flow with Convolutional Networks, ICCV, 2015.
[Ilg et al. 17] FlowNet 2.0: Evolution of Optical Flow Estimation With Deep Networks, CVPR, 2017.
[Ilg et al. 18] Occlusions, Motion and Depth Boundaries with a Generic Network for Disparity, Optical Flow or Scene Flow Estimation, ECCV, 2018.
[Ranjan et al. 17] Optical Flow Estimation using a Spatial Pyramid Network, CVPR, 2017.

Depth and Motion Estimation (Reading list)

[Xu et al. 18] Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation, CVPR, 2018.
[Godard et al. 17] Unsupervised Monocular Depth Estimation with Left-Right Consistency, CVPR, 2017.
[Kuznietsov et al. 17] Semi-Supervised Deep Learning for Monocular Depth Map Prediction, CVPR, 2017.
[Pilzer et al. 19] Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation, CVPR, 2019.
[Cun et al. 20] Defocus Blur Detection via Depth Distillation, ECCV, 2020.
[Ranftl et al. 21] Vision Transformers for Dense Prediction, CVPR, 2021.
[Meng et al. 19] SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception, CVPR, 2019.
[Liu et al. 21] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation, ICCV, 2021.
[Huynh et al. 20] Guiding Monocular Depth Estimation Using Depth-Attention Volume, ECCV, 2020.
[Watson et al. 20] Learning Stereo from Single Images, ECCV, 2020.
[Yuan et al. 20] Efficient Dynamic Scene Deblurring Using Spatially Variant Deconvolution Network with Optical Flow Guided Training, CVPR, 2020.
[Yan et al. 20] Optical Flow in Dense Foggy Scenes Using Semi-Supervised Learning, CVPR, 2020.
[Aleotti et al. 21] Learning optical flow from still images, CVPR, 2021.
[Luo et al. 21] UPFlow: Upsampling Pyramid for Unsupervised Optical Flow Learning, CVPR, 2021.

Adversarial Robustness in Computer Vision

[Goodfellow et al. 15] Explaining and harnessing adversarial examples, ICLR, 2015.
[Szegedy et al. 14] Intriguing properties of neural networks, ICLR, 2014.
[Su et al. 17] One pixel attack for fooling deep neural networks, arXiv, 2017.
[Karmon et al. 18] LaVAN: Localized and Visible Adversarial Noise, ICML, 2018.
[Xie et al. 17] Adversarial Examples for Semantic Segmentation and Object Detection, ICCV, 2017.
[Moosavi-Dezfooli et al. 17] Universal adversarial perturbations, CVPR, 2017.
[Poursaeed et al. 18] Generative Adversarial Perturbations, CVPR, 2018.
[Chen et al. 18] ShapeShifter: Robust Physical Adversarial Attack on Faster R-CNN Object Detector, ECML PKDD, 2018.
[Xiao et al. 18] Generating Adversarial Examples with Adversarial Networks, IJCAI, 2018.
[Wang et al. 21] PSAT-GAN: Efficient Adversarial Attacks against Holistic Scene Understanding, IEEE TIP, 2021.
[Carlini et al. 17] Towards Evaluating the Robustness of Neural Networks, arXiv, 2017.
[Xiao et al. 18] Spatially Transformed Adversarial Examples, ICLR, 2018.

Reading list

[Zhou et al. 20] DaST: Data-Free Substitute Training for Adversarial Attacks, CVPR, 2020.
[Naseer et al. 20] A Self-supervised Approach for Adversarial Robustness, CVPR, 2020.
[Zi et al. 21] Revisiting Adversarial Robustness Distillation: Robust Soft Labels Make Student Better, CVPR, 2021.
[Mahmood et al. 21] On the Robustness of Vision Transformers to Adversarial Examples, ICCV, 2021.
[Wang et al. 21] Feature Importance-aware Transferable Adversarial Attacks, ICCV, 2021.
[Mao et al. 20] Multitask Learning Strengthens Adversarial Robustness, ECCV, 2020.
[Arnab et al. 18] On the Robustness of Semantic Segmentation Models to Adversarial Attacks, CVPR, 2018.
[He et al. 19] Biomedical Image Segmentation against Adversarial Attacks, AAAI, 2019.
[Joshi et al. 19] Semantic Adversarial Attacks: Parametric Transformations That Fool Deep Classifiers, CVPR, 2019.
[Shamsabadi et al. 20] ColorFool: Semantic Adversarial Colorization, CVPR, 2020.
