PeterouZh / Deep_Generative_Models

A collection of papers I am interested in.

GAN-Inversion

Awesome

Renderer

Pybind

Video

Project

Face

3D

Tools

GUI

StyleGAN

Style transfer

Art

Anime

TOC

arXiv

Title Venue Code Year
Perceptual Gradient Networks arXiv:2105.01957 [cs] 2021
InfinityGAN: Towards Infinite-Resolution Image Synthesis arXiv:2104.03963 [cs] 2021
Aliasing Is Your Ally: End-to-End Super-Resolution from Raw Image Bursts arXiv:2104.06191 [cs, eess] 2021
StylePeople: A Generative Model of Fullbody Human Avatars arXiv:2104.08363 [cs] 2021
Cross-Domain and Disentangled Face Manipulation with 3D Guidance arXiv:2104.11228 [cs] 2021
On Buggy Resizing Libraries and Surprising Subtleties in FID Calculation arXiv:2104.11222 [cs] 2021
FDA: Fourier Domain Adaptation for Semantic Segmentation arXiv:2004.05498 [cs] github 2020
StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-Time Image Editing CVPR 2021
Learning a Deep Reinforcement Learning Policy Over the Latent Space of a Pre-Trained GAN for Semantic Age Manipulation arXiv:2011.00954 [cs] 2021
GANalyze: Toward Visual Definitions of Cognitive Image Properties arXiv:1906.10112 [cs] 2019
On the “Steerability” of Generative Adversarial Networks arXiv:1907.07171 [cs] 2020
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation arXiv:2104.11116 [cs, eess] 2021
Unsupervised Image-to-Image Translation via Pre-Trained StyleGAN2 Network arXiv:2010.05713 [cs] github 2020
DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort arXiv:2104.06490 [cs] 2021
Anycost GANs for Interactive Image Synthesis and Editing CVPR 2021
Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization arXiv:2104.05833 [cs] 2021
Positional Encoding as Spatial Inductive Bias in GANs arXiv:2012.05217 [cs] 2020
An Empirical Study of the Effects of Sample-Mixing Methods for Efficient Training of Generative Adversarial Networks arXiv:2104.03535 [cs.CV] 2021
Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks CVPRW github 2020
Image Demoireing with Learnable Bandpass Filters arXiv:2004.00406 [cs] 2020
Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization arXiv:2103.04523 [cs] 2021
LatentCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions arXiv:2104.00820 [cs] 2021
Generating Images with Sparse Representations arXiv:2103.03841 [cs, stat] 2021
PiCIE: Unsupervised Semantic Segmentation Using Invariance and Equivariance in Clustering CVPR 2021
Dual Contrastive Loss and Attention for GANs arXiv:2103.16748 [cs.CV] 2021
Unsupervised Disentanglement of Linear-Encoded Facial Semantics CVPR 2021
Emergence of Object Segmentation in Perturbed Generative Models arXiv:1905.12663 [cs] github 2019
Unsupervised Discovery of Disentangled Manifolds in GANs arXiv:2011.11842 [cs] github 2020
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery arXiv:2103.17249 [cs] github 2021
Few-Shot Semantic Image Synthesis Using StyleGAN Prior arXiv:2103.14877 [cs] 2021

Disentanglement

Title Venue Code Year
GANSpace: Discovering Interpretable GAN Controls arXiv:2004.02546 [cs] GANSpace 2020
Interpreting the Latent Space of GANs for Semantic Face Editing CVPR InterFaceGAN 2020
Closed-Form Factorization of Latent Semantics in GANs arXiv:2007.06600 [cs] sefa 2020
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation arXiv:2011.12799 [cs] StyleSpace 2020
Unsupervised Image Transformation Learning via Generative Adversarial Networks arXiv:2103.07751 [cs] github 2021
Resolution Dependent GAN Interpolation for Controllable Image Synthesis Between Domains arXiv:2010.05334 [cs] toonify 2020
WarpedGANSpace: Finding Non-Linear RBF Paths in GAN Latent Space arXiv:2109.13357 [cs] 2021
Discovering Interpretable Latent Space Directions of GANs beyond Binary Attributes CVPR 2021

Semantic hierarchy

Title Venue Code Year
Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis arXiv:1911.09267 [cs] 2020

Inversion

Optimization

Title Venue Code Year
Image2StyleGAN++: How to Edit the Embedded Images? arXiv:1911.11544 [cs] 2020
Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? ICCV 2019
Inverting The Generator Of A Generative Adversarial Network arXiv:1611.05644 [cs] 2016
Feature-Based Metrics for Exploring the Latent Space of Generative Models ICLRW 2018
Understanding Deep Image Representations by Inverting Them CVPR 2015
Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion arXiv:1912.08795 [cs, stat] DeepInversion 2020
IMAGINE: Image Synthesis by Image-Guided Model Inversion arXiv:2104.05895 [cs] 2021
Image Processing Using Multi-Code GAN Prior CVPR mGANprior 2020
Generative Visual Manipulation on the Natural Image Manifold ECCV 2018
GAN Dissection: Visualizing and Understanding Generative Adversarial Networks arXiv:1811.10597 [cs] 2018
GAN-Based Projector for Faster Recovery with Convergence Guarantees in Linear Inverse Problems arXiv:1902.09698 [cs, eess, stat] 2019
Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models CVPR 2020
Rewriting a Deep Generative Model arXiv:2007.15646 [cs] 2020
Transforming and Projecting Images into Class-Conditional Generative Networks arXiv:2005.01703 [cs] 2020
StyleGAN2 Distillation for Feed-Forward Image Manipulation arXiv:2003.03581 [cs.CV] 2020
On the “Steerability” of Generative Adversarial Networks arXiv:1907.07171 [cs] 2020
Unsupervised Discovery of Disentangled Manifolds in GANs arXiv:2011.11842 [cs] 2020
PIE: Portrait Image Embedding for Semantic Control arXiv:2009.09485 [cs] 2020
GANSpace: Discovering Interpretable GAN Controls NeurIPS 2020
When and How Can Deep Generative Models Be Inverted? arXiv:2006.15555 [cs, stat] 2020
Style Intervention: How to Achieve Spatial Disentanglement with Style-Based Generators? arXiv:2011.09699 [cs] 2020
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation arXiv:2011.12799 [cs] 2020
Navigating the GAN Parameter Space for Semantic Image Editing arXiv:2011.13786 [cs] 2021
Mask-Guided Discovery of Semantic Manifolds in Generative Models arXiv:2105.07273 [cs] masked-gan-manifold 2021
StyleFlow: Attribute-Conditioned Exploration of StyleGAN-Generated Images Using Conditional Continuous Normalizing Flows arXiv:2008.02401 [cs] StyleFlow 2020
Disentangled Face Attribute Editing via Instance-Aware Latent Space Search arXiv:2105.12660 [cs] 2021
Barbershop: GAN-Based Image Compositing Using Segmentation Masks arXiv:2106.01505 [cs] 2021
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space arXiv:2002.03754 [cs, stat] GANLatentDiscovery 2020
Pivotal Tuning for Latent-Based Editing of Real Images arXiv:2106.05744 [cs] PTI 2021
Editing in Style: Uncovering the Local Semantics of GANs CVPR 2020
Retrieve in Style: Unsupervised Facial Feature Transfer and Retrieval arXiv:2107.06256 [cs] RetrieveInStyle 2021
StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation arXiv:2107.04331 [cs] 2021
A Simple Baseline for StyleGAN Inversion arXiv:2104.07661 [cs] 2021
From Continuity to Editability: Inverting GANs with Consecutive Images arXiv:2107.13812 [cs] 2021
AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning ACM Transactions on Graphics (Proc. SIGGRAPH) 2021
Talk-to-Edit: Fine-Grained Facial Editing via Dialog ICCV Talk-to-Edit 2021
Improved StyleGAN Embedding: Where Are the Good Latents? arXiv:2012.09036 [cs] II2S 2021
EditGAN: High-Precision Semantic Image Editing editGAN_release 2021
Grasping the Arrow of Time from the Singularity: Decoding Micromotion in Low-Dimensional Latent Spaces from StyleGAN arXiv:2204.12696 [cs] 2022
Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing CVPR sam_inversion arXiv. 2022
Real Image Inversion via Segments arXiv:2110.06269 Chunkmogrify 2021

Encoder

Title Venue Code Year
GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution arXiv:2012.00739 [cs] GLEAN 2020
Swapping Autoencoder for Deep Image Manipulation arXiv:2007.00653 [cs] github 2020
In-Domain GAN Inversion for Real Image Editing ECCV 2020
ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement arXiv:2104.02699 [cs] ReStyle 2021
Interpreting the Latent Space of GANs for Semantic Face Editing CVPR 2020
Face Identity Disentanglement via Latent Space Mapping arXiv:2005.07728 [cs] 2020
Collaborative Learning for Faster StyleGAN Embedding arXiv:2007.01758 [cs] 2020
Unsupervised Discovery of Disentangled Manifolds in GANs arXiv:2011.11842 [cs] 2020
Generative Hierarchical Features from Synthesizing Images arXiv:2007.10379 [cs] 2020
One Shot Face Swapping on Megapixels arXiv:2105.04932 [cs] 2021
GAN Prior Embedded Network for Blind Face Restoration in the Wild 2021
Adversarial Latent Autoencoders CVPR ALAE 2020
Encoding in Style: A StyleGAN Encoder for Image-to-Image Translation arXiv:2008.00951 [cs] psp 2021
Designing an Encoder for StyleGAN Image Manipulation arXiv:2102.02766 [cs] encoder4editing 2021
A Latent Transformer for Disentangled and Identity-Preserving Face Editing arXiv:2106.11895 [cs] 2021
ShapeEditer: A StyleGAN Encoder for Face Swapping arXiv:2106.13984 [cs] 2021
Force-in-Domain GAN Inversion arXiv:2107.06050 [cs, eess] 2021
StyleFusion: A Generative Model for Disentangling Spatial Segments arXiv:2107.07437 [cs] 2021
Perceptually Validated Precise Local Editing for Facial Action Units with StyleGAN arXiv:2107.12143 [cs] 2021
StyleGAN2 Distillation for Feed-Forward Image Manipulation arXiv:2003.03581 [cs.CV] 2020
GAN Inversion for Out-of-Range Images with Geometric Transformations ICCV 2021
❤️ DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editing arXiv:2109.10737 [cs] DyStyle 2021
High-Fidelity GAN Inversion for Image Attribute Editing arXiv:2109.06590 [cs] 2021
❤️ Few-Shot Knowledge Transfer for Fine-Grained Cartoon Face Generation arXiv:2007.13332 [cs] 2020
❤️ HyperInverter: Improving StyleGAN Inversion via Hypernetwork CVPR HyperInverter arXiv. 2022
High-Fidelity GAN Inversion with Padding Space ECCV padinv 2022

Hybrid optimization

Title Venue Code Year
Generative Visual Manipulation on the Natural Image Manifold ECCV 2018
Semantic Photo Manipulation with a Generative Image Prior ACM Transactions on Graphics 2019
Seeing What a GAN Cannot Generate arXiv:1910.11626 [cs, eess] 2019
In-Domain GAN Inversion for Real Image Editing ECCV 2020

Without optimization

Title Venue Code Year
Closed-Form Factorization of Latent Semantics in GANs arXiv:2007.06600 [cs] 2020
GAN “Steerability” without Optimization arXiv:2012.05328 [cs] 2021
Low-Rank Subspaces in GANs arXiv:2106.04488 [cs] 2021
LARGE: Latent-Based Regression through GAN Semantics arXiv:2107.11186 [cs] 2021
Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation ICCV 2021
Controllable and Compositional Generation with Latent-Space Energy-Based Models NeurIPS LACE 2021
Do Generative Models Know Disentanglement? Contrastive Learning Is All You Need arXiv:2102.10543 [cs] DisCo 2021

DGP

Title Venue Code Year
✔️ Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation ECCV DGP 2020
✔️ PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models CVPR PULSE 2020
✔️ GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution arXiv:2012.00739 [cs] 2020
Unsupervised Portrait Shadow Removal via Generative Priors arXiv:2108.03466 [cs] 2021
Towards Real-World Blind Face Restoration with Generative Facial Prior CVPR GFPGAN 2021
Towards Vivid and Diverse Image Colorization with Generative Color Prior ICCV 2021
Self-Validation: Early Stopping for Single-Instance Deep Generative Priors arXiv:2110.12271 [cs.CV] 2021
One-Shot Generative Domain Adaptation arXiv:2111.09876 [cs] 2021
❤️ Time-Travel Rephotography ACM Transactions on Graphics code 2021

Cls

Title Venue Code Year
Contrastive Model Inversion for Data-Free Knowledge Distillation arXiv:2105.08584 [cs] 2021
Generative Models as a Data Source for Multiview Representation Learning arXiv:2106.05258 [cs] 2021
Inverting and Understanding Object Detectors arXiv:2106.13933 [cs] 2021
Deep Neural Networks Are Surprisingly Reversible: A Baseline for Zero-Shot Inversion arXiv:2107.06304 [cs] 2021
Ensembling with Deep Generative Views arXiv:2104.14551 [cs] 2021

Change pose implicitly

Title Venue Code Year
On the “Steerability” of Generative Adversarial Networks arXiv:1907.07171 [cs] 2020
Interpreting the Latent Space of GANs for Semantic Face Editing CVPR 2020
GANSpace: Discovering Interpretable GAN Controls arXiv:2004.02546 [cs] GANSpace 2020
Closed-Form Factorization of Latent Semantics in GANs arXiv:2007.06600 [cs] sefa 2020
StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN arXiv:2111.01619 [cs] 2021
Using Latent Space Regression to Analyze and Leverage Compositionality in GANs ICLR 2021

Survey

Title Venue Code Year
GAN Inversion: A Survey arXiv:2101.05278 [cs] 2021

GANs

NeurIPS 2021

Title Venue Code Year
Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training NeurIPS 2021

Theory

Title Venue Code Year
Towards a Better Global Loss Landscape of GANs NeurIPS 2020
On the Benefit of Width for Neural Networks: Disappearance of Bad Basins arXiv:1812.11039 [cs, math, stat] 2021

Regs

Title Venue Code Year
The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement ECCV 2020

Detection

Title Venue Code Year
Self-Supervised Object Detection via Generative Image Synthesis arXiv:2110.09848 [cs] 2021

StyleGANs

Title Venue Code Year
A Style-Based Generator Architecture for Generative Adversarial Networks CVPR 2019
Analyzing and Improving the Image Quality of StyleGAN arXiv:1912.04958 [cs, eess, stat] 2019
Training Generative Adversarial Networks with Limited Data arXiv:2006.06676 [cs, stat] 2020
Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data NeurIPS 2021
Alias-Free Generative Adversarial Networks arXiv:2106.12423 [cs, stat] alias-free-gan, rep2 2021
Transforming the Latent Space of StyleGAN for Real Face Editing arXiv:2105.14230 [cs] TransStyleGAN 2021
MobileStyleGAN: A Lightweight Convolutional Neural Network for High-Fidelity Image Synthesis arXiv:2104.04767 [cs, eess] MobileStyleGAN 2021
Few-Shot Image Generation via Cross-Domain Correspondence CVPR few-shot-gan-adaptation 2021
EigenGAN: Layer-Wise Eigen-Learning for GANs arXiv:2104.12476 [cs, stat] EigenGAN 2021
❤️ Toward Spatially Unbiased Generative Models ICCV toward_spatial_unbiased 2021
Interpreting Generative Adversarial Networks for Interactive Image Generation arXiv:2108.04896 [cs] 2021
Explaining in Style: Training a GAN to Explain a Classifier in StyleSpace ICCV explaining-in-style 2021
Projected GANs Converge Faster NeurIPS projected_gan 2021
Towards Faster and Stabilized GAN Training for High-Fidelity Few-Shot Image Synthesis ICLR github 2021
❤️ Ensembling Off-the-Shelf Models for GAN Training arXiv:2112.09130 [cs] vision-aided-gan 2021
❤️ StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets arXiv:2202.00273 [cs] 2022
When, Why, and Which Pretrained GANs Are Useful? ICLR 2022
A U-Net Based Discriminator for Generative Adversarial Networks CVPR 2020

Transformer

Title Venue Code Year
Compositional Transformers for Scene Generation NeurIPS 2021
❤️ GAN-Supervised Dense Visual Alignment arXiv:2112.05143 [cs] gangealing 2021
Improved Transformer for High-Resolution GANs arXiv:2106.07631 [cs] 2021
MaskGIT: Masked Generative Image Transformer arXiv:2202.04200 [cs] 2022
StyleSwin: Transformer-Based GAN for High-Resolution Image Generation CVPR 2022

SinGAN

Title Venue Code Year
ExSinGAN: Learning an Explainable Generative Model from a Single Image arXiv:2105.07350 [cs] 2021

Video

Title Venue Code Year
❤️ Diverse Generation from a Single Video Made Possible arXiv:2109.08591 [cs] 2021

GANs

Title Venue Code Year
Differentiable Augmentation for Data-Efficient GAN Training arXiv:2006.10738 [cs] 2020
Sampling Generative Networks arXiv:1609.04468 [cs, stat] 2016
Combining Transformer Generators with Convolutional Discriminators arXiv:2105.10189 [cs] 2021
Improving Generation and Evaluation of Visual Stories via Semantic Consistency arXiv:2105.10026 [cs] 2021
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation CVPR 2021
Data-Efficient Instance Generation from Instance Discrimination arXiv:2106.04566 [cs] 2021
Styleformer: Transformer Based Generative Adversarial Networks with Style Vector arXiv:2106.07023 [cs, eess] 2021
FBC-GAN: Diverse and Flexible Image Synthesis via Foreground-Background Composition arXiv:2107.03166 [cs] 2021
ViTGAN: Training GANs with Vision Transformers arXiv:2107.04589 [cs, eess] 2021
Learning Efficient GANs for Image Translation via Differentiable Masks and Co-Attention Distillation arXiv:2011.08382 [cs] 2021
CGANs with Auxiliary Discriminative Classifier arXiv:2107.10060 [cs] 2021
A Good Image Generator Is What You Need for High-Resolution Video Synthesis ICLR 2021
Dual Projection Generative Adversarial Networks for Conditional Image Generation ICCV 2021
Your GAN Is Secretly an Energy-Based Model and You Should Use Discriminator Driven Latent Sampling arXiv:2003.06060 [cs, stat] CGAN-DDLS 2021
Manifold-Preserved GANs arXiv:2109.08955 [cs] 2021
Latent Reweighting, an Almost Free Improvement for GANs arXiv:2110.09803 [cs] 2021
STransGAN: An Empirical Study on Transformer in GANs arXiv:2110.13107 [cs.CV] 2021
Self-Supervised GANs with Label Augmentation arXiv:2106.08601 [cs] 2021
Regularizing Generative Adversarial Networks under Limited Data CVPR github 2021

cGANs

Title Venue Code Year
Unbiased Auxiliary Classifier GANs with MINE arXiv:2006.07567 [cs] 2020
Twin Auxiliary Classifiers GAN arXiv:1907.02690 [cs, stat] 2019

Finetune

Title Venue Code Year
FreezeG github
Freeze the Discriminator: A Simple Baseline for Fine-Tuning GANs arXiv:2002.10964 [cs, stat] FreezeD 2020
Fine-Tuning StyleGAN2 For Cartoon Face Generation arXiv:2106.12445 [cs, eess] Cartoon-StyleGAN 2021
Transferring GANs: Generating Images from Limited Data ECCV 2018
Image Generation From Small Datasets via Batch Statistics Adaptation ICCV 2019
MineGAN: Effective Knowledge Transfer From GANs to Target Domains With Few Images CVPR 2020

Compression

Title Venue Code Year
GAN Compression: Efficient Architectures for Interactive Conditional GANs CVPR 2020
Online Multi-Granularity Distillation for GAN Compression ICCV 2021
Revisiting Discriminator in GAN Compression: A Generator-Discriminator Cooperative Compression Scheme arXiv:2110.14439 [cs] GCC 2021

Detection fake

Title Venue Code Year
Robust Attentive Deep Neural Network for Exposing GAN-Generated Faces arXiv:2109.02167 [cs] 2021

Segmentation

Title Venue Code Year
Labels4Free: Unsupervised Segmentation Using StyleGAN arXiv:2103.14968 [cs] 2021
BigDatasetGAN: Synthesizing ImageNet with Pixel-Wise Annotations arXiv:2201.04684 [cs] arXiv. 2022

Datasets

Title Venue Code Year
Gradient-Based Learning Applied to Document Recognition Proceedings of the IEEE [mnist] 1998
Learning Multiple Layers of Features from Tiny Images [cifar] 2009
ImageNet: A Large-Scale Hierarchical Image Database CVPR [ImageNet] 2009
Learning Hybrid Image Templates (HIT) by Information Projection TPAMI AnimalFace 2012
A Style-Based Generator Architecture for Generative Adversarial Networks CVPR FFHQ 2019
StarGAN v2: Diverse Image Synthesis for Multiple Domains CVPR AFHQ 2020
Automated Flower Classification over a Large Number of Classes 102Flowers 2008
XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings ICML CartoonSet 2018
Anime Faces Sourced from Safebooru Resized to 256x256 Kaggle AnimeFace
Facial Expressions of Manga (Japanese Comic) Character Faces Kaggle MangaExpressions
Open-Source Cartoon Dataset Kaggle photo2cartoon
Simpsons Faces: A Lot of Images of Your Favourite Characters Kaggle SimpsonsFaces
Bitmoji Faces Kaggle BitmojiFaces
BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation AAHQ 2021
❤️ Fake It Till You Make It: Face Analysis in the Wild Using Synthetic Data Alone ICCV FaceSynthetics 2021
Seeing 3D Chairs: Exemplar Part-Based 2D-3D Alignment Using a Large Dataset of CAD Models CVPR chair 2014
A Large-Scale Car Dataset for Fine-Grained Categorization and Verification CVPR [CompCars] arXiv. 2015
The ArtBench Dataset: Benchmarking Generative Models with Artworks 2022
DwNet: Dense Warp-Based Network for Pose-Guided Human Video Generation BMVC Fashion 2019
MoCoGAN: Decomposing Motion and Content for Video Generation CVPR [Tai-Chi] 2018
Text2Human: Text-Driven Controllable Human Image Generation ACM Transactions on Graphics (TOG) DeepFashion-MultiModal 2022

alias (ref)

Title Venue Code Year
Alias-Free Generative Adversarial Networks arXiv:2106.12423 [cs, stat] 2021
On Buggy Resizing Libraries and Surprising Subtleties in FID Calculation arXiv:2104.11222 [cs] 2021

Texture

Tiles

Title Venue Code Year
TileGAN: Synthesis of Large-Scale Non-Homogeneous Textures ACM Transactions on Graphics 2019
InsetGAN for Full-Body Image Generation arXiv:2203.07293 [cs] 2022
Collaging Class-Specific GANs for Semantic Image Synthesis ICCV 2021

GAN application

Title Venue Code Year
SC-FEGAN: Face Editing Generative Adversarial Network with User’s Sketch and Color arXiv:1902.06838 [cs] 2019
Semantic Text-to-Face GAN -ST^2FG arXiv:2107.10756 [cs] 2021
CRD-CGAN: Category-Consistent and Relativistic Constraints for Diverse Text-to-Image Generation arXiv:2107.13516 [cs] 2021

Image-to-Image Translation

Title Venue Code Year
Image-to-Image Translation with Conditional Adversarial Networks CVPR pix2pix 2017
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs CVPR pix2pix-HD 2018
Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks ICCV CycleGAN 2017
StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation CVPR 2018
StarGAN v2: Diverse Image Synthesis for Multiple Domains CVPR 2020
Multimodal Unsupervised Image-to-Image Translation arXiv:1804.04732 [cs, stat] MUNIT 2018
High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network arXiv:2105.09188 [cs] 2021
MixerGAN: An MLP-Based Architecture for Unpaired Image-to-Image Translation arXiv:2105.14110 [cs] 2021
GANs N’ Roses: Stable, Controllable, Diverse Image to Image Translation (Works for Videos Too!) arXiv:2106.06561 [cs] 2021
❤️ Sketch Your Own GAN ICCV 2021
Contrastive Learning for Unpaired Image-to-Image Translation ECCV contrastive-unpaired-translation 2020
The Animation Transformer: Visual Correspondence via Segment Matching arXiv:2109.02614 [cs] 2021
Image Synthesis via Semantic Composition ICCV 2021
You Only Need Adversarial Supervision for Semantic Image Synthesis arXiv:2012.04781 [cs, eess] 2020

Style transfer

Title Venue Code Year
Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization ICCV 2017
Texture Synthesis Using Convolutional Neural Networks NeurIPS 2015
A Neural Algorithm of Artistic Style arXiv:1508.06576 [cs, q-bio] 2015
Image Style Transfer Using Convolutional Neural Networks CVPR 2016
Perceptual Losses for Real-Time Style Transfer and Super-Resolution ECCV 2016
Texture Networks: Feed-Forward Synthesis of Textures and Stylized Images ICML 2016
Attention-Based Stylisation for Exemplar Image Colourisation arXiv:2105.01705 [cs, eess] 2021
StyleBank: An Explicit Representation for Neural Image Style Transfer Stylebank 2017
Rethinking and Improving the Robustness of Image Style Transfer arXiv:2104.05623 [cs, eess] 2021
Paint Transformer: Feed Forward Neural Painting with Stroke Prediction ICCV 2021
❤️ AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer ICCV 2021
ZiGAN: Fine-Grained Chinese Calligraphy Font Generation via a Few-Shot Style Transfer Approach arXiv:2108.03596 [cs] 2021
Domain-Aware Universal Style Transfer ICCV 2021
Aesthetics and Neural Network Image Representations arXiv:2109.08103 [cs, eess, q-bio] 2021
❤️ Collaborative Distillation for Ultra-Resolution Universal Style Transfer CVPR collaborative-distillation 2020
Adaptive Convolutions for Structure-Aware Style Transfer CVPR ada-conv-pytorch 2021
CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer ECCV CCPL arXiv. 2022

Metric & perceptual loss

Title Venue Code Year
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric arXiv:1801.03924 [cs] lpips-pytorch 2018
Generating Images with Perceptual Similarity Metrics Based on Deep Networks NeurIPS Perceptual Similarity 2016
Generic Perceptual Loss for Modeling Structured Output Dependencies CVPR [random] 2021
Inverting Adversarially Robust Networks for Image Synthesis arXiv:2106.06927 [cs] 2021
Demystifying MMD GANs ICLR Kernel Inception Distance (KID) 2018
GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium NeurIPS Fréchet Inception Distance (FID) 2017
Improved Techniques for Training GANs NeurIPS Inception Score (IS) 2016
High-Fidelity Performance Metrics for Generative Models in PyTorch torch-fidelity 2020
Reliable Fidelity and Diversity Metrics for Generative Models ICML generative-evaluation-prdc 2020
The Contextual Loss for Image Transformation with Non-Aligned Data ECCV contextualLoss arXiv. 2018
Maintaining Natural Image Statistics with the Contextual Loss arXiv:1803.04626 [cs] 2018

Spectrum

Title Venue Code Year
Reproducibility of "FDA: Fourier Domain Adaptation for Semantic Segmentation" arXiv:2104.14749 [cs] 2021
A Closer Look at Fourier Spectrum Discrepancies for CNN-Generated Images Detection CVPR 2021

Weakly Supervised Object Localization

Title Venue Code Year
TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization arXiv:2103.14862 [cs] 2021
Finding an Unsupervised Image Segmenter in Each of Your Deep Generative Models arXiv:2105.08127 [cs] 2021
Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP arXiv:2107.12518 [cs] 2021

Implicit Neural Representations

Title Venue Code Year
DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation arXiv:1901.05103 [cs] 2019
Occupancy Networks: Learning 3D Reconstruction in Function Space arXiv:1812.03828 [cs] 2019
❤️ Neural Image Representations for Multi-Image Fusion and Layer Separation arXiv:2108.01199 [cs] 2021
Learning Continuous Image Representation with Local Implicit Image Function CVPR 2021

Energy

Title Venue Code Year
How to Train Your Energy-Based Models arXiv:2101.03288 arXiv. 2021
Your Classifier Is Secretly an Energy Based Model and You Should Treat It Like One ICLR JEM arXiv. 2020
Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models NeurIPS Generative-Visual-Prompt 2022

Flow

Title Venue Code Year
Variational Inference with Normalizing Flows ICML 2015
Density Estimation Using Real NVP ICLR arXiv. 2017

ChatGPT

Diffusion

Generation

Title Venue Code Year
Understanding Diffusion Models: A Unified Perspective arXiv:2208.11970 2022
1 Deep Unsupervised Learning Using Nonequilibrium Thermodynamics arXiv:1503.03585 [cond-mat, q-bio, stat] arXiv. 2015
2 Generative Modeling by Estimating Gradients of the Data Distribution NeurIPS 2019
3 Denoising Diffusion Probabilistic Models arXiv:2006.11239 [cs, stat] diffusion, denoising-diffusion-pytorch 2020
Denoising Diffusion Implicit Models ICLR [DDIM] arXiv. 2021
Improved Denoising Diffusion Probabilistic Models arXiv:2102.09672 [cs.LG] improved-diffusion 2021
Score-Based Generative Modeling through Stochastic Differential Equations ICLR 2021
35 steps Elucidating the Design Space of Diffusion-Based Generative Models arXiv:2206.00364 [cs, stat] k-diffusion arXiv. 2022
10 steps DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps arXiv:2206.00927 [cs, stat] dpm-solver arXiv. 2022
❤️ SDEdit: Image Synthesis and Editing with Stochastic Differential Equations arXiv:2108.01073 [cs] SDEdit 2021
D2C: Diffusion-Denoising Models for Few-Shot Conditional Generation arXiv:2106.06819 [cs] 2021
Label-Efficient Semantic Segmentation with Diffusion Models ddpm-segmentation 2021
Analog Bits: Generating Discrete Data Using Diffusion Models with Self-Conditioning bit-diffusion 2022
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise arXiv:2208.09392 Cold-Diffusion-Models 2022
Diffusion-GAN: Training GANs with Diffusion arXiv:2206.02262 Diffusion-GAN 2022
Tackling the Generative Learning Trilemma with Denoising Diffusion GANs ICLR denoising-diffusion-gan 2022
Score-Based Generative Modeling in Latent Space NeurIPS LSGM arXiv. 2021
Compositional Visual Generation with Composable Diffusion Models ECCV Composable-Diffusion arXiv. 2022
Accelerating Score-Based Generative Models with Preconditioned Diffusion Sampling ECCV PDS 2022
Diffusion Autoencoders: Toward a Meaningful and Decodable Representation CVPR diffae 2022
Cascaded Diffusion Models for High Fidelity Image Generation arXiv:2106.15282 2021

Inversion

Title Venue Code Year
ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models ICCV ilvr_adm arXiv. 2021
Diffusion Models Beat GANs on Image Synthesis arXiv:2105.05233 [cs, stat] guided-diffusion 2021
An Image Is Worth One Word: Personalizing Text-to-Image Generation Using Textual Inversion arXiv:2208.01618 textual_inversion 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation arXiv:2208.12242 dreambooth, Dreambooth-Stable-Diffusion 2022
DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation CVPR DiffusionCLIP 2022

Text-to-image

Title Venue Code Year
Cross-Modal Contrastive Learning for Text-to-Image Generation CVPR 2021
Zero-Shot Text-to-Image Generation ICML arXiv. 2021
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance arXiv:2204.08583 2022
Learning Transferable Visual Models From Natural Language Supervision ICML CLIP, open_clip 2021
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models arXiv:2112.10741 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents DALLE2-pytorch 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding imagen-pytorch, Imagen-pytorch 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation parti
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers arXiv:2204.14217 2022
High-Resolution Image Synthesis with Latent Diffusion Models CVPR stable-diffusion, latent-diffusion, stable-diffusion arXiv. 2022
Prompt-to-Prompt Image Editing with Cross Attention Control arXiv:2208.01626 CrossAttentionControl 2022
SINE: SINgle Image Editing with Text-to-Image Diffusion Models arXiv:2212.04489 SINE 2022

Image_to_image

Title Venue Code Year
Palette: Image-to-Image Diffusion Models arXiv:2111.05826 Palette-Image-to-Image-Diffusion-Models 2022
Image Super-Resolution via Iterative Refinement arXiv:2104.07636 Image-Super-Resolution-via-Iterative-Refinement 2021

3D

Title Venue Code Year
RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation arXiv:2211.09869 2022
Magic3D: High-Resolution Text-to-3D Content Creation arXiv:2211.10440 2022

Detection

Title Venue Code Year
DiffusionInst: Diffusion Model for Instance Segmentation arXiv:2212.02773 DiffusionInst 2022

3D & NeRF

Title Venue Code Year
Efficient Ray Tracing of Volume Data ACM Transactions on Graphics 1990
Surface Light Fields for 3D Photography SIGGRAPH 2000
NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections arXiv:2008.02268 [cs] nerfw 2021
Modulated Periodic Activations for Generalizable Local Functional Representations arXiv:2104.03960 [cs] 2021
Neural Volume Rendering: NeRF And Beyond arXiv:2101.05204 [cs] awesome-NeRF 2021
Editing Conditional Radiance Fields arXiv:2105.06466 [cs] editnerf 2021
Recursive-NeRF: An Efficient and Dynamically Growing NeRF arXiv:2105.09103 [cs] 2021
MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo arXiv:2103.15595 [cs] mvsnerf 2021
Depth-Supervised NeRF: Fewer Views and Faster Training for Free arXiv:2107.02791 [cs] 2021
Rethinking Positional Encoding arXiv:2107.02561 [cs] 2021
Nerfies: Deformable Neural Radiance Fields arXiv:2011.12948 nerfies 2020
Self-Calibrating Neural Radiance Fields ICCV 2021
Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering arXiv:2106.02634 [cs] 2021

Sine

Title Venue Code Year Cite
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains arXiv:2006.10739 [cs] 2020
Implicit Neural Representations with Periodic Activation Functions NeurIPS 2020
Modulated Periodic Activations for Generalizable Local Functional Representations arXiv:2104.03960 [cs] 2021
Learned Initializations for Optimizing Coordinate-Based Neural Representations arXiv:2012.02189 [cs] nerf-meta 2021
Seeing Implicit Neural Representations as Fourier Series arXiv:2109.00249 [cs] 2021

INR

Title Venue Code Year Cite
Adversarial Generation of Continuous Images arXiv:2011.12026 [cs] inr-gan 2020
Image Generators with Conditionally-Independent Pixel Synthesis arXiv:2011.13775 [cs] CIPS 2020
A Structured Dictionary Perspective on Implicit Neural Representations arXiv:2112.01917 [cs] 2021

3D & NeRF GANs

Title Venue Code Year Cite
✔️ HoloGAN: Unsupervised Learning of 3D Representations from Natural Images ICCV 2019
BlockGAN: Learning 3D Object-Aware Scene Representations from Unlabelled Images NeurIPS 2020
✔️ GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis arXiv:2007.02442 [cs] 2021
✔️ Pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis arXiv:2012.00926 [cs] pi-GAN 2021 19
✔️ GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields CVPR giraffe 2021
✔️ GIRAFFE HD: A High-Resolution 3D-Aware Generative Model CVPR 2022
❤️ StyleNeRF: A Style-Based 3D-Aware Generator for High-Resolution Image Synthesis arXiv:2110.08985 [cs, stat] 2021
CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields arXiv:2103.17269 [cs] 2021
✔️ GNeRF: GAN-Based Neural Radiance Field without Posed Camera arXiv:2103.15606 [cs] gnerf 2021
❤️ Unconstrained Scene Generation with Locally Conditioned Radiance Fields ICCV ml-gsn 2021
Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering ICCV 2021
✔️ A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis NeurIPS 2021
✔️ Generative Occupancy Fields for 3D Surface-Aware Image Synthesis NeurIPS 2021
✔️ Efficient Geometry-Aware 3D Generative Adversarial Networks arXiv:2112.07945 [cs] eg3d 2021
✔️ 3D-Aware Image Synthesis via Learning Structural and Textural Representations arXiv:2112.10759 [cs] VolumeGAN 2021
✔️ GRAM: Generative Radiance Manifolds for 3D-Aware Image Generation arXiv:2112.08867 [cs] GRAM 2021
CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs CVPR 2022
✔️ Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images CVPR 2022
✔️ Multi-View Consistent Generative Adversarial Networks for 3D-Aware Image Synthesis CVPR MVCGAN 2022
✔️ FENeRF: Face Editing in Neural Radiance Fields CVPR FENeRF 2022
✔️ IDE-3D: Interactive Disentangled Editing for High-Resolution 3D-Aware Portrait Synthesis arXiv:2205.15517 2022
✔️ EpiGRAF: Rethinking Training of 3D GANs arXiv:2206.10535 [cs] epigraf, https://github.com/rethinking-3d-gans/code 2022
❤️ Generative Multiplane Images: Making a 2D GAN 3D-Aware ECCV ml-gmpi arXiv. 2022
GAUDI: A Neural Architect for Immersive 3D Scene Generation arXiv:2207.13751 [cs] ml-gaudi 2022
Deep Deformable 3D Caricatures with Learned Shape Control SIGGRAPH DeepDeformable3DCaricatures 2022
Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis ECCV SURF-GAN arXiv. 2022
Pix2NeRF: Unsupervised Conditional $\pi$-GAN for Single Image to Neural Radiance Fields Translation arXiv:2202.13162 [cs] 2022
Training and Tuning Generative Neural Radiance Fields for Attribute-Conditional 3D-Aware Face Generation TT-GNeRF

Diffusion

Title Venue Code Year Cite
DiffuStereo: High Quality Human Reconstruction via Diffusion-Based Stereo Using Sparse Cameras ECCV DiffuStereo arXiv. 2022
DiffRF: Rendering-Guided 3D Radiance Field Diffusion arXiv:2212.01206 DiffRF 2022

NeRF large scene

Title Venue Code Year Cite
Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs CVPR 2022
Block-NeRF: Scalable Large Scene Neural View Synthesis CVPR BlockNeRFPytorch arXiv. 2022
IBRNet: Learning Multi-View Image-Based Rendering arXiv:2102.13090 [cs] IBRNet 2021

NeRF

Title Venue Code Year Cite
Ray Tracing Volume Densities SIGGRAPH 1984
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis ECCV nerf-pytorch 2020
NeRF--: Neural Radiance Fields Without Known Camera Parameters arXiv:2102.07064 [cs] nerfmm, improved-nerfmm 2021
NeRF++: Analyzing and Improving Neural Radiance Fields arXiv:2010.07492 [cs] nerfplusplus 2020
FastNeRF: High-Fidelity Neural Rendering at 200FPS arXiv:2103.10380 [cs] 2021
KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs ICCV 2021
Plenoxels: Radiance Fields without Neural Networks arXiv:2112.05131 [cs] svox2 2021
Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs arXiv:2112.10703 [cs] mega-nerf 2021
❤️ Neural Sparse Voxel Fields arXiv:2007.11571 [cs] NSVF 2021
Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields ICCV 2021
❤️ Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields arXiv:2111.12077 [cs.CV] 2021
❤️ Neural Actor: Neural Free-View Synthesis of Human Actors with Pose Control arXiv:2106.02019 [cs] 2022
Instant Neural Graphics Primitives with a Multiresolution Hash Encoding instant-ngp
❤️ Point-NeRF: Point-Based Neural Radiance Fields arXiv:2201.08845 [cs] pointnerf 2022
MoFaNeRF: Morphable Facial Neural Radiance Field arXiv:2112.02308 [cs] 2021
Object-Centric Neural Scene Rendering 2020
Semantic View Synthesis 2020
NeRS: Neural Reflectance Surfaces for Sparse-View 3D Reconstruction in the Wild 2021
MINE: Towards Continuous Depth MPI with NeRF for Novel View Synthesis arXiv:2103.14910 [cs] 2021
CodeNeRF: Disentangled Neural Radiance Fields for Object Categories ICCV code-nerf 2021
NeRF-SR: High-Quality Neural Radiance Fields Using Super-Sampling arXiv:2112.01759 [cs] 2021
❤️ TensoRF: Tensorial Radiance Fields arXiv:2203.09517 [cs] TensoRF 2022
Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields arXiv:2203.10821 [cs] 2022
CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields arXiv:2112.05139 [cs] 2022
BARF: Bundle-Adjusting Neural Radiance Fields arXiv:2104.06405 [cs] 2021
Unified Implicit Neural Stylization arXiv:2204.01943 [cs] 2022
SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image arXiv:2204.00928 [cs] 2022
NeRF-Editing: Geometry Editing of Neural Radiance Fields CVPR NeRF-Editing 2022
PixelNeRF: Neural Radiance Fields from One or Few Images CVPR pixel-nerf 2021
Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields CVPR refnerf 2022

3D inversion

Title Venue Code Year Cite
Unsupervised 3D Shape Completion through GAN Inversion CVPR 2021
3D GAN Inversion for Controllable Portrait Image Animation arXiv:2203.13441 [cs] 2022
Pix2NeRF: Unsupervised Conditional $\pi$-GAN for Single Image to Neural Radiance Fields Translation arXiv:2202.13162 [cs] 2022
Monocular 3D Object Reconstruction with GAN Inversion ECCV 2022
INeRF: Inverting Neural Radiance Fields for Pose Estimation IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) inerf 2021
Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion arXiv:2211.11674 nerf-from-image 2022

Dynamic

Title Venue Code Year Cite
Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes arXiv:2011.13084 [cs] Neural-Scene-Flow-Fields 2021
D-NeRF: Neural Radiance Fields for Dynamic Scenes arXiv:2011.13961 [cs] D-NeRF 2020
Dynamic View Synthesis from Dynamic Monocular Video arXiv:2105.06468 [cs] DynamicNeRF 2021
❤️ HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields arXiv:2106.13228 [cs] hypernerf 2021
Neural Radiance Flow for 4D View Synthesis and Video Processing 2020
❤️ Animatable Neural Implicit Surfaces for Creating Avatars from Videos arXiv:2203.08133 [cs] 2022

Voice

Hand

Hair

Loose garment

Title Venue Code Year Cite
✔️ Predicting Loose-Fitting Garment Deformations Using Bone-Driven Motion Networks SIGGRAPH VirtualBones 2022
TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style CVPR TailorNet_dataset arXiv. 2020
Learning Implicit Templates for Point-Based Clothed Human Modeling ECCV 2022
3D Clothed Human Reconstruction in the Wild ECCV ClothWild_RELEASE 2022
❤️ TightCap: 3D Human Shape Capture with Clothing Tightness Field ACM Transactions on Graphics TightCap 2021
ARCH: Animatable Reconstruction of Clothed Humans CVPR ARCH 2020

Rigging

Title Venue Code Year Cite
❤️ Learning Skeletal Articulations with Neural Blend Shapes ACM Transactions on Graphics neural-blend-shapes 2021

Anime Body

Title Venue Code Year
Collaborative Neural Rendering Using Anime Character Sheets arXiv:2207.05378 [cs] CoNR 2022

Body

Title Venue Code Year
SMPL: A Skinned Multi-Person Linear Model ACM Trans. Graphics (Proc. SIGGRAPH Asia) 2015
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image CVPR [SMPL-X] 2019
AMASS: Archive of Motion Capture as Surface Shapes ICCV AMASS 2019
✔️ SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes ICCV 2021
✔️ Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies ICCV animatable_nerf 2021
✔️ Animatable Neural Radiance Fields from Monocular RGB Videos arXiv:2106.13629 [cs] Anim-NeRF 2021
VIBE: Video Inference for Human Body Pose and Shape Estimation CVPR VIBE arXiv. 2020
✔️ A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose NeurIPS A-NeRF arXiv. 2021
HumanNeRF: Free-Viewpoint Rendering of Moving People from Monocular Video CVPR humannerf 2022
❤️ The Power of Points for Modeling Humans in Clothing ICCV POP 2021
❤️ Neural Point-Based Shape Modeling of Humans in Challenging Clothing International Conference on 3D Vision (3DV) SkiRT 2022
StylePeople: A Generative Model of Fullbody Human Avatars arXiv:2104.08363 [cs] 2021
NPMs: Neural Parametric Models for 3D Deformable Shapes arXiv:2104.00702 [cs] 2021
❤️ ICON: Implicit Clothed Humans Obtained from Normals arXiv:2112.09127 [cs] ICON 2022
❤️ GDNA: Towards Generative Detailed Neural Avatars CVPR 2022
SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks CVPR 2021
NeuralAnnot: Neural Annotator for 3D Human Mesh Training Sets arXiv:2011.11232 [cs] 2022
❤️ PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop ICCV 2021
❤️ Structured Local Radiance Fields for Human Avatar Modeling CVPR arXiv. 2022
✔️ SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video CVPR SelfRecon arXiv. 2022
ARAH: Animatable Volume Rendering of Articulated Human SDFs ECCV arah 2022
Neural Actor: Neural Free-View Synthesis of Human Actors with Pose Control SIGGRAPH Asia Neural_Actor_Main_Code arXiv. 2021
❤️ Generalizable Neural Performer: Learning Robust Radiance Fields for Human Novel View Synthesis arXiv:2204.11798 [cs] gnr 2022
❤️ NeuMan: Neural Human Radiance Field from a Single Video ECCV ml-neuman 2022
Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis CVPR surface-aligned-nerf arXiv. 2022
LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling ECCV LoRD 2022
TAVA: Template-Free Animatable Volumetric Actors ECCV tava 2022
Fast-SNARF: A Fast Deformer for Articulated Neural Fields fast-snarf 2022
InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds arXiv:2212.10550 InstantAvatar 2022

Body Generation

Title Venue Code Year
DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations CVPR DeepFashion 2016
Text2Human: Text-Driven Controllable Human Image Generation ACM Transactions on Graphics (TOG) Text2Human 2022
StyleGAN-Human: A Data-Centric Odyssey of Human Generation arXiv:2204.11823 [cs] 2022
✔️ 3D-Aware Semantic-Guided Generative Model for Human Synthesis arXiv:2112.01422 [cs] 2021
❤️ InsetGAN for Full-Body Image Generation arXiv:2203.07293 [cs] 2022
Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis ICCV impersonator 2019
✔️ SMPLpix: Neural Avatars from 3D Human Models Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision smplpix arXiv. 2021
Neural Articulated Radiance Field ICCV 2021
Unsupervised Learning of Efficient Geometry-Aware Neural Articulated Representations ECCV ENARF-GAN 2022
Generative Neural Articulated Radiance Fields arXiv:2206.14314 [cs] gnarf 2022
AvatarGen: A 3D Generative Model for Animatable Human Avatars arXiv:2208.00561 [cs] AvatarGen 2022
EVA3D: Compositional 3D Human Generation from 2D Image Collections arXiv:2210.04888 2022

Body from video

Title Venue Code Year
SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video arXiv:2201.12792 [cs] 2022

3DMM Face

Title Venue Code Year
Neural Head Reenactment with Latent Pose Descriptors CVPR latent-pose-reenactment 2020
Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry arXiv:2110.09772 [cs] 2021
REALY: Rethinking the Evaluation of 3D Face Reconstruction ECCV REALY 2022

3D FACE Avatars

Title Venue Code Year
A Morphable Model for the Synthesis of 3D Faces Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques SIGGRAPH ’99, USA: ACM Press/Addison-Wesley Publishing Co. 1999
Learning a Model of Facial Shape and Expression from 4D Scans ACM Transactions on Graphics [FLAME] 2017
❤️ FLAME-in-NeRF : Neural Control of Radiance Fields for Free View Face Animation arXiv:2108.04913 [cs] 2021
❤️ EMOCA: Emotion Driven Monocular Face Capture and Animation CVPR emoca 2022
FaceVerse: A Fine-Grained and Detail-Controllable 3D Face Morphable Model from a Hybrid Dataset CVPR 2022
I M Avatar: Implicit Morphable Head Avatars from Videos CVPR IMavatar 2022
✔️ Neural Head Avatars from Monocular RGB Videos arXiv:2112.01554 [cs] neural-head-avatars 2022
PVA: Pixel-Aligned Volumetric Avatars arXiv:2101.02697 [cs] 2021
AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis arXiv:2103.11078 [cs] 2021
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation arXiv:2201.07786 [cs, eess] 2022
HeadGAN: One-Shot Neural Head Synthesis and Editing arXiv:2012.08261 [cs] 2021
KeypointNeRF: Generalizing Image-Based Volumetric Avatars Using Relative Spatial Encoding of Keypoints arXiv:2205.04992 [cs] 2022
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set arXiv:1903.08527 [cs] Deep3DFaceRecon_pytorch 2020

Stylization

Title Venue Code Year
Unified Implicit Neural Stylization ECCV arXiv. 2022
ARF: Artistic Radiance Fields ECCV ARF-svox2 arXiv. 2022
UPST-NeRF: Universal Photorealistic Style Transfer of Neural Radiance Fields for 3D Scene arXiv:2208.07059 UPST-NeRF 2022

Face Style

Title Venue Code Year
Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer arXiv:2203.13248 [cs] DualStyleGAN 2022
Stitch It in Time: GAN-Based Facial Editing of Real Videos arXiv. STIT 2022
Fix the Noise: Disentangling Source Feature for Transfer Learning of StyleGAN arXiv:2204.14079 [cs] FixNoise 2022
AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment ECCV AnimeCeleb arXiv. 2022
DCT-Net: Domain-Calibrated Translation for Portrait Stylization ACM Transactions on Graphics DCT-Net 2022
VToonify: Controllable High-Resolution Portrait Video Style Transfer ACM Transactions on Graphics (TOG) VToonify n.d.
BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation NeurIPS BlendGAN 2021
Unpaired Cartoon Image Synthesis via Gated Cycle Mapping CVPR 2022

Face Animation

Title Venue Code Year
Thin-Plate Spline Motion Model for Image Animation CVPR 2022
Depth-Aware Generative Adversarial Network for Talking Head Video Generation CVPR DaGAN arXiv. 2022

Renderer & Regularization

Title Venue Code Year
Implicit Geometric Regularization for Learning Shapes ICML [Eikonal] 2020
Neural 3D Scene Reconstruction with the Manhattan-World Assumption CVPR manhattan_sdf 2022
❤️ Differentiable Signed Distance Function Rendering Transactions on Graphics (Proceedings of SIGGRAPH) sdf 2022
NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-View Reconstruction NeuS 2021
SNeS: Learning Probably Symmetric Neural Surfaces from Incomplete Data ECCV snes 2022
❤️ Volume Rendering of Neural Implicit Surfaces arXiv:2106.12052 [cs] volsdf 2021
Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance NeurIPS idr 2020
✔️ Multi-View Mesh Reconstruction With Neural Deferred Shading CVPR neural-deferred-shading 2022
❤️ IRON: Inverse Rendering by Optimizing Neural SDFs and Materials from Photometric Images CVPR IRON arXiv. 2022
✔️ UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction ICCV unisurf 2021
MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction arXiv:2206.00665 2022
Direct Voxel Grid Optimization: Super-Fast Convergence for Radiance Fields Reconstruction CVPR DirectVoxGO arXiv. 2022
Improved Direct Voxel Grid Optimization for Radiance Fields Reconstruction arXiv:2206.05085 [cs] 2022
Improved Surface Reconstruction Using High-Frequency Details arXiv:2206.07850 [cs] 2022
InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering CVPR InfoNeRF arXiv. 2022
Improving Neural Implicit Surfaces Geometry with Patch Warping CVPR NeuralWarp arXiv. 2022
SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views ECCV SparseNeuS arXiv. 2022
NeuMesh: Learning Disentangled Neural Mesh-Based Implicit Field for Geometry and Texture Editing ECCV NeuMesh 2022
Neural Density-Distance Fields ECCV neddf arXiv. 2022
Neural 3D Reconstruction in the Wild SIGGRAPH NeuralRecon-W 2022
❤️ KeypointNeRF: Generalizing Image-Based Volumetric Avatars Using Relative Spatial Encoding of Keypoints arXiv:2205.04992 [cs] KeypointNeRF 2022
GO-Surf: Neural Feature Grid Optimization for Fast, High-Fidelity RGB-D Surface Reconstruction International Conference on 3D Vision (3DV) go-surf 2022

Material and lighting

Title Venue Code Year
NeILF: Neural Incident Light Field for Physically-Based Material Estimation ECCV neilf arXiv. 2022
NeRF for Outdoor Scene Relighting ECCV NeRF-OSR 2022

Motion

Title Venue Code Year
GANimator: Neural Motion Synthesis from a Single Sequence ACM Transactions on Graphics (TOG) ganimator 2022
Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated Objects CVPR watch-it-move 2022
Learn to Dance with AIST++: Music Conditioned 3D Dance Generation ICCV 2021
Talking Head(?) Anime from a Single Image 3: Now the Body Too talking-head-anime 2022
PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time ACM Transactions on Graphics 2020
The Wanderings of Odysseus in 3D Scenes CVPR GAMMA arXiv. 2022
Adversarial Parametric Pose Prior CVPR adv_param_pose_prior arXiv. 2022
AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars SIGGRAPH AvatarCLIP 2022
SOMA: Solving Optical Marker-Based MoCap Automatically ICCV soma 2021
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model arXiv:2208.15001 MotionDiffuse 2022
TEACH: Temporal Action Composition for 3D Humans International Conference on 3D Vision (3DV) teach arXiv. 2022
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts ECCV TM2T 2022

Shape generation

Title Venue Code Year
Learning Implicit Fields for Generative Shape Modeling arXiv:1812.02822 [cs] 2019

SMPL estimation

Title Venue Code Year
End-to-End Recovery of Human Shape and Pose CVPR [hmr] arXiv. 2018
VIBE: Video Inference for Human Body Pose and Shape Estimation CVPR VIBE arXiv. 2020
TransPose: Real-Time 3D Human Translation and Pose Estimation with Six Inertial Sensors ACM Transactions on Graphics TransPose 2021
Monocular Expressive Body Regression through Body-Driven Attention European Conference on Computer Vision (ECCV) expose 2020
Human Mesh Recovery from Multiple Shots CVPR multishot arXiv. 2022
❤️ Learned Vertex Descent: A New Direction for 3D Human Model Fitting ECCV LVD arXiv. 2022
DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation ECCV DeciWatch arXiv. 2022
PARE: Part Attention Regressor for 3D Human Body Estimation ICCV PARE arXiv. 2021
Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers ECCV FastMETRO 2022

Segmentation

Title Venue Code Year
Real-Time High-Resolution Background Matting arXiv:2012.07810 BackgroundMattingV2 2020
Robust High-Resolution Video Matting with Temporal Guidance arXiv:2108.11515 [cs] RobustVideoMatting 2021

Datasets

Title Venue Code Year
Structured Local Radiance Fields for Human Avatar Modeling CVPR THUman4.0-Dataset 2022
Multiface: A Dataset for Neural Face Rendering arXiv:2207.11243 [cs.CV] multiface 2022
ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations CVPR ImFace 2022

FLAME estimation

Title Venue Code Year
Towards Metrical Reconstruction of Human Faces ECCV MICA arXiv. 2022

Dog estimation

Title Venue Code Year
BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information CVPR barc_release 2022

Panoptic

Title Venue Code Year
Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation arXiv:2203.15224 [cs] PanopticNeRF 2022

SDF

Title Venue Code Year
DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation arXiv:1901.05103 [cs] DeepSDF 2019
Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling NeurIPS 2016
Occupancy Networks: Learning 3D Reconstruction in Function Space arXiv:1812.03828 [cs] 2019
PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization arXiv:1905.05172 [cs] 2019
Deep Meta Functionals for Shape Representation arXiv:1908.06277 [cs] 2019

3D

Title Venue Code Year
Escaping Plato’s Cave: 3D Shape From Adversarial Rendering ICCV 2019
StyleRig: Rigging StyleGAN for 3D Control over Portrait Images arXiv:2004.00121 [cs] 2020
Exemplar-Based 3D Portrait Stylization arXiv:2104.14559 [cs] github 2021
❤️ Landmark Detection and 3D Face Reconstruction for Caricature Using a Nonlinear Parametric Model arXiv:2004.09190 [cs] CaricatureFace 2021
SofGAN: A Portrait Image Generator with Dynamic Styling arXiv:2007.03780 [cs] sofgan 2021
❤️ FreeStyleGAN: Free-View Editable Portrait Rendering with the Camera Manifold arXiv:2109.09378 [cs] 2021
PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering ICCV PIRender 2021

Point Cloud

Title Venue Code Year
Point-Based Modeling of Human Clothing ICCV 2021
ADOP: Approximate Differentiable One-Pixel Point Rendering arXiv:2110.06635 [cs] 2021

Stylization

Title Venue Code Year
Learning to Stylize Novel Views arXiv:2105.13509 [cs] stylescene 2021

Datasets

Title Venue Code Year
Common Objects in 3D: Large-Scale Learning and Evaluation of Real-Life 3D Category Reconstruction ICCV 2021
A 3D Face Model for Pose and Illumination Invariant Face Recognition IEEE International Conference on Advanced Video and Signal Based Surveillance BFM 2009
SfSNet: Learning Shape, Reflectance and Illuminance of Faces in the Wild arXiv:1712.01261 [cs] 2018

3D-aware image synthesis (ref)

Title Venue Code Year
Visual Object Networks: Image Generation with Disentangled 3D Representation arXiv:1812.02725 [cs, stat] 2018
Escaping Plato’s Cave: 3D Shape From Adversarial Rendering ICCV 2019
HoloGAN: Unsupervised Learning of 3D Representations from Natural Images ICCV 2019

Face

Tools

Edit

Title Venue Code Year
FaceEraser: Removing Facial Parts for Augmented Reality arXiv:2109.10760 [cs] 2021
DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editing arXiv:2109.10737 [cs] 2021
❤️ StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators arXiv:2108.00946 [cs] 2021
Beholder-GAN: Generation and Beautification of Facial Images with Conditioning on Their Beauty Level arXiv:1902.02593 [cs] 2019
Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks arXiv:2110.08398 [cs] 2021
Fine-Grained Control of Artistic Styles in Image Generation arXiv:2110.10278 [cs] 2021

Anime Face

Title Venue Code Year
AniGAN: Style-Guided Generative Adversarial Networks for Unsupervised Anime Face Generation arXiv:2102.12593 [cs] 2021
AnimeGAN: A Novel Lightweight GAN for Photo Animation AnimeGANv2 2020
❤️ Learning to Cartoonize Using White-Box Cartoon Representations CVPR White-box-Cartoonization 2020
Generative Adversarial Networks for Photo to Hayao Miyazaki Style Cartoons arXiv:2005.07702 [cs, eess] 2020

3DMM

Title Venue Code Year
A Morphable Model for the Synthesis of 3D Faces Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques [3DMM] SIGGRAPH ’99, USA: ACM Press/Addison-Wesley Publishing Co. 1999

Face

Title Venue Code Year
SketchHairSalon: Deep Sketch-Based Hair Image Synthesis arXiv:2109.07874 [cs] 2021

Face Alignment

Title Venue Code Year
Face Alignment Across Large Poses: A 3D Solution IEEE Transactions on Pattern Analysis and Machine Intelligence 2019

Face Recognition

Title Venue Code Year
High-Fidelity Pose and Expression Normalization for Face Recognition in the Wild CVPR 2015

Face swapping

3D

Title Venue Code Year
Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild arXiv:1911.11130 [cs] unsup3d 2020
Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs arXiv:2011.00844 [cs] GAN2Shape 2021
A Geometric Analysis of Deep Generative Image Models and Its Applications ICLR 2021
Lifting 2D StyleGAN for 3D-Aware Face Generation CVPR LiftedGAN 2021
Image GANs Meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering arXiv:2010.09125 [cs] 2021
Neural 3D Mesh Renderer CVPR 2018
Fast-GANFIT: Generative Adversarial Network for High Fidelity 3D Face Reconstruction arXiv:2105.07474 [cs] 2021
Inverting Generative Adversarial Renderer for Face Reconstruction CVPR StyleRenderer 2021
Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection arXiv:2106.07852 [cs] 2021
Subdivision-Based Mesh Convolution Networks arXiv:2106.02285 [cs] 2021
Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection CVPR 2021
To Fit or Not to Fit: Model-Based Face Reconstruction and Occlusion Segmentation from Weak Supervision arXiv:2106.09614 [cs] 2021
Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks arXiv:2106.13041 [cs, eess, stat] 2021
DOVE: Learning Deformable 3D Objects by Watching Videos arXiv:2107.10844 [cs] 2021
De-Rendering the World’s Revolutionary Artefacts CVPR 2021
Learning Generative Models of Textured 3D Meshes from Real-World Images ICCV 2021
Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images ICCV 2021

DA

Title Venue Code Year
Semi-Supervised Domain Adaptation via Adaptive and Progressive Feature Alignment arXiv:2106.02845 [cs] 2021
Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation arXiv:2101.10979 [cs] 2021

Data

Title Venue Code Year
Semi-Supervised Active Learning with Temporal Output Discrepancy ICCV 2021
❤️ Mean Teachers Are Better Role Models: Weight-Averaged Consistency Targets Improve Semi-Supervised Deep Learning Results NeurIPS 2017
When Deep Learners Change Their Mind: Learning Dynamics for Active Learning arXiv:2107.14707 [cs] 2021
On The State of Data In Computer Vision: Human Annotations Remain Indispensable for Developing Deep Learning Models arXiv:2108.00114 [cs] 2021
StyleAugment: Learning Texture De-Biased Representations by Style Augmentation without Pre-Defined Textures arXiv:2108.10549 [cs] 2021
Multi-Task Self-Training for Learning General Representations ICCV 2021
OOWL500: Overcoming Dataset Collection Bias in the Wild arXiv:2108.10992 [cs] 2021
Ghost Loss to Question the Reliability of Training Data IEEE Access 2020
Revisiting 3D ResNets for Video Recognition arXiv:2109.01696 [cs, eess] 2021
❤️ Revisiting ResNets: Improved Training and Scaling Strategies arXiv:2103.07579 [cs] 2021
Learning Fast Sample Re-Weighting Without Reward Data ICCV 2021
How Important Is Importance Sampling for Deep Budgeted Training? arXiv:2110.14283 [cs] 2021

CNN & BN

Light architecture

Title Venue Code Year
Network Augmentation for Tiny Deep Learning arXiv:2110.08890 [cs] 2021
Non-Deep Networks arXiv:2110.07641 [cs] 2021
When to Prune? A Policy towards Early Structural Pruning arXiv:2110.12007 [cs] 2021
❤️ ConformalLayers: A Non-Linear Sequential Neural Network with Associative Layers arXiv:2110.12108 [cs] 2021
CHIP: CHannel Independence-Based Pruning for Compact Neural Networks arXiv:2110.13981 [cs] 2021
Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training arXiv:2102.02887 [cs] 2021

Antialiased CNNs

Title Venue Code Year
Making Convolutional Networks Shift-Invariant Again arXiv:1904.11486 [cs] 2019
Group Equivariant Convolutional Networks ICML arXiv. 2016
Harmonic Networks: Deep Translation and Rotation Equivariance CVPR arXiv. 2017
Learning Steerable Filters for Rotation Equivariant CNNs CVPR arXiv. 2018

Architecture

Title Venue Code Year
Beyond BatchNorm: Towards a General Understanding of Normalization in Deep Learning arXiv:2106.05956 [cs] 2021
R-Drop: Regularized Dropout for Neural Networks arXiv:2106.14448 [cs] 2021
Switchable Whitening for Deep Representation Learning ICCV 2019
Positional Normalization arXiv:1907.04312 [cs] 2019
On Feature Normalization and Data Augmentation arXiv:2002.11102 [cs, stat] 2021
Channel Equilibrium Networks for Learning Deep Representation arXiv:2003.00214 [cs] 2020
Representative Batch Normalization with Feature Calibration CVPR 2021
EPSANet: An Efficient Pyramid Squeeze Attention Block on Convolutional Neural Network arXiv:2105.14447 [cs] 2021
Bias Loss for Mobile Neural Networks arXiv:2107.11170 [cs] 2021
Compositional Models: Multi-Task Learning and Knowledge Transfer with Modular Networks arXiv:2107.10963 [cs] 2021
Log-Polar Space Convolution for Convolutional Neural Networks arXiv:2107.11943 [cs] 2021
Decoupled Dynamic Filter Networks arXiv:2104.14107 [cs] 2021
Spectral Leakage and Rethinking the Kernel Size in CNNs arXiv:2101.10143 [cs] 2021
Learning with Noisy Labels via Sparse Regularization ICCV 2021
❤️ Impact of Aliasing on Generalization in Deep Convolutional Networks ICCV 2021
Orthogonal Over-Parameterized Training CVPR 2021
Multiplying Matrices Without Multiplying ICML 2021
AASeg: Attention Aware Network for Real Time Semantic Segmentation arXiv:2108.04349 [cs, eess] 2021
MicroNet: Improving Image Recognition with Extremely Low FLOPs ICCV 2021
Contextual Convolutional Neural Networks arXiv:2108.07387 [cs] 2021
Torch.manual_seed(3407) Is All You Need: On the Influence of Random Seeds in Deep Learning Architectures for Computer Vision arXiv:2109.08203 [cs] 2021
KATANA: Simple Post-Training Robustness Using Test Time Augmentations arXiv:2109.08191 [cs] 2021
Global Pooling, More than Meets the Eye: Position Information Is Encoded Channel-Wise in CNNs ICCV 2021
A ConvNet for the 2020s arXiv:2201.03545 [cs] ConvNeXt 2022

Compression

Title Venue Code Year
AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance arXiv:2109.06397 [cs] 2021

Detection

Title Venue Code Year
Anchor DETR: Query Design for Transformer-Based Detector arXiv:2109.07107 [cs] 2021
❤️ Detecting Twenty-Thousand Classes Using Image-Level Supervision arXiv:2201.02605 [cs] 2022

Segmentation

Title Venue Code Year
Robust High-Resolution Video Matting with Temporal Guidance arXiv:2108.11515 [cs.CV] 2021

MLP

Title Venue Code Year
ResMLP: Feedforward Networks for Image Classification with Data-Efficient Training arXiv:2105.03404 [cs] 2021
ConvMLP: Hierarchical Convolutional MLPs for Vision arXiv:2109.04454 [cs] 2021
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP arXiv:2108.13002 [cs.CV] 2021
Sparse-MLP: A Fully-MLP Architecture with Conditional Computation arXiv:2109.02008 [cs] 2021
MLP-Mixer: An All-MLP Architecture for Vision 2021
CycleMLP: A MLP-like Architecture for Dense Prediction ICLR 2022

Transformer

Title Venue Code Year
Training Data-Efficient Image Transformers & Distillation through Attention arXiv:2012.12877 [cs] deit 2020
Intriguing Properties of Vision Transformers arXiv:2105.10497 [cs] 2021
CogView: Mastering Text-to-Image Generation via Transformers arXiv:2105.13290 [cs] 2021
An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale arXiv:2010.11929 [cs] 2021
Scaling Vision Transformers arXiv:2106.04560 [cs] 2021
IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers arXiv:2106.12620 [cs] 2021
Rethinking and Improving Relative Position Encoding for Vision Transformer ICCV 2021
Go Wider Instead of Deeper arXiv:2107.11817 [cs] 2021
A Unified Efficient Pyramid Transformer for Semantic Segmentation arXiv:2107.14209 [cs] 2021
❤️ Conditional DETR for Fast Training Convergence ICCV 2021
❤️ Sketch Your Own GAN ICCV 2021
CrossFormer: A Versatile Vision Transformer Based on Cross-Scale Attention arXiv:2108.00154 [cs] 2021
Uformer: A General U-Shaped Transformer for Image Restoration arXiv:2106.03106 [cs] 2021
ConvNets vs. Transformers: Whose Visual Representations Are More Transferable? arXiv:2108.05305 [cs] 2021
Mobile-Former: Bridging MobileNet and Transformer arXiv:2108.05895 [cs] 2021
SOTR: Segmenting Objects with Transformers ICCV 2021
Video Transformer Network arXiv:2102.00719 [cs] 2021
Do Vision Transformers See Like Convolutional Neural Networks? arXiv:2108.08810 [cs, stat] 2021
UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-Wise Perspective with Transformer arXiv:2109.04335 [cs, eess] 2021
$\infty$-Former: Infinite Memory Transformer arXiv:2109.00301 [cs] 2021
PnP-DETR: Towards Efficient Visual Analysis with Transformers ICCV 2021
MobileViT: Light-Weight, General-Purpose, and Mobile-Friendly Vision Transformer arXiv:2110.02178 [cs] 2021
MetaFormer Is Actually What You Need for Vision arXiv:2111.11418 [cs] 2021
Restormer: Efficient Transformer for High-Resolution Image Restoration arXiv:2111.09881 [cs] Restormer 2021
An Empirical Study of Training Self-Supervised Vision Transformers arXiv:2104.02057 [cs] 2021
When Vision Transformers Outperform ResNets without Pre-Training or Strong Data Augmentations arXiv:2106.01548 [cs.CV] 2021
Visual Attention Network arXiv:2202.09741 [cs] 2022

SSL

Title Venue Code Year
Emerging Properties in Self-Supervised Vision Transformers arXiv:2104.14294 [cs] dino 2021
What Is Considered Complete for Visual Recognition? arXiv:2105.13978 [cs] 2021
On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals arXiv:2107.14762 [cs] 2021
❤️ Improving Contrastive Learning by Visualizing Feature Transformation ICCV 2021
Scale Efficiently: Insights from Pre-Training and Fine-Tuning Transformers arXiv:2109.10686 [cs] 2021
FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling arXiv:2110.08263 [cs] 2021
BEiT: BERT Pre-Training of Image Transformers arXiv:2106.08254 [cs] 2021
❤️ Parametric Contrastive Learning ICCV 2021
❤️ ImageNet-21K Pretraining for the Masses NeurIPS ImageNet21K 2021
❤️ ML-Decoder: Scalable and Versatile Classification Head arXiv:2111.12933 [cs] ML_Decoder 2021
Asymmetric Loss For Multi-Label Classification ICCV ASL 2021
Grounded Language-Image Pre-Training arXiv:2112.03857 [cs] 2021

Finetune

Title Venue Code Year
❤️ How Transferable Are Features in Deep Neural Networks? arXiv:1411.1792 [cs] 2014

Positional Encoding

Title Venue Code Year
Positional Encoding as Spatial Inductive Bias in GANs arXiv:2012.05217 [cs] 2020
Mind the Pad -- CNNs Can Develop Blind Spots arXiv:2010.02178 [cs, stat] 2020
❤️ How Much Position Information Do Convolutional Neural Networks Encode? ICLR 2020
On Translation Invariance in CNNs: Convolutional Layers Can Exploit Absolute Spatial Location CVPR 2020
Rethinking and Improving Relative Position Encoding for Vision Transformer ICCV 2021
A Structured Dictionary Perspective on Implicit Neural Representations arXiv:2112.01917 [cs] 2021

NAS

NAS cls

Title Venue Code Year
Neural Architecture Search with Reinforcement Learning ICLR 2017
Learning Transferable Architectures for Scalable Image Recognition CVPR 2018
Progressive Neural Architecture Search ECCV 2018
Efficient Neural Architecture Search via Parameter Sharing ICML 2018
MnasNet: Platform-Aware Neural Architecture Search for Mobile CVPR 2019
DARTS: Differentiable Architecture Search ICLR 2019

NAS GAN

Title Venue Code Year
AlphaGAN: Fully Differentiable Architecture Search for Generative Adversarial Networks IEEE Transactions on Pattern Analysis and Machine Intelligence 2021
GAN Compression: Efficient Architectures for Interactive Conditional GANs CVPR 2020
Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search ECCV 2020
AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks ICML 2020
A Multi-Objective Architecture Search for Generative Adversarial Networks 2020
AutoGAN: Neural Architecture Search for Generative Adversarial Networks ICCV 2019

Low-level

Super-resolution

Frame Interpolation

Title Venue Code Year
FILM: Frame Interpolation for Large Motion arXiv:2202.04901 [cs] 2022

Denoising

Title Venue Code Year
Image Denoising by Sparse 3-D Transform-Domain Collaborative Filtering IEEE Transactions on Image Processing 2007
Towards Flexible Blind JPEG Artifacts Removal arXiv:2109.14573 [cs, eess] FBCNN 2021

Scholar
