Awesome-CVPR2023-Low-Level-Vision

A Collection of Papers and Codes in CVPR2023 related to Low-Level Vision

[Completed] If you find missing papers or typos, feel free to open an issue or pull request.

Related collections for low-level vision

Overview

Image Restoration

Efficient and Explicit Modelling of Image Hierarchies for Image Restoration

Comprehensive and Delicate: An Efficient Transformer for Image Restoration

Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective

Generative Diffusion Prior for Unified Image Restoration and Enhancement

DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration

Bitstream-Corrupted JPEG Images are Restorable: Two-stage Compensation and Alignment Framework for Image Restoration

All-in-One Image Restoration for Unknown Degradations Using Adaptive Discriminative Filters for Specific Degradations

Learning Weather-General and Weather-Specific Features for Image Restoration Under Multiple Adverse Weather Conditions

AccelIR: Task-Aware Image Compression for Accelerating Neural Restoration

Robust Unsupervised StyleGAN Image Restoration

Ingredient-Oriented Multi-Degradation Learning for Image Restoration

Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank

Nighttime Smartphone Reflective Flare Removal Using Optical Center Symmetry Prior

Robust Single Image Reflection Removal Against Adversarial Attacks

ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal

Document Image Shadow Removal Guided by Color-Aware Background

Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera

GamutMLP: A Lightweight MLP for Color Loss Recovery

ABCD: Arbitrary Bitwise Coefficient for De-Quantization

Visual Recognition-Driven Image Restoration for Multiple Degradation With Intrinsic Semantics Recovery

Parallel Diffusion Models of Operator and Image for Blind Inverse Problems

Image Reconstruction

Raw Image Reconstruction with Learned Compact Metadata

High-resolution image reconstruction with latent diffusion models from human brain activity

Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

Optimization-Inspired Cross-Attention Transformer for Compressive Sensing

Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding

Burst Restoration

Burstormer: Burst Image Restoration and Enhancement Transformer

Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement

Video Restoration

A Simple Baseline for Video Restoration with Grouped Spatial-temporal Shift

HNeRV: A Hybrid Neural Representation for Videos

Blind Video Deflickering by Neural Filtering with a Flawed Atlas

[Back-to-Overview]

Super Resolution

Image Super Resolution

Activating More Pixels in Image Super-Resolution Transformer

N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution

Omni Aggregation Networks for Lightweight Image Super-Resolution

OPE-SR: Orthogonal Position Encoding for Designing a Parameter-free Upsampling Module in Arbitrary-scale Image Super-Resolution

Local Implicit Normalizing Flow for Arbitrary-Scale Image Super-Resolution

Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution

Deep Arbitrary-Scale Image Super-Resolution via Scale-Equivariance Pursuit

CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution

Super-Resolution Neural Operator

Human Guided Ground-truth Generation for Realistic Image Super-resolution

Better "CMOS" Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution

Implicit Diffusion Models for Continuous Super-Resolution

CABM: Content-Aware Bit Mapping for Single Image Super-Resolution Network with Large Input

Spectral Bayesian Uncertainty for Image Super-Resolution

Cross-Guided Optimization of Radiance Fields With Multi-View Image Super-Resolution for High-Resolution Novel View Synthesis

Image Super-Resolution Using T-Tetromino Pixels

Memory-Friendly Scalable Super-Resolution via Rewinding Lottery Ticket Hypothesis

Equivalent Transformation and Dual Stream Network Construction for Mobile Image Super-Resolution

Perception-Oriented Single Image Super-Resolution using Optimal Objective Estimation

OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer

B-Spline Texture Coefficients Estimator for Screen Content Image Super-Resolution

Spatial-Frequency Mutual Learning for Face Super-Resolution

Learning Generative Structure Prior for Blind Text Image Super-resolution

Guided Depth Super-Resolution by Deep Anisotropic Diffusion

Toward Stable, Interpretable, and Lightweight Hyperspectral Super-Resolution

Zero-Shot Dual-Lens Super-Resolution

Probability-based Global Cross-modal Upsampling for Pansharpening

CutMIB: Boosting Light Field Super-Resolution via Multi-View Image Blending

Quantum Annealing for Single Image Super-Resolution

Bicubic++: Slim, Slimmer, Slimmest -- Designing an Industry-Grade Super-Resolution Network

Hybrid Transformer and CNN Attention Network for Stereo Image Super-resolution

Video Super Resolution

Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting

Structured Sparsity Learning for Efficient Video Super-Resolution

Compression-Aware Video Super-Resolution

Learning Spatial-Temporal Implicit Neural Representations for Event-Guided Video Super-Resolution

Consistent Direct Time-of-Flight Video Depth Super-Resolution

[Back-to-Overview]

Image Rescaling

HyperThumbnail: Real-time 6K Image Rescaling with Rate-distortion Optimization

DINN360: Deformable Invertible Neural Network for Latitude-Aware 360° Image Rescaling

[Back-to-Overview]

Denoising

Image Denoising

Masked Image Training for Generalizable Deep Image Denoising

Spatially Adaptive Self-Supervised Learning for Real-World Image Denoising

LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising

Real-time Controllable Denoising for Image and Video

Zero-Shot Noise2Noise: Efficient Image Denoising without any Data

Patch-Craft Self-Supervised Training for Correlated Image Denoising

sRGB Real Noise Synthesizing with Neighboring Correlation-Aware Noise Model

Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising

Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations

Structure Aggregation for Cross-Spectral Stereo Image Guided Denoising

Polarized Color Image Denoising

[Back-to-Overview]

Deblurring

Image Deblurring

Structured Kernel Estimation for Photon-Limited Deconvolution

Blur Interpolation Transformer for Real-World Motion from Blur

Neumann Network with Recursive Kernels for Single Image Defocus Deblurring

Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring

Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur

Self-Supervised Non-Uniform Kernel Estimation With Flow-Based Motion Prior for Blind Image Deblurring

Uncertainty-Aware Unsupervised Image Deblurring with Deep Residual Prior

K3DN: Disparity-Aware Kernel Estimation for Dual-Pixel Defocus Deblurring

Self-Supervised Blind Motion Deblurring With Deep Expectation Maximization

HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering

Video Deblurring

Deep Discriminative Spatial and Temporal Network for Efficient Video Deblurring

[Back-to-Overview]

Deraining

Learning A Sparse Transformer Network for Effective Image Deraining

SmartAssign: Learning a Smart Knowledge Assignment Strategy for Deraining and Desnowing

[Back-to-Overview]

Dehazing

RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors

Curricular Contrastive Regularization for Physics-aware Single Image Dehazing

Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior

SCANet: Self-Paced Semi-Curricular Attention Network for Non-Homogeneous Image Dehazing

Streamlined Global and Local Features Combinator (SGLC) for High Resolution Image Dehazing

[Back-to-Overview]

HDR Imaging / Multi-Exposure Image Fusion

Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models

SMAE: Few-shot Learning for HDR Deghosting with Saturation-Aware Masked Autoencoders

A Unified HDR Imaging Method with Pixel and Patch Level

Inverting the Imaging Process by Learning an Implicit Camera Model

Joint HDR Denoising and Fusion: A Real-World Mobile HDR Image Dataset

HDR Imaging with Spatially Varying Signal-to-Noise Ratios

1000 FPS HDR Video with a Spike-RGB Hybrid Camera

[Back-to-Overview]

Frame Interpolation

Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation

A Unified Pyramid Recurrent Network for Video Frame Interpolation

BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation

Exploring Discontinuity for Video Frame Interpolation

Frame Interpolation Transformer and Uncertainty Guidance

Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation

Range-Nullspace Video Frame Interpolation With Focalized Motion Estimation

Event-based Video Frame Interpolation with Cross-Modal Asymmetric Bidirectional Motion Fields

Event-based Blurry Frame Interpolation under Blind Exposure

Event-Based Frame Interpolation with Ad-hoc Deblurring

Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time

[Back-to-Overview]

Image Enhancement

Realistic Saliency Guided Image Enhancement

Low-Light Image Enhancement

Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement

Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark

DNF: Decouple and Feedback Network for Seeing in the Dark

You Do Not Need Additional Priors or Regularizers in Retinex-Based Low-Light Image Enhancement

Low-Light Image Enhancement via Structure Modeling and Guidance

Learning a Simple Low-light Image Enhancer from Paired Low-light Instances

[Back-to-Overview]

Image Harmonization/Composition

LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization

Semi-supervised Parametric Real-world Image Harmonization

PCT-Net: Full Resolution Image Harmonization Using Pixel-Wise Color Transformations

ObjectStitch: Object Compositing With Diffusion Model

[Back-to-Overview]

Image Completion/Inpainting

NUWA-LIP: Language-Guided Image Inpainting With Defect-Free VQGAN

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model

Semi-Supervised Video Inpainting with Cycle Consistency Constraints

Deep Stereo Video Inpainting

[Back-to-Overview]

Image Matting

Referring Image Matting

Adaptive Human Matting for Dynamic Videos

Mask-Guided Matting in the Wild

End-to-End Video Matting With Trimap Propagation

Ultrahigh Resolution Image/Video Matting With Spatio-Temporal Sparsity

[Back-to-Overview]

Image Compression

Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger

Context-based Trit-Plane Coding for Progressive Image Compression

Learned Image Compression with Mixed Transformer-CNN Architectures

NVTC: Nonlinear Vector Transform Coding

Multi-Realism Image Compression with a Conditional Generator

LVQAC: Lattice Vector Quantization Coupled with Spatially Adaptive Companding for Efficient Learned Image Compression

Video Compression

Neural Video Compression with Diverse Contexts

Video Compression With Entropy-Constrained Neural Representations

Complexity-Guided Slimmable Decoder for Efficient Deep Video Compression

MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding

Motion Information Propagation for Neural Video Compression

Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding

HNeRV: A Hybrid Neural Representation for Videos

[Back-to-Overview]

Image Quality Assessment

Quality-aware Pre-trained Models for Blind Image Quality Assessment

Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method

Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild

An Image Quality Assessment Dataset for Portraits

MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos

CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability

SB-VQA: A Stack-Based Video Quality Assessment Framework for Video Enhancement

[Back-to-Overview]

Style Transfer

Fix the Noise: Disentangling Source Feature for Controllable Domain Translation

Neural Preset for Color Style Transfer

CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer

StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer

Modernizing Old Photos Using Multiple References via Photorealistic Style Transfer

QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity

Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer

Learning Dynamic Style Kernels for Artistic Style Transfer

Inversion-Based Style Transfer with Diffusion Models

[Back-to-Overview]

Image Editing

Imagic: Text-Based Real Image Editing with Diffusion Models

SINE: SINgle Image Editing with Text-to-Image Diffusion Models

CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing

SIEDOB: Semantic Image Editing by Disentangling Object and Background

DiffusionRig: Learning Personalized Priors for Facial Appearance Editing

Paint by Example: Exemplar-based Image Editing with Diffusion Models

StyleRes: Transforming the Residuals for Real Image Editing With StyleGAN

Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint

InstructPix2Pix: Learning to Follow Image Editing Instructions

Deep Curvilinear Editing: Commutative and Nonlinear Image Manipulation for Pretrained Deep Generative Model

Null-text Inversion for Editing Real Images using Guided Diffusion Models

DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation

Text-Guided Unsupervised Latent Transformation for Multi-Attribute Image Manipulation

EDICT: Exact Diffusion Inversion via Coupled Transformations

Video Editing

DPE: Disentanglement of Pose and Expression for General Video Portrait Editing

Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding

Shape-aware Text-driven Layered Video Editing

[Back-to-Overview]

Image Generation/Synthesis / Image-to-Image Translation

Text-to-Image / Text Guided / Multi-Modal

GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis

Scaling up GANs for Text-to-Image Synthesis

Variational Distribution Learning for Unsupervised Text-to-Image Generation

Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

Shifted Diffusion for Text-to-image Generation

ReCo: Region-Controlled Text-to-Image Generation

RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation With Natural Prompts

GLIGEN: Open-Set Grounded Text-to-Image Generation

Multi-Concept Customization of Text-to-Image Diffusion

ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model With Knowledge-Enhanced Mixture-of-Denoising-Experts

Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models To Learn Any Unseen Style

MAGVLT: Masked Generative Vision-and-Language Transformer

Freestyle Layout-to-Image Synthesis

Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment

Collaborative Diffusion for Multi-Modal Face Generation and Editing

SpaText: Spatio-Textual Representation for Controllable Image Generation

Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation

LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

High-Fidelity Guided Image Synthesis with Latent Diffusion Models

Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models

Image-to-Image / Image Guided

Person Image Synthesis via Denoising Diffusion Model

Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

Fine-Grained Face Swapping via Regional GAN Inversion

Masked and Adaptive Transformer for Exemplar Based Image Translation

Zero-shot Generative Model Adaptation via Image-specific Prompt Learning

StyleGene: Crossover and Mutation of Region-Level Facial Genes for Kinship Face Synthesis

Unpaired Image-to-Image Translation With Shortest Path Regularization

BBDM: Image-to-image Translation with Brownian Bridge Diffusion Models

MaskSketch: Unpaired Structure-guided Masked Image Generation

Others for image generation

AdaptiveMix: Improving GAN Training via Feature Space Shrinkage

MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

Regularized Vector Quantization for Tokenized Image Synthesis

Exploring Incompatible Knowledge Transfer in Few-shot Image Generation

Post-training Quantization on Diffusion Models

LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

DiffCollage: Parallel Generation of Large Content with Diffusion Models

Few-shot Semantic Image Synthesis with Class Affinity Transfer

NoisyTwins: Class-Consistent and Diverse Image Generation through StyleGANs

DCFace: Synthetic Face Generation with Dual Condition Diffusion Model

Class-Balancing Diffusion Models

Spider GAN: Leveraging Friendly Neighbors to Accelerate GAN Training

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation

Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis

Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis

GLeaD: Improving GANs with A Generator-Leading Task

Where Is My Spot? Few-Shot Image Generation via Latent Subspace Optimization

KD-DLGAN: Data Limited Image Generation via Knowledge Distillation

Private Image Generation With Dual-Purpose Auxiliary Classifier

SceneComposer: Any-Level Semantic Image Synthesis

Exploring Intra-Class Variation Factors With Learnable Cluster Prompts for Semi-Supervised Image Synthesis

Re-GAN: Data-Efficient GANs Training via Architectural Reconfiguration

Discriminator-Cooperated Feature Map Distillation for GAN Compression

Wavelet Diffusion Models are fast and scalable Image Generators

On Distillation of Guided Diffusion Models

Binary Latent Diffusion

All are Worth Words: A ViT Backbone for Diffusion Models

Towards Practical Plug-and-Play Diffusion Models

Lookahead Diffusion Probabilistic Models for Refining Mean Estimation

Diffusion Probabilistic Model Made Slim

Self-Guided Diffusion Models

Video Generation

Conditional Image-to-Video Generation with Latent Flow Diffusion Models

Video Probabilistic Diffusion Models in Projected Latent Space

Decomposed Diffusion Models for High-Quality Video Generation

MoStGAN: Video Generation with Temporal Motion Styles

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

Dimensionality-Varying Diffusion Process

[Back-to-Overview]

Others

Perspective Fields for Single Image Camera Calibration

DC2: Dual-Camera Defocus Control by Learning to Refocus

Images Speak in Images: A Generalist Painter for In-Context Visual Learning

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models

LightPainter: Interactive Portrait Relighting with Freehand Scribble

Neural Texture Synthesis with Guided Correspondence

Uncurated Image-Text Datasets: Shedding Light on Demographic Bias

Large-capacity and Flexible Video Steganography via Invertible Neural Network

Putting People in Their Place: Affordance-Aware Human Insertion into Scenes

Controllable Light Diffusion for Portraits

Talking Head Generation

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

High-Fidelity and Freely Controllable Talking Head Video Generation

MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation

Identity-Preserving Talking Face Generation with Landmark and Appearance Priors

LipFormer: High-Fidelity and Generalizable Talking Face Generation With a Pre-Learned Facial Codebook

High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning

DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation

Virtual Try-on

GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning

Linking Garment With Person via Semantically Associated Landmarks for Virtual Try-On

TryOnDiffusion: A Tale of Two UNets

Handwriting/Font Generation

CF-Font: Content Fusion for Few-shot Font Generation

Neural Transformation Fields for Arbitrary-Styled Font Generation

DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality

Handwritten Text Generation from Visual Archetypes

Disentangling Writer and Character Styles for Handwriting Generation

Conditional Text Image Generation With Diffusion Models

Layout Generation

Unifying Layout Generation with a Decoupled Diffusion Model

Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation

PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation

LayoutDM: Transformer-based Diffusion Model for Layout Generation
