HappyAIWalker / ICCV2023-paper-code

A collection of ICCV 2023 papers and code

ICCV2023-paper-code

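A continuously updated list of ICCV 2023 papers, code, and related information; follow AIWalker for updates. The list focuses mainly on the areas below. For more CV/AI material, add the AIWalker assistant [AIWalker-zhushou] (scan the QR code at the bottom).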

Backbone

ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices

Rethinking Mobile Block for Efficient Attention-based Models

UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

A Unified Continual Learning Framework with General Parameter-Efficient Tuning

Scale-Aware Modulation Meet Transformer

Improving Zero-Shot Generalization for CLIP with Synthesized Prompts

DreamTeacher: Pretraining Image Backbones with Deep Generative Models

ShiftNAS: Improving One-shot NAS via Probability Shift

MULLER: Multilayer Laplacian Resizer for Vision

FLatten Transformer: Vision Transformer with Focused Linear Attention

Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement

Tuning Pre-trained Model via Moment Probing

Strip-MLP: Efficient Token Interaction for Vision MLP

Adaptive Frequency Filters As Efficient Global Token Mixers

Learning Concise and Descriptive Attributes for Visual Recognition

Detection

FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

Large Selective Kernel Network for Remote Sensing Object Detection

DiffusionDet: Diffusion Model for Object Detection

DETRs with Collaborative Hybrid Assignments Training

MIMDet: Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection

Detection Transformer with Stable Matching

Random Boxes Are Open-world Object Detectors

AlignDet: Aligning Pre-training and Fine-tuning in Object Detection

Cascade-DETR: Delving into High-Quality Universal Object Detection

Deep Directly-Trained Spiking Neural Networks for Object Detection

COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts

Less is More: Focus Attention for Efficient DETR

Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes

RecursiveDet: End-to-End Region-based Recursive Object Detection

Segmentation

Segment Anything

SegGPT: Segmenting Everything in Context

VLPart: Going Denser with Open-Vocabulary Part Segmentation

Referring Image Segmentation Using Text Supervision

EfficientViT: Lightweight Multi-Scale Attention for On-Device Semantic Segmentation

A Simple Framework for Open-Vocabulary Segmentation and Detection

Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation

Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation

OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation

Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation

Exploring Transformers for Open-world Instance Segmentation

Knowledge Distillation

From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels

DOT: A Distillation-Oriented Trainer

Cumulative Spatial Knowledge Distillation for Vision Transformers

Class-relation Knowledge Distillation for Novel Class Discovery

EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization

Rethinking Data Distillation: Do Not Overlook Calibration

Diffusion

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

Expressive Text-to-Image Generation with Rich Text

Ablating Concepts in Text-to-Image Diffusion Models

Evaluating Data Attribution for Text-to-Image Models

Masked Diffusion Transformer is a Strong Image Synthesizer

  • Paper: TODO
  • Code: TODO

SVDiff: Compact Parameter Space for Diffusion Fine-tuning

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition

Depth

Neural Video Depth Stabilizer

Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV

MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation

Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching

VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation

Learning Depth Estimation for Transparent and Mirror Surfaces

Restoration

Adaptive Nonlinear Latent Transformation for Conditional Face Editing

Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond

DiffIR: Efficient Diffusion Model for Image Restoration

Physics-Driven Turbulence Image Restoration with Stochastic Refinement

Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration

Under-Display Camera Image Restoration with Scattering Effect

From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal

GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild

  • Paper: https://arxiv.org/pdf/2211.12352.pdf

Super-Resolution

SRFormer: Permuted Self-Attention for Single Image Super-Resolution

SAFMN: Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution

DLGSANet: Lightweight Dynamic Local and Global Self-Attention Network for Image Super-Resolution

Dual Aggregation Transformer for Image Super-Resolution

A Benchmark for Chinese-English Scene Text Image Super-resolution

Deblurring

Multi-scale Residual Low-Pass Filter Network for Image Deblurring

  • Paper: TODO
  • Code: TODO

Low-light Image Enhancement

Implicit Neural Representation for Cooperative Low-light Image Enhancement

Iterative Prompt Learning for Unsupervised Backlit Image Enhancement

ExposureDiffusion: Learning to Expose for Low-light Image Enhancement

IQA/IAA

Delegate Transformer for Image Color Aesthetics Assessment

Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives

On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement

Other

Fast Full-frame Video Stabilization with Iterative Optimization

Dataset

LPFF: A Portrait Dataset for Face Generators Across Large Poses

AIWalker Assistant
