PRIV-Creation / Awesome-Diffusion-Personalization

A collection of resources on personalization with diffusion models.


Awesome Diffusion Personalization


We focus on how to efficiently learn a concept, object, or style on top of large diffusion models.

🎉 Feature

  • UniDiffusion: A codebase for diffusion model personalization.
  • Personalization: Learning a concept from a few examples and generating images that contain it.
  • Inversion: Inverting an image into a latent representation (e.g., text_embedding or latent_code) that can reconstruct the input image. Editing methods can then be applied to that representation to manipulate the image.
  • Editing: Editing the latent representation to manipulate the generated images.
  • Parameter-Efficient Fine-Tuning: Techniques inspired by LLM fine-tuning that speed up optimization by training only a small number of parameters.
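
To make the Inversion item above more concrete, here is a minimal sketch of optimization-based inversion. It is a toy under loud assumptions: the "generator" is a fixed random linear map rather than a diffusion model, and the names `decode`, `invert`, `G`, and `true_z` are hypothetical and not taken from any listed paper or from UniDiffusion. It only illustrates the idea of searching for a latent code that reconstructs a given input, after which the latent can be edited and decoded again.

```python
import random

def decode(G, z):
    """Toy 'generator': a fixed linear map from a latent z to a flat image."""
    return [sum(g * zi for g, zi in zip(row, z)) for row in G]

def invert(G, x, steps=5000):
    """Find a latent z whose decoding reconstructs x, using gradient descent."""
    latent_dim = len(G[0])
    z = [0.0] * latent_dim
    # Conservative step size: the squared Frobenius norm of G upper-bounds its
    # largest squared singular value, so the iteration cannot diverge.
    step = 1.0 / sum(g * g for row in G for g in row)
    for _ in range(steps):
        residual = [y - t for y, t in zip(decode(G, z), x)]
        # Gradient of 0.5 * ||decode(z) - x||^2 with respect to z is G^T residual.
        grad = [sum(G[i][j] * residual[i] for i in range(len(G)))
                for j in range(latent_dim)]
        z = [zj - step * gj for zj, gj in zip(z, grad)]
    return z

random.seed(0)
G = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(16)]  # assumed decoder
true_z = [0.5, -1.0, 2.0, 0.25]
image = decode(G, true_z)            # the "input image" to invert

z_hat = invert(G, image)             # inversion: image -> latent
reconstruction_error = max(abs(a - b) for a, b in zip(decode(G, z_hat), image))

# Editing: change one latent coordinate and decode the modified latent.
edited_z = list(z_hat)
edited_z[0] += 1.0
edited_image = decode(G, edited_z)
```

Real inversion methods replace the linear map with a diffusion sampler and optimize quantities such as text embeddings or noise latents, but the reconstruct-then-edit structure is the same.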

🌈 UniDiffusion

We are building UniDiffusion, a training repository for diffusion models. It is aimed at researchers and users who want to deeply customize the training of Stable Diffusion. We hope the codebase will provide solid support for future research and application extensions.

⭐ Personalization Methods

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models.
Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye.
arXiv 2023.12. [PDF]

Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance in Text-to-Image Diffusion Models.
Saman Motamed, Danda Pani Paudel, Luc Van Gool.
arXiv 2023.11. [PDF]

CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization.
Ruoyu Zhao, Mingrui Zhu, Shiyin Dong, Nannan Wang, Xinbo Gao.
arXiv 2023.11. [PDF]

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs.
Viraj Shah, Nataniel Ruiz, Forrester Cole, Erika Lu, Svetlana Lazebnik, Yuanzhen Li, Varun Jampani.
arXiv 2023.11. [PDF][Github]

An Image Is Worth Multiple Words: Multi-attribute Inversion for Constrained Text-to-Image Synthesis.
Aishwarya Agarwal, Srikrishna Karanam, Tripti Shukla, Balaji Vasan Srinivasan.
arXiv 2023.11. [PDF]

High-fidelity Person-centric Subject-to-Image Synthesis.
Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin.
arXiv 2023.11. [PDF]

DIFFNAT: Improving Diffusion Image Quality Using Natural Image Statistics.
Aniket Roy, Maitreya Suin, Anshul Shah, Ketul Shah, Jiang Liu, Rama Chellappa.
arXiv 2023.11. [PDF]

A Data Perspective on Enhanced Identity Preservation for Diffusion Personalization.
Xingzhe He, Zhiwen Cao, Nicholas Kolkin, Lantao Yu, Helge Rhodin, Ratheesh Kalarot.
arXiv 2023.11. [PDF]

VideoDreamer: Customized Multi-Subject Text-to-Video Generation with Disen-Mix Finetuning.
Hong Chen, Xin Wang, Guanning Zeng, Yipeng Zhang, Yuwei Zhou, Feilin Han, Wenwu Zhu.
arXiv 2023.11. [PDF]

Customizing 360-Degree Panoramas Through Text-to-Image Diffusion Models.
Hai Wang, Xiaoyu Xiang, Yuchen Fan, Jing-Hao Xue.
arXiv 2023.10. [PDF][Github]

CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.
Ziyang Yuan, Mingdeng Cao, Xintao Wang, Zhongang Qi, Chun Yuan, Ying Shan.
arXiv 2023.10. [PDF][Github]

An Image Is Worth Multiple Words: Learning Object Level Concepts Using Multi-Concept Prompt Learning.
Chen Jin, Ryutaro Tanno, Amrutha Saseendran, Tom Diethe, Philip Teare.
arXiv 2023.10. [PDF][Github]

SingleInsert: Inserting New Concepts from A Single Image Into Text-to-Image Models for Flexible Editing.
Zijie Wu, Chaohui Yu, Zhen Zhu, Fan Wang, Xiang Bai.
arXiv 2023.10. [PDF][Github]

MotionDirector: Motion Customization of Text-to-Video Diffusion Models.
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jiawei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou.
arXiv 2023.10. [PDF][Github]

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion.
Xian Liu, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Yanyu Li, Dahua Lin, Xihui Liu, Ziwei Liu, Sergey Tulyakov.
arXiv 2023.10. [PDF][Github]

EasyPhoto: Your Smart AI Photo Generator.
Ziheng Wu, Jiaqi Xu, Xinyi Zou, Kunzhe Huang, Xing Shi, Jun Huang.
arXiv 2023.10. [PDF]

ImagenHub: Standardizing The Evaluation of Conditional Image Generation Models.
Max Ku, Tianle Li, Kai Zhang, Yujie Lu, Xingyu Fu, Wenwen Zhuang, Wenhu Chen.
arXiv 2023.10. [PDF]

DreamCom: Finetuning Text-guided Inpainting Model for Image Composition.
Lingxiao Lu, Bo Zhang, Li Niu.
arXiv 2023.09. [PDF]

RealFill: Reference-Driven Generation for Authentic Image Completion.
Luming Tang, Nataniel Ruiz, Qinghao Chu, Yuanzhen Li, Aleksander Holynski, David E. Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, Michael Rubinstein.
arXiv 2023.09. [PDF][Github]

MagiCapture: High-Resolution Multi-Concept Portrait Customization.
Junha Hyung, Jaeyo Shin, Jaegul Choo.
arXiv 2023.09. [PDF]

PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models.
Li Chen, Mengyi Zhao, Yiheng Liu, Mingxu Ding, Yangyang Song, Shizun Wang, Xu Wang, Hao Yang, Jing Liu, Kang Du, Min Zheng.
arXiv 2023.09. [PDF]

FaceChain: A Playground for Identity-Preserving Portrait Generation.
Yang Liu, Cheng Yu, Lei Shang, Ziheng Wu, Xingjun Wang, Yuze Zhao, Lin Zhu, Chen Cheng, Weitao Chen, Chao Xu, Haoyu Xie, Yuan Yao, Wenmeng Zhou, Yingda Chen, Xuansong Xie, Baigui Sun.
arXiv 2023.08. [PDF]

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models.
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, Kfir Aberman.
arXiv 2023.07. [PDF][Github]

Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models.
Moab Arar, Rinon Gal, Yuval Atzmon, Gal Chechik, Daniel Cohen-Or, Ariel Shamir, Amit H. Bermano.
arXiv 2023.07. [PDF][Github]

AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation.
Yifei Zeng, Yuanxun Lu, Xinya Ji, Yao Yao, Hao Zhu, Xun Cao.
arXiv 2023.06. [PDF][Github]

Controlling Text-to-Image Diffusion by Orthogonal Finetuning.
Zeju Qiu, Weiyang Liu, Haiwen Feng, Yuxuan Xue, Yao Feng, Zhen Liu, Dan Zhang, Adrian Weller, Bernhard Schölkopf.
arXiv 2023.06. [PDF][Link]

Face0: Instantaneously Conditioning A Text-to-Image Model on A Face.
Dani Valevski, Danny Wasserman, Yossi Matias, Yaniv Leviathan.
arXiv 2023.06. [PDF]

Cones 2: Customizable Image Synthesis with Multiple Subjects.
Zhiheng Liu, Yifei Zhang, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao.
arXiv 2023.05. [PDF]

Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models.
Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou.
arXiv 2023.05. [PDF]

Photoswap: Personalized Subject Swapping in Images.
Jing Gu, Yilin Wang, Nanxuan Zhao, Tsu-Jui Fu, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang.
arXiv 2023.05. [PDF]

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing.
Dongxu Li, Junnan Li, Steven C. H. Hoi.
arXiv 2023.05. [PDF]

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention.
Guangxuan Xiao, Tianwei Yin, William T. Freeman, Frédo Durand, Song Han.
arXiv 2023.05. [PDF]

DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation.
Hong Chen, Yipeng Zhang, Xin Wang, Xuguang Duan, Yuwei Zhou, Wenwu Zhu.
arXiv 2023.05. [PDF]

Key-Locked Rank One Editing for Text-to-Image Personalization.
Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon.
SIGGRAPH 2023. [PDF][Link]

Identity Encoder for Personalized Diffusion.
Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia.
arXiv 2023.04. [PDF]

DiffFit: Unlocking Transferability of Large Diffusion Models Via Simple Parameter-Efficient Fine-Tuning.
Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li.
arXiv 2023.04. [PDF]

Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA.
James Seale Smith, Yen-Chang Hsu, Lingyu Zhang, Ting Hua, Zsolt Kira, Yilin Shen, Hongxia Jin.
arXiv 2023.04. [PDF][Github]

Gradient-Free Textual Inversion.
Zhengcong Fei, Mingyuan Fan, Junshi Huang.
arXiv 2023.04. [PDF]

Controllable Textual Inversion for Personalized Text-to-Image Generation.
Jianan Yang, Haobo Wang, Yanming Zhang, Ruixuan Xiao, Sai Wu, Gang Chen, Junbo Zhao.
arXiv 2023.04. [PDF][Github]

InstantBooth: Personalized Text-to-Image Generation Without Test-Time Finetuning.
Jing Shi, Wei Xiong, Zhe Lin, Hyun Joon Jung.
arXiv 2023.04. [PDF]

Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models.
Xuhui Jia, Yang Zhao, Kelvin C. K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su.
arXiv 2023.04. [PDF]

Subject-driven Text-to-Image Generation Via Apprenticeship Learning.
Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Ruiz, Xuhui Jia, Ming-Wei Chang, William W. Cohen.
NeurIPS 2023. [PDF][Link][Link]

A Closer Look at Parameter-Efficient Tuning in Diffusion Models.
Chendong Xiang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu.
arXiv 2023.03. [PDF][Github]

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning.
Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang.
arXiv 2023.03. [PDF]

P+: Extended Textual Conditioning in Text-to-Image Generation.
Andrey Voynov, Qinghao Chu, Daniel Cohen-Or, Kfir Aberman.
arXiv 2023.03. [PDF]

Cones: Concept Neurons in Diffusion Models for Customized Generation.
Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao.
arXiv 2023.03. [PDF]

ELITE: Encoding Visual Concepts Into Textual Embeddings for Customized Text-to-Image Generation.
Yuxiang Wei, Yabo Zhang, Zhilong Ji, Jinfeng Bai, Lei Zhang, Wangmeng Zuo.
ICCV 2023. [PDF][Github]

Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models.
Rinon Gal, Moab Arar, Yuval Atzmon, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or.
arXiv 2023.02. [PDF][Github]

Is This Loss Informative? Faster Text-to-Image Customization by Tracking Objective Dynamics.
Anton Voronov, Mikhail Khoroshikh, Artem Babenko, Max Ryabinin.
NeurIPS 2023. [PDF][Github]

SINE: SINgle Image Editing with Text-to-Image Diffusion Models.
Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren.
arXiv 2022.12. [PDF][Github]

Multi-Concept Customization of Text-to-Image Diffusion.
Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu.
arXiv 2022.12. [PDF][Link][Link]

DreamArtist: Towards Controllable One-Shot Text-to-Image Generation Via Positive-Negative Prompt-Tuning.
Ziyi Dong, Pengxu Wei, Liang Lin.
arXiv 2022.11. [PDF]

An Image Is Worth One Word: Personalizing Text-to-Image Generation Using Textual Inversion.
Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or.
arXiv 2022.08. [PDF][Github]

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation.
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman.
CVPR 2023. [PDF][Github]

🏹 Inversion

IterInv: Iterative Inversion for Pixel-Level T2I Models.
Chuanming Tang, Kai Wang, Joost van de Weijer.
NeurIPS 2023. [PDF]

Object-aware Inversion and Reassembly for Image Editing.
Zhen Yang, Dinggang Gui, Wen Wang, Hao Chen, Bohan Zhuang, Chunhua Shen.
arXiv 2023.10. [PDF][Github]

Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code.
Xuan Ju, Ailing Zeng, Yuxuan Bian, Shaoteng Liu, Qiang Xu.
arXiv 2023.10. [PDF]

Prompt-tuning Latent Diffusion Models for Inverse Problems.
Hyungjin Chung, Jong Chul Ye, Peyman Milanfar, Mauricio Delbracio.
arXiv 2023.10. [PDF]

KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing.
Jiancheng Huang, Yifan Liu, Jin Qin, Shifeng Chen.
arXiv 2023.09. [PDF]

Effective Real Image Editing with Accelerated Iterative Diffusion Inversion.
Zhihong Pan, Riccardo Gherardi, Xiufeng Xie, Stephen Huang.
ICCV 2023. [PDF]

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models.
Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang.
arXiv 2023.08. [PDF]

Inversion-by-Inversion: Exemplar-based Sketch-to-Photo Synthesis Via Stochastic Differential Equations Without Training.
Ximing Xing, Chuang Wang, Haitao Zhou, Zhihao Hu, Chongxuan Li, Dong Xu, Qian Yu.
arXiv 2023.08. [PDF]

Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models.
Daiki Miyake, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka.
arXiv 2023.05. [PDF]

Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models.
Wenkai Dong, Song Xue, Xiaoyue Duan, Shumin Han.
arXiv 2023.05. [PDF]

Guided Image Synthesis Via Initial Image Editing in Diffusion Model.
Jiafeng Mao, Xueting Wang, Kiyoharu Aizawa.
arXiv 2023.05. [PDF]

An Edit Friendly DDPM Noise Space: Inversion and Manipulations.
Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli.
arXiv 2023.04. [PDF]

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery.
Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein.
arXiv 2023.02. [PDF][Github]

EDICT: Exact Diffusion Inversion Via Coupled Transformations.
Bram Wallace, Akash Gokul, Nikhil Naik.
arXiv 2022.11. [PDF]

Null-text Inversion for Editing Real Images Using Guided Diffusion Models.
Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, Daniel Cohen-Or.
arXiv 2022.11. [PDF]

Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models.
Adham Elarabawy, Harish Kamath, Samuel Denton.
arXiv 2022.11. [PDF]

🎨 Editing

Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models.
Rohit Gandikota, Joanna Materzynska, Tingrui Zhou, Antonio Torralba, David Bau.
arXiv 2023.11. [PDF]

Emu Edit: Precise Image Editing Via Recognition and Generation Tasks.
Shelly Sheynin, Adam Polyak, Uriel Singer, Yuval Kirstain, Amit Zohar, Oron Ashual, Devi Parikh, Yaniv Taigman.
arXiv 2023.11. [PDF]

On Manipulating Scene Text in The Wild with Diffusion Models.
Joshua Santoso, Christian Simon, Williem Pao.
arXiv 2023.11. [PDF]

Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models.
Tianyi Lu, Xing Zhang, Jiaxi Gu, Hang Xu, Renjing Pei, Songcen Xu, Zuxuan Wu.
arXiv 2023.10. [PDF]

CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation.
Sihan Xu, Ziqiao Ma, Yidong Huang, Honglak Lee, Joyce Chai.
NeurIPS 2023. [PDF]

Object-aware Inversion and Reassembly for Image Editing.
Zhen Yang, Dinggang Gui, Wen Wang, Hao Chen, Bohan Zhuang, Chunhua Shen.
arXiv 2023.10. [PDF][Github]

LOVECon: Text-driven Training-Free Long Video Editing with ControlNet.
Zhenyi Liao, Zhijie Deng.
arXiv 2023.10. [PDF]

DeltaSpace: A Semantic-aligned Feature Space for Flexible Text-guided Image Editing.
Yueming Lyu, Kang Zhao, Bo Peng, Yue Jiang, Yingya Zhang, Jing Dong.
arXiv 2023.10. [PDF]

FLATTEN: Optical FLow-guided ATTENtion for Consistent Text-to-video Editing.
Yuren Cong, Mengmeng Xu, Christian Simon, Shoufa Chen, Jiawei Ren, Yanping Xie, Juan-Manuel Perez-Rua, Bodo Rosenhahn, Tao Xiang, Sen He.
arXiv 2023.10. [PDF][Github]

EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods.
Samyadeep Basu, Mehrdad Saberi, Shweta Bhardwaj, Atoosa Malemir Chegini, Daniela Massiceti, Maziar Sanjabi, Shell Xu Hu, Soheil Feizi.
arXiv 2023.10. [PDF]

KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing.
Jiancheng Huang, Yifan Liu, Jin Qin, Shifeng Chen.
arXiv 2023.09. [PDF]

Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing.
Kai Wang, Fei Yang, Shiqi Yang, Muhammad Atif Butt, Joost van de Weijer.
arXiv 2023.09. [PDF][Github]

CCEdit: Creative and Controllable Video Editing Via Diffusion Models.
Ruoyu Feng, Wenming Weng, Yanhui Wang, Yuhui Yuan, Jianmin Bao, Chong Luo, Zhibo Chen, Baining Guo.
arXiv 2023.09. [PDF]

Face Aging Via Diffusion-based Editing.
Xiangyi Chen, Stéphane Lathuilière.
arXiv 2023.09. [PDF]

Forgedit: Text Guided Image Editing Via Learning and Forgetting.
Shiwen Zhang, Shuai Xiao, Weilin Huang.
arXiv 2023.09. [PDF][Github]

PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models.
Li Chen, Mengyi Zhao, Yiheng Liu, Mingxu Ding, Yangyang Song, Shizun Wang, Xu Wang, Hao Yang, Jing Liu, Kang Du, Min Zheng.
arXiv 2023.09. [PDF]

Effective Real Image Editing with Accelerated Iterative Diffusion Inversion.
Zhihong Pan, Riccardo Gherardi, Xiufeng Xie, Stephen Huang.
ICCV 2023. [PDF]

Iterative Multi-granular Image Editing Using Diffusion Models.
K J Joseph, Prateksha Udhayanan, Tripti Shukla, Aishwarya Agarwal, Srikrishna Karanam, Koustava Goswami, Balaji Vasan Srinivasan.
arXiv 2023.09. [PDF]

Zero-shot Inversion Process for Image Attribute Editing with Diffusion Models.
Zhanbo Feng, Zenan Ling, Ci Gong, Feng Zhou, Jie Li, Robert C. Qiu.
arXiv 2023.08. [PDF]

MagicEdit: High-Fidelity and Temporally Coherent Video Editing.
Jun Hao Liew, Hanshu Yan, Jianfeng Zhang, Zhongcong Xu, Jiashi Feng.
arXiv 2023.08. [PDF][Github]

ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation.
Yasheng Sun, Yifan Yang, Houwen Peng, Yifei Shen, Yuqing Yang, Han Hu, Lili Qiu, Hideki Koike.
arXiv 2023.08. [PDF]

SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation.
Shikun Sun, Longhui Wei, Junliang Xing, Jia Jia, Qi Tian.
arXiv 2023.08. [PDF]

Not All Steps Are Created Equal: Selective Diffusion Distillation for Image Manipulation.
Luozhou Wang, Shuai Yang, Shu Liu, Ying-cong Chen.
arXiv 2023.07. [PDF]

Identity-Preserving Aging of Face Images Via Latent Diffusion Models.
Sudipta Banerjee, Govind Mittal, Ameya Joshi, Chinmay Hegde, Nasir Memon.
arXiv 2023.07. [PDF]

TokenFlow: Consistent Diffusion Features for Consistent Video Editing.
Michal Geyer, Omer Bar-Tal, Shai Bagon, Tali Dekel.
arXiv 2023.07. [PDF]

Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models.
Hyeonho Jeong, Jong Chul Ye.
ICLR 2024. [PDF]

Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models.
Geon Yeong Park, Jeongsol Kim, Beomsu Kim, Sang Wan Lee, Jong Chul Ye.
NeurIPS 2023. [PDF]

Adaptive Nonlinear Latent Transformation for Conditional Face Editing.
Zhizhong Huang, Siteng Ma, Junping Zhang, Hongming Shan.
ICCV 2023. [PDF]

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models.
Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang.
arXiv 2023.07. [PDF]

LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance.
Linoy Tsaban, Apolinário Passos.
arXiv 2023.07. [PDF]

User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques.
Sunwoo Kim, Wooseok Jang, Hyunsu Kim, Junho Kim, Yunjey Choi, Seungryong Kim, Gayeong Lee.
arXiv 2023.06. [PDF]

Improving Diffusion-based Image Translation Using Asymmetric Gradient Guidance.
Gihyun Kwon, Jong Chul Ye.
arXiv 2023.06. [PDF]

PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing.
Wenjing Huang, Shikui Tu, Lei Xu.
arXiv 2023.06. [PDF]

Continuous Layout Editing of Single Images with Diffusion Models.
Zhiyuan Zhang, Zhitong Huang, Jing Liao.
arXiv 2023.06. [PDF]

Paste, Inpaint and Harmonize Via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model.
Xin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa.
arXiv 2023.06. [PDF]

Conditional Score Guidance for Text-Driven Image-to-Image Translation.
Hyunsoo Lee, Minsoo Kang, Bohyung Han.
NeurIPS 2023. [PDF]

InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions.
Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka.
arXiv 2023.05. [PDF][Github]

FISEdit: Accelerating Text-to-image Editing Via Cache-enabled Sparse Diffusion Inference.
Zihao Yu, Haoyang Li, Fangcheng Fu, Xupeng Miao, Bin Cui.
arXiv 2023.05. [PDF]

Towards Consistent Video Editing with Text-to-Image Diffusion Models.
Zicheng Zhang, Bonan Li, Xuecheng Nie, Congying Han, Tiande Guo, Luoqi Liu.
arXiv 2023.05. [PDF]

Text-to-image Editing by Image Information Removal.
Zhongping Zhang, Jian Zheng, Jacob Zhiyuan Fang, Bryan A. Plummer.
arXiv 2023.05. [PDF]

Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models.
Daiki Miyake, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka.
arXiv 2023.05. [PDF]

Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models.
Jooyoung Choi, Yunjey Choi, Yunji Kim, Junho Kim, Sungroh Yoon.
CVPR 2023. [PDF]

ChatFace: Chat-Guided Real Face Editing Via Diffusion Latent Space Manipulation.
Dongxu Yue, Qin Guo, Munan Ning, Jiaxi Cui, Yuesheng Zhu, Li Yuan.
arXiv 2023.05. [PDF]

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing.
Dongxu Li, Junnan Li, Steven C. H. Hoi.
arXiv 2023.05. [PDF]

DiffUTE: Universal Text Editing Diffusion Model.
Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang.
arXiv 2023.05. [PDF]

Guided Image Synthesis Via Initial Image Editing in Diffusion Model.
Jiafeng Mao, Xueting Wang, Kiyoharu Aizawa.
arXiv 2023.05. [PDF]

Collaborative Diffusion for Multi-Modal Face Generation and Editing.
Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu.
CVPR 2023. [PDF][Github][Github]

Text-guided Image-and-Shape Editing and Generation: A Short Survey.
Cheng-Kang Ted Chao, Yotam Gingold.
arXiv 2023.04. [PDF]

Video-P2P: Video Editing with Cross-attention Control.
Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia.
arXiv 2023.03. [PDF][Github]

Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models.
Wen Wang, Kangyang Xie, Zide Liu, Hao Chen, Yue Cao, Xinlong Wang, Chunhua Shen.
arXiv 2023.03. [PDF]

Controlled and Conditional Text to Image Generation with Diffusion Prior.
Pranav Aggarwal, Hareesh Ravi, Naveen Marri, Sachin Kelkar, Fengbin Chen, Vinh Khuc, Midhun Harikumar, Ritiz Tambi, Sudharshan Reddy Kakumanu, Purvak Lapsiya, Alvin Ghouas, Sarah Saber, Malavika Ramprasad, Baldo Faieta, Ajinkya Kale.
arXiv 2023.02. [PDF]

Towards Enhanced Controllability of Diffusion Models.
Wonwoong Cho, Hareesh Ravi, Midhun Harikumar, Vinh Khuc, Krishna Kumar Singh, Jingwan Lu, David I. Inouye, Ajinkya Kale.
arXiv 2023.02. [PDF]

Region-Aware Diffusion for Zero-shot Text-driven Image Editing.
Nisha Huang, Fan Tang, Weiming Dong, Tong-Yee Lee, Changsheng Xu.
arXiv 2023.02. [PDF]

Composer: Creative and Controllable Image Synthesis with Composable Conditions.
Lianghua Huang, Di Chen, Yu Liu, Yujun Shen, Deli Zhao, Jingren Zhou.
arXiv 2023.02. [PDF][Github]

PRedItOR: Text Guided Image Editing with Diffusion Prior.
Hareesh Ravi, Sachin Kelkar, Midhun Harikumar, Ajinkya Kale.
arXiv 2023.02. [PDF]

SEGA: Instructing Text-to-Image Models Using Semantic Guidance.
Manuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian Kersting.
arXiv 2023.01. [PDF]

Uncovering The Disentanglement Capability in Text-to-Image Diffusion Models.
Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang.
arXiv 2022.12. [PDF]

SINE: SINgle Image Editing with Text-to-Image Diffusion Models.
Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren.
arXiv 2022.12. [PDF][Github]

SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model.
Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, Kun Zhang.
arXiv 2022.12. [PDF]

InstructPix2Pix: Learning to Follow Image Editing Instructions.
Tim Brooks, Aleksander Holynski, Alexei A. Efros.
arXiv 2022.11. [PDF][Link]

EDICT: Exact Diffusion Inversion Via Coupled Transformations.
Bram Wallace, Akash Gokul, Nikhil Naik.
arXiv 2022.11. [PDF]

Paint by Example: Exemplar-based Image Editing with Diffusion Models.
Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen.
arXiv 2022.11. [PDF][Github]

Diffusion Models Already Have A Semantic Latent Space.
Mingi Kwon, Jaeseok Jeong, Youngjung Uh.
ICLR 2023. [PDF]

Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation.
Chaerin Kong, DongHyeon Jeon, Ohjoon Kwon, Nojun Kwak.
arXiv 2022.10. [PDF]

DiffEdit: Diffusion-based Semantic Image Editing with Mask Guidance.
Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord.
arXiv 2022.10. [PDF]

Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance.
Chen Henry Wu, Fernando De la Torre.
arXiv 2022.10. [PDF]

UniTune: Text-Driven Image Editing by Fine Tuning A Diffusion Model on A Single Image.
Dani Valevski, Matan Kalman, Eyal Molad, Eyal Segalis, Yossi Matias, Yaniv Leviathan.
SIGGRAPH 2023. [PDF]

More Control for Free! Image Synthesis with Semantic Diffusion Guidance.
Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell.
arXiv 2021.12. [PDF][Github]

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations.
Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, Stefano Ermon.
arXiv 2021.08. [PDF][Github]

🚄 Parameter-Efficient Fine-Tuning

PEFT methods from NLP that have been applied to diffusion models are marked with 📌; methods designed specifically for diffusion models are marked with 💎.
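
As a rough illustration of what "parameter-efficient" means for the low-rank methods in this section, the following sketch adds a LoRA-style update to a frozen weight matrix in plain Python. The class name `LoRALinear` and the helper functions are hypothetical and written only for this example; they are not the API of any library or of the LyCORIS implementation. The sketch follows the LoRA convention of a zero-initialized `B` and a small random `A`, scaled by `alpha / rank`, so the adapted layer starts out identical to the frozen one.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of A (n x k) and B (k x m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, x):
    """Matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W                                         # frozen, never updated
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]      # zero init: no change at start
        self.scale = alpha / rank

    def effective_weight(self):
        delta = matmul(self.B, self.A)                     # d_out x d_in, rank <= r
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.W, delta)]

    def forward(self, x):
        return matvec(self.effective_weight(), x)

    def num_trainable(self):
        # Only A and B would receive gradients in training.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

rng = random.Random(1)
W = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(8)]
x = [rng.uniform(-1, 1) for _ in range(16)]

layer = LoRALinear(W, rank=2)
base_out = matvec(W, x)          # output of the frozen layer
lora_out = layer.forward(x)      # identical at initialization because B is zero
full_params = 8 * 16             # parameters a full fine-tune would update
```

In this toy configuration the adapter trains `layer.num_trainable()` values (rank times the sum of input and output widths) instead of all `full_params` entries of `W`; the gap widens as the layer grows.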

Parameter-efficient Is Not Sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions.
Dongshuo Yin, Xueting Han, Bin Li, Hao Feng, Jing Bai.
NeurIPS 2023. [PDF]

One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning.
Arnav Chavan, Zhuang Liu, Deepak Gupta, Eric Xing, Zhiqiang Shen.
arXiv 2023.06. [PDF][Github]

Visual Tuning.
Bruce X. B. Yu, Jianlong Chang, Haixin Wang, Lingbo Liu, Shijie Wang, Zhiyu Wang, Junfan Lin, Lingxi Xie, Haojie Li, Zhouchen Lin, Qi Tian, Chang Wen Chen.
arXiv 2023.05. [PDF]

💎 A Closer Look at Parameter-Efficient Tuning in Diffusion Models.
Chendong Xiang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu.
arXiv 2023.03. [PDF][Github]

Towards Efficient Visual Adaption Via Structural Re-parameterization.
Gen Luo, Minglang Huang, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Zhiyu Wang, Rongrong Ji.
arXiv 2023.02. [PDF]

📌 DyLoRA: Parameter Efficient Tuning of Pre-trained Models Using Dynamic Search-Free Low-Rank Adaptation.
Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi.
arXiv 2022.10. [PDF] [Diffusion Impl.:KohakuBlueleaf/LyCORIS]

📌 Few-Shot Parameter-Efficient Fine-Tuning Is Better and Cheaper Than In-Context Learning.
Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, Colin Raffel.
arXiv 2022.05. [PDF] [Diffusion Impl.:KohakuBlueleaf/LyCORIS]

📌 FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning.
Nam Hyeon-Woo, Moon Ye-Bin, Tae-Hyun Oh.
ICLR 2022. [PDF] [Diffusion Impl.:KohakuBlueleaf/LyCORIS]

📌 LoRA: Low-Rank Adaptation of Large Language Models.
Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen.
arXiv 2021.06. [PDF]

The Power of Scale for Parameter-Efficient Prompt Tuning.
Brian Lester, Rami Al-Rfou, Noah Constant.
arXiv 2021.04. [PDF]

Prefix-Tuning: Optimizing Continuous Prompts for Generation.
Xiang Lisa Li, Percy Liang.
arXiv 2021.01. [PDF]

Parameter-Efficient Transfer Learning for NLP.
Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly.
arXiv 2019.02. [PDF]

Parameter-Efficient Transfer Learning for NLP.
Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly.
PMLR 2019. [PDF]

License: MIT License