A curated list of resources (papers, datasets, and relevant links) on generative image composition, which aims to generate a plausible composite image from a background image (optionally with a bounding box) and one or more foreground images of a specific object.
Contributions are welcome. If you wish to contribute, feel free to send a pull request. If you have suggestions for new sections to be included, please raise an issue and discuss before sending a pull request.
A brief review on generative image composition is included in the following survey on image composition:
Li Niu, Wenyan Cong, Liu Liu, Yan Hong, Bo Zhang, Jing Liang, Liqing Zhang: "Making Images Real Again: A Comprehensive Survey on Deep Image Composition." arXiv preprint arXiv:2106.14490 (2021). [arXiv]
- COCOEE (within-domain, single-ref): 500 background images from the MSCOCO validation set. Each background image has a bounding box and a foreground image from the MSCOCO training set.
- TF-ICON test benchmark (cross-domain, single-ref): 332 samples. Each sample consists of a background image, a foreground image, a user mask, and a text prompt.
- FOSCom (within-domain, single-ref): 640 background images from the Internet. Each background image has a manually annotated bounding box and a foreground image from the MSCOCO training set.
- DreamEditBench (within-domain, multi-ref): 220 background images and 30 unique foreground objects from 15 categories.
- MureCom (within-domain, multi-ref): 640 background images and 96 unique foreground objects from 32 categories.
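The benchmarks above share a common sample structure (background, foreground reference(s), and optional box, mask, or prompt). A minimal sketch of how one sample might be represented in code; the class and field names are illustrative assumptions, not the benchmarks' actual file layout:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class CompositionSample:
    """Hypothetical record for one generative-composition benchmark sample."""
    background_path: str                        # path to the background image
    foreground_paths: List[str]                 # one path (single-ref) or several (multi-ref)
    bbox: Optional[Tuple[int, int, int, int]] = None  # (x1, y1, x2, y2) placement box, if annotated
    mask_path: Optional[str] = None             # user mask (e.g., TF-ICON benchmark)
    prompt: Optional[str] = None                # text prompt (e.g., TF-ICON benchmark)

# Example: a COCOEE-style single-ref sample with an annotated box
sample = CompositionSample(
    background_path="bg/000001.jpg",
    foreground_paths=["fg/000001.jpg"],
    bbox=(32, 48, 196, 220),
)
```

Multi-ref benchmarks such as DreamEditBench and MureCom would simply populate `foreground_paths` with several reference images of the same object.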
The training set is open (methods may use any training data); the test set is the COCOEE benchmark.
| Method | CLIP↑ (FG) | DINO↑ (FG) | FID↓ (FG) | LSSIM↑ (BG) | LPIPS↓ (BG) | FID↓ (Overall) | QS↑ (Overall) |
|---|---|---|---|---|---|---|---|
| Inpaint&Paste | - | - | 8.0 | - | - | 3.64 | 72.07 |
| SDEdit | 85.02 | - | 9.77 | 0.630 | 0.344 | 6.42 | 75.20 |
| PBE | 84.84 | - | 6.24 | 0.823 | 0.116 | 3.18 | 77.80 |
| ObjectStitch | 85.97 | - | 6.86 | 0.825 | 0.116 | 3.35 | 76.86 |
| ControlCom | 88.31 | - | 6.28 | 0.826 | 0.114 | 3.19 | 77.84 |
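The foreground CLIP and DINO columns measure embedding similarity between the reference foreground and the generated foreground region. A minimal sketch of such a score, assuming the embeddings have already been extracted (the function names and the 0–100 scaling are illustrative assumptions, not the leaderboard's exact evaluation code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def foreground_score(ref_embedding, gen_embedding):
    """CLIP/DINO-style score: cosine similarity scaled to a 0-100 range."""
    return 100.0 * cosine_similarity(ref_embedding, gen_embedding)

# Identical embeddings give the maximum score of 100.0
print(foreground_score([1.0, 0.0, 0.5], [1.0, 0.0, 0.5]))
```

The remaining columns are standard image metrics: LSSIM/LPIPS compare the generated background against the input background, while FID and QS assess distributional realism and image quality.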
- Vishnu Sarukkai, Linden Li, Arden Ma, Christopher Re, Kayvon Fatahalian: "Collage Diffusion." WACV (2024) [pdf] [code]
- Ziyang Yuan, Mingdeng Cao, Xintao Wang, Zhongang Qi, Chun Yuan, Ying Shan: "CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models." arXiv preprint arXiv:2310.19784 (2023) [arXiv] [code]
- Bo Zhang, Yuxuan Duan, Jun Lan, Yan Hong, Huijia Zhu, Weiqiang Wang, Li Niu: "ControlCom: Controllable Image Composition using Diffusion Model." arXiv preprint arXiv:2308.10040 (2023) [arXiv] [code] [demo]
- Xi Chen, Lianghua Huang, Yu Liu, Yujun Shen, Deli Zhao, Hengshuang Zhao: "AnyDoor: Zero-shot Object-level Image Customization." CVPR (2024) [arXiv] [code] [demo]
- Xin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa: "Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model." arXiv preprint arXiv:2306.07596 (2023) [arXiv] [code]
- Roy Hachnochi, Mingrui Zhao, Nadav Orzech, Rinon Gal, Ali Mahdavi-Amiri, Daniel Cohen-Or, Amit Haim Bermano: "Cross-domain Compositing with Pretrained Diffusion Models." arXiv preprint arXiv:2302.10167 (2023) [arXiv] [code]
- Shilin Lu, Yanzhu Liu, Adams Wai-Kin Kong: "TF-ICON: Diffusion-based Training-free Cross-domain Image Composition." ICCV (2023) [pdf] [code]
- Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen: "Paint by Example: Exemplar-based Image Editing with Diffusion Models." CVPR (2023) [arXiv] [code] [demo]
- Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga: "ObjectStitch: Generative Object Compositing." CVPR (2023) [arXiv] [code]
- Sumith Kulal, Tim Brooks, Alex Aiken, Jiajun Wu, Jimei Yang, Jingwan Lu, Alexei A. Efros, Krishna Kumar Singh: "Putting People in Their Place: Affordance-Aware Human Insertion into Scenes." CVPR (2023) [paper] [code]
- Lingxiao Lu, Bo Zhang, Li Niu: "DreamCom: Finetuning Text-guided Inpainting Model for Image Composition." arXiv preprint arXiv:2309.15508 (2023) [arXiv] [code]
- Tianle Li, Max Ku, Cong Wei, Wenhu Chen: "DreamEdit: Subject-driven Image Editing." TMLR (2023) [arXiv] [code]
- Jinghao Zhou, Tomas Jakab, Philip Torr, Christian Rupprecht: "Scene-Conditional 3D Object Stylization and Composition." arXiv preprint arXiv:2312.12419 (2023) [arXiv] [code]
- Mohamad Shahbazi, Liesbeth Claessens, Michael Niemeyer, Edo Collins, Alessio Tonioni, Luc Van Gool, Federico Tombari: "InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes." arXiv preprint arXiv:2401.05335 (2024) [arXiv]
- Rahul Goel, Dhawal Sirikonda, Saurabh Saini, PJ Narayanan: "Interactive Segmentation of Radiance Fields." CVPR (2023) [arXiv] [code]
- Rahul Goel, Dhawal Sirikonda, Rajvi Shah, PJ Narayanan: "FusedRF: Fusing Multiple Radiance Fields." CVPR Workshop (2023) [arXiv]
- Verica Lazova, Vladimir Guzov, Kyle Olszewski, Sergey Tulyakov, Gerard Pons-Moll: "Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation." WACV (2023) [arXiv]
- Jiaxiang Tang, Xiaokang Chen, Jingbo Wang, Gang Zeng: "Compressible-composable NeRF via Rank-residual Decomposition." NeurIPS (2022) [arXiv] [code]
- Bangbang Yang, Yinda Zhang, Yinghao Xu, Yijin Li, Han Zhou, Hujun Bao, Guofeng Zhang, Zhaopeng Cui: "Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering." ICCV (2021) [arXiv] [code]
- Boxiao Pan, Zhan Xu, Chun-Hao Paul Huang, Krishna Kumar Singh, Yang Zhou, Leonidas J. Guibas, Jimei Yang: "ActAnywhere: Subject-Aware Video Background Generation." arXiv preprint arXiv:2401.10822 (2024) [arXiv]