blackprotoss / GSDM

Text Image Inpainting via Global Structure-Guided Diffusion Models (Accepted by AAAI-24)

Shipeng Zhu, Pengfei Fang, Chenjie Zhu, Zuoyan Zhao, Qiang Xu, Hui Xue

Paper: arXiv 2401.14832, AAAI-24

This repository provides the official PyTorch code for the paper. If you have any questions, feel free to contact Shipeng Zhu (shipengzhu@seu.edu.cn) or Chenjie Zhu (chenjiezhu@seu.edu.cn).

Environment Setup

  • Clone this repo

  • Create a conda environment and activate it.

  • Install the matching PyTorch version with the following command:

    conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
    
  • Install the required packages

  • Download the pre-trained checkpoints and move them into the "checkpoints" directory.
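
Before running inference, it can help to confirm that the checkpoints landed where the code expects them. The following is a minimal sketch, not part of the repository: the helper name `list_checkpoints` and the file suffixes `.pth`, `.pt`, and `.ckpt` are assumptions, since the actual checkpoint file names are not stated here.

```python
from pathlib import Path


def list_checkpoints(root="checkpoints", suffixes=(".pth", ".pt", ".ckpt")):
    """Return checkpoint-like files directly under `root`, sorted by path.

    The suffix list is an assumption; adjust it to match the files you
    actually downloaded.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"Expected a directory at {root_path}")
    # Keep only regular files whose extension looks like a checkpoint.
    return sorted(
        p for p in root_path.iterdir() if p.is_file() and p.suffix in suffixes
    )
```

Calling `list_checkpoints()` from the repository root should return a non-empty list once the download step is complete; an empty list or a `FileNotFoundError` suggests the files are in the wrong place.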

Inference phase

python inference.py --config xx --input_dir input --output_dir output --save_sp False
  • config: Path to the YAML configuration file.
  • input_dir: Directory containing the input images.
  • output_dir: Directory where the output images are written.
  • save_sp: Whether to save the structure prediction images (True or False).
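
The command above passes `--save_sp` as the literal text `False`. How `inference.py` parses this flag is not shown here, so the sketch below is only an illustration of one safe way to handle such a flag with `argparse`. The helpers `str2bool` and `build_parser` and the default values are hypothetical. The point it illustrates is that `type=bool` would treat the string "False" as true, so an explicit converter is needed.

```python
import argparse


def str2bool(value):
    """Convert common textual booleans to bool.

    With argparse, type=bool turns any non-empty string (including
    "False") into True, so an explicit converter is used instead.
    """
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in {"true", "t", "yes", "y", "1"}:
        return True
    if text in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser mirroring the four flags listed above."""
    parser = argparse.ArgumentParser(description="Sketch of the inference flags")
    parser.add_argument("--config", required=True, help="path to the YAML config file")
    parser.add_argument("--input_dir", default="input", help="input image directory")
    parser.add_argument("--output_dir", default="output", help="output image directory")
    parser.add_argument("--save_sp", type=str2bool, default=False,
                        help="save structure prediction images")
    return parser
```

With this sketch, `build_parser().parse_args(["--config", "xx", "--save_sp", "False"])` yields `save_sp` as the boolean `False` rather than a truthy string.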

Datasets and Pre-trained Checkpoints

Training phase

Step 1: Training SPM

python train_spm.py
  • Modify the training configuration in "config/train_spm.yaml".

Step 2: Training RM

python train_rm.py
  • Modify the training configuration in "config/train_rm.yaml".
  • Training RM requires a pre-trained SPM checkpoint. Set its path in "config/train_rm.yaml".
  • Download the pre-trained CRNN checkpoint and place it in "crnn/data/".
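
The two training steps have an ordering dependency: SPM must be trained before RM, and RM additionally needs the CRNN checkpoint. The following is a minimal sketch of a wrapper that checks these prerequisites and then runs both scripts in order. It is not part of the repository; the helper names `missing_files`, `crnn_checkpoint_present`, and `run_training` are hypothetical, and the script and config names are taken from the steps above. The SPM checkpoint path inside "config/train_rm.yaml" still has to be edited by hand between the two steps.

```python
import subprocess
import sys
from pathlib import Path

# (script, config) pairs in the order given by the training steps above.
TRAINING_STEPS = [
    ("train_spm.py", "config/train_spm.yaml"),
    ("train_rm.py", "config/train_rm.yaml"),
]


def missing_files(repo_root="."):
    """Return the scripts and config files that do not exist under repo_root."""
    root = Path(repo_root)
    wanted = [name for step in TRAINING_STEPS for name in step]
    return [name for name in wanted if not (root / name).is_file()]


def crnn_checkpoint_present(repo_root="."):
    """True if crnn/data/ exists and holds at least one entry.

    The CRNN checkpoint file name is not stated above, so only the
    directory contents are checked.
    """
    crnn_dir = Path(repo_root) / "crnn" / "data"
    return crnn_dir.is_dir() and any(crnn_dir.iterdir())


def run_training(repo_root="."):
    """Run SPM training, then RM training, stopping at the first failure."""
    missing = missing_files(repo_root)
    if missing:
        raise FileNotFoundError(f"Missing files: {missing}")
    if not crnn_checkpoint_present(repo_root):
        raise FileNotFoundError("No CRNN checkpoint found in crnn/data/")
    for script, _config in TRAINING_STEPS:
        # check=True raises CalledProcessError if a script exits non-zero.
        subprocess.run([sys.executable, script], cwd=repo_root, check=True)
```

In practice the two steps are usually run separately, because the RM configuration must point at the SPM checkpoint produced by the first step; the prerequisite checks are the part most likely to be useful on their own.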

Citation

@inproceedings{zhu2024gsdm,
title={Text image inpainting via global structure-guided diffusion models},
author={Zhu, Shipeng and Fang, Pengfei and Zhu, Chenjie and Zhao, Zuoyan and Xu, Qiang and Xue, Hui},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={38},
number={7},
pages={7775-7783},
year={2024}
}

About

License: MIT License


Languages

Python 97.9%, Lua 2.1%