MKFMIKU / Instance-Shadow-Diffusion

Shadow Image | Instance Mask | Instance Removal (Ours)

Instance Shadow Diffusion

This repository contains the official implementation of the following paper:

Latent Feature-Guided Diffusion Models for Shadow Removal
Kangfu Mei 1, 2, Luis Figueroa 2, Zhe Lin 2, Zhihong Ding 2, Scott Cohen 2, Vishal M. Patel 1
1Johns Hopkins University
2Adobe Research

[Paper] [Code] [Pretrained Model] [Visual Results] [Demo Video 🔥] [Live Demo 🔥]

Introduction of Our Work

We propose the first instance-level shadow removal algorithm, which can generate either fully shadow-free images or shadow-free regions within shadow images (i.e., instance-level removal). The generated results preserve high fidelity to the original shadow images.

The table below compares the performance of our method with previous non-diffusion-based methods. We report RMSE in the shadow region / non-shadow region / whole image, respectively. More results can be found here.

Model | AISTD ↓ | ISTD ↓ | SRD (shadow region) ↓
BMNet | 5.69 / 2.52 / 3.02 | 7.44 / 4.61 / 5.06 | 7.40
Ours | 5.15 / 2.47 / 2.90 | 6.41 / 4.65 / 4.93 | 6.81
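The three-way split used in the table (shadow region / non-shadow region / whole image) can be sketched as follows. This is a minimal illustration, not the evaluation code behind the reported numbers: the function name `region_rmse` is hypothetical, and it computes plain RMSE on raw pixel values, whereas the color space and exact error definition used for the table are not stated here.

```python
import numpy as np


def region_rmse(pred, target, mask):
    """RMSE over shadow pixels, non-shadow pixels, and the whole image.

    pred, target: arrays of shape (H, W, C) in the same value range.
    mask: array of shape (H, W); nonzero entries mark shadow pixels.
    NOTE: plain pixel-space RMSE is an assumption made for illustration.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    shadow = np.asarray(mask) > 0

    sq_err = (pred - target) ** 2  # per-pixel, per-channel squared error

    def rmse(selector):
        # Guard against an empty region (e.g. an all-zero mask).
        if not selector.any():
            return float("nan")
        return float(np.sqrt(sq_err[selector].mean()))

    everything = np.ones_like(shadow, dtype=bool)
    return rmse(shadow), rmse(~shadow), rmse(everything)
```

Calling `region_rmse(pred, target, mask)` returns a tuple in the same order as the table columns. A region with no pixels yields NaN rather than raising, so a batch evaluation can skip it explicitly.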

Method Overview


Abstract: Recovering textures under shadows has remained a challenging problem due to the difficulty of inferring shadow-free scenes from shadow images. In this paper, we propose the use of diffusion models as they offer a promising approach to gradually refine the details of shadow regions during the diffusion process. Our method improves this process by conditioning on a learned latent feature space that inherits the characteristics of shadow-free images, thus avoiding the limitation of conventional methods that condition on degraded images only. Additionally, we propose to alleviate potential local optima during training by fusing noise features with the diffusion network. We demonstrate the effectiveness of our approach which outperforms the previous best method by 13% in terms of RMSE on the AISTD dataset. Further, we explore instance-level shadow removal, where our model outperforms the previous best method by 82% in terms of RMSE on the DESOBA dataset.

Detail Contents

  1. Setup & Dataset
  2. Training
  3. Testing
  4. Results
  5. Pretrained Models
  6. Citations
  7. License
  8. Acknowledgement

Setup & Dataset

To set up a Python virtual environment with the required dependencies, run:

# create virtual environment
python3 -m venv ./envs
source ./envs/bin/activate
# update pip, setuptools and wheel
pip3 install --upgrade pip setuptools wheel
# install all required packages
pip3 install -r requirements.txt

To install the guided-diffusion dependency, run:

git clone https://github.com/jychoi118/P2-weighting.git /tmp/p2w
pip install -e /tmp/p2w

Set up Hugging Face Accelerate for distributed training:

accelerate config

To export the installed requirements, run:

pip3 freeze > requirements.txt

Once you are done with the virtual environment, deactivate it with:

deactivate

Dataset

We use the AISTD dataset for training the model. Random-crop augmentation is applied to the AISTD dataset, resulting in 26,120 patches for training.

Dataset | Training Set | Testing Set
AISTD (full shadow) | (augmented patch) 26120 shadow / mask / shadow_free pairs [Download] | 540 shadow / mask / shadow_free pairs [Download]
ISTD (full shadow) | (augmented patch) 26120 shadow / mask / shadow_free pairs [Download] | 540 shadow / mask / shadow_free pairs [Download]
DeSOBA (instance shadow) | - | 160 shadow / 624 shadow masks [Download]
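The random-crop augmentation mentioned above has to cut the same window out of the shadow image, the mask, and the shadow-free image so that the triplet stays aligned. The sketch below illustrates that idea only: the function name `random_crop_triplet`, the crop size, and the number of crops per image are assumptions, since the settings that produced the 26,120 patches are not given here.

```python
import numpy as np


def random_crop_triplet(shadow, mask, shadow_free, size, rng):
    """Cut the same random size x size window out of all three images.

    shadow, shadow_free: arrays of shape (H, W, C); mask: shape (H, W).
    rng: a numpy.random.Generator, e.g. np.random.default_rng(0).
    """
    height, width = shadow.shape[:2]
    if height < size or width < size:
        raise ValueError("crop size exceeds image size")
    # Generator.integers excludes the upper bound, hence the +1.
    top = int(rng.integers(0, height - size + 1))
    left = int(rng.integers(0, width - size + 1))
    window = (slice(top, top + size), slice(left, left + size))
    # One shared window keeps the three crops pixel-aligned.
    return shadow[window], mask[window], shadow_free[window]
```

Passing an explicit generator such as `np.random.default_rng(0)` makes the crops reproducible, which helps when the patches are written to disk once and reused across runs.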

Training

  1. Download the training and testing datasets for your task (the testing set is used for on-the-fly testing with a 10-step DDIM solver) and place them in the folders shown below:
# downloading training data 
cd data && wget https://www.cis.jhu.edu/~kmei1/publics/shadow/datasets/aistd_train.zip
unzip aistd_train.zip && rm aistd_train.zip
cd ..

# downloading testing data
cd data && wget https://www.cis.jhu.edu/~kmei1/publics/shadow/datasets/aistd_test.zip
unzip aistd_test.zip && rm aistd_test.zip
cd ..

You should see the following file structure; otherwise, manually rename the directories to match it.

$ tree ./data --filelimit 3

./data
├── aistd_test
│   ├── mask  [540 entries exceeds filelimit, not opening dir]
│   ├── shadow  [540 entries exceeds filelimit, not opening dir]
│   └── shadow_free  [540 entries exceeds filelimit, not opening dir]
└── aistd_train
    ├── mask  [26120 entries exceeds filelimit, not opening dir]
    ├── shadow  [26120 entries exceeds filelimit, not opening dir]
    └── shadow_free  [26120 entries exceeds filelimit, not opening dir]

9 directories, 0 files
  2. Run the command below to train our model on full-shadow removal:
NCCL_P2P_DISABLE=1 CUDA_VISIBLE_DEVICES=2,3,4,5,6,7,8,9 accelerate launch --multi_gpu shadow_aistd_train.py
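Before launching training, it can help to confirm that the data folders match the layout from step 1. The following is a small standalone sketch, not part of the repository: `check_layout` is a hypothetical helper, and the only things it assumes are the directory names shown in the `tree` output above.

```python
from pathlib import Path

SPLITS = ("aistd_train", "aistd_test")
SUBDIRS = ("shadow", "mask", "shadow_free")


def check_layout(root):
    """Return a list of problems found under root; an empty list means OK."""
    problems = []
    for split in SPLITS:
        counts = {}
        for sub in SUBDIRS:
            folder = Path(root) / split / sub
            if not folder.is_dir():
                problems.append(f"missing directory: {folder}")
                continue
            counts[sub] = sum(1 for p in folder.iterdir() if p.is_file())
        # The three folders of a split should hold the same number of files.
        if len(counts) == len(SUBDIRS) and len(set(counts.values())) > 1:
            problems.append(f"unequal file counts in {split}: {counts}")
    return problems
```

Running `check_layout("./data")` from the repository root returns an empty list when all six folders exist and each split holds matching file counts; any returned string names the folder that needs renaming or re-downloading.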

Testing

The test script generates the shadow-free images in the experiments directory and calculates the PSNR values at the same time. Before running it, manually modify the checkpoint path in accelerator.load_state('experiments/state_244999.bin').

# Deshadow
NCCL_P2P_DISABLE=1 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch --multi_gpu shadow_aistd_test.py
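Since the checkpoint path has to be edited by hand, a small helper can locate the most recent checkpoint to paste in. This sketch assumes checkpoints follow the `state_<step>.bin` naming seen in the example path above; that pattern is inferred from a single example, and `latest_state` is a hypothetical helper that does not exist in the repository.

```python
import re
from pathlib import Path

# Assumed naming scheme, inferred from 'experiments/state_244999.bin'.
STATE_PATTERN = re.compile(r"state_(\d+)\.bin$")


def latest_state(directory):
    """Return the state_<step>.bin path with the largest step, or None."""
    best_step, best_path = -1, None
    for path in Path(directory).glob("state_*.bin"):
        match = STATE_PATTERN.search(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Step numbers are compared as integers rather than as strings, so `state_244999.bin` correctly ranks above `state_5000.bin`. `latest_state("experiments")` returns `None` when no matching checkpoint is found.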

For more detailed metrics, such as RMSE and SSIM in the shadow region, the non-shadow region, and the whole image, use evaluation_scripts/aistd_mae_evaluation.ipynb after generating the deshadowed images.
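For reference, PSNR in its conventional form is 10 * log10(MAX^2 / MSE). The sketch below shows that textbook definition only; the function name `psnr` and the default peak value of 255 are assumptions, and the test script may use a different value range or averaging scheme.

```python
import math

import numpy as np


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (assumed 255 here;
    use 1.0 for images scaled to [0, 1]).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError("images must have the same shape")
    mse = float(np.mean((pred - target) ** 2))
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher is better for PSNR, the opposite of the RMSE columns reported in the comparison table.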

Results

  • Our full-shadow removal results: AISTD | ISTD | SRD

  • Our instance-level shadow removal results: DeSOBA

Pretrained Models

You can find our pretrained models here.

Citations

You may want to cite:

@article{mei2024shadow,
  title={Latent Feature-Guided Diffusion Models for Shadow Removal},
  author={Mei, Kangfu and Figueroa, Luis and Lin, Zhe and Ding, Zhihong and Cohen, Scott and Patel, Vishal},
  year={2024}
}

License

This code is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, for non-commercial use only. Any commercial use of this code requires formal permission in advance.

Acknowledgement

This detailed README is inspired by SRFormer.
