APISR: Anime Production Inspired Real-World Anime Super-Resolution (CVPR 2024)

APISR aims at restoring and enhancing low-quality low-resolution anime images and video sources with various degradations from real-world scenarios.

👀 Visualization | 🔥 Update | 🔧 Installation | 🏰 Model Zoo | ⚡ Inference | 🧩 Dataset Curation | 💻 Train

⭐ If you like APISR, please help star this repo. Thanks! 🤗

Visualization (Click them for the best view!) 👀

Update 🔥🔥🔥

  • Release the paper-version implementation of APISR
  • Release weights for different upscale factors (2x, 4x, and more)
  • Gradio demo (maybe online)

Installation 🔧

git clone git@github.com:Kiteretsu77/APISR.git
cd APISR

# Create conda env
conda create -n APISR python=3.10
conda activate APISR

# Install Pytorch and other packages needed
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt


# To make sure that tensorboard can run, I recommend the following commands from "https://github.com/pytorch/pytorch/issues/22676#issuecomment-534882021"
pip uninstall tb-nightly tensorboard tensorflow-estimator tensorflow-gpu tf-estimator-nightly
pip install tensorflow

# Install FFMPEG [only needed for the training and dataset curation stages; inference alone does not need ffmpeg] (the following is for Linux; Windows users can download ffmpeg from https://ffmpeg.org/download.html)
sudo apt install ffmpeg
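
After installation, a quick sanity check can confirm that the pinned packages are visible inside the conda environment. The following is a minimal sketch, not part of the APISR repository: `check_packages` is a hypothetical helper that only uses the Python standard library, and the CUDA check is attempted only when torch is actually installed.

```python
import importlib.util
from importlib import metadata


def check_packages(names):
    """Return {package: version string, or None if it is not installed}.

    Hypothetical helper for illustration; it does not import the packages.
    """
    report = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            report[name] = None  # not importable from this environment
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "unknown"  # importable, but no distribution metadata
    return report


if __name__ == "__main__":
    report = check_packages(["torch", "torchvision", "torchaudio"])
    for name, version in report.items():
        print(f"{name}: {version or 'MISSING'}")
    if report["torch"] is not None:
        import torch

        # The install command above pulls the cu118 wheels, so a GPU is expected.
        print("CUDA available:", torch.cuda.is_available())
```

If a package is reported as MISSING, re-running the pip commands above inside the activated APISR environment is the first thing to try.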

Gradio Fast Inference ⚡⚡⚡

The Gradio option does not require you to prepare the weights yourself, but it can only process one image at a time.

An online demo can be found at https://huggingface.co/spaces/HikariDawn/APISR.

python gradio_apisr.py

Regular Inference ⚡⚡

  1. Download the model weight from the model zoo and put it in the "pretrained" folder.

  2. Then run:

    python test_code/inference.py --input_dir XXX  --weight_path XXX  --store_dir XXX

    If you downloaded the paper weight, the default arguments of test_code/inference.py are enough to process the sample images in the "assets" folder.
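
To upscale several folders with the same weight, the command above can be wrapped in a small driver. This is only a sketch under stated assumptions: `build_inference_cmd` and `run_batches` are hypothetical helpers, the flags are the three shown above, and the weight path is whatever file you placed in "pretrained". With `dry_run=True` nothing is executed, so the commands can be inspected first.

```python
import subprocess
import sys
from pathlib import Path


def build_inference_cmd(input_dir, weight_path, store_dir):
    """Build the argument list for test_code/inference.py (flags from the README)."""
    return [
        sys.executable, "test_code/inference.py",
        "--input_dir", str(input_dir),
        "--weight_path", str(weight_path),
        "--store_dir", str(store_dir),
    ]


def run_batches(input_dirs, weight_path, output_root, dry_run=True):
    """Run inference once per input folder, writing to output_root/<folder name>."""
    cmds = []
    for input_dir in input_dirs:
        store_dir = Path(output_root) / Path(input_dir).name
        cmd = build_inference_cmd(input_dir, weight_path, store_dir)
        cmds.append(cmd)
        if not dry_run:
            # Must be launched from the repository root so the script path resolves.
            subprocess.run(cmd, check=True)
    return cmds
```

Calling `run_batches(["clips/ep01", "clips/ep02"], "pretrained/<your weight>.pth", "results")` returns the two commands; pass `dry_run=False` from the repository root to actually run them.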

Dataset Curation 🧩

Our dataset curation pipeline lives in the dataset_curation_pipeline folder.

You can collect your own dataset by feeding videos into the pipeline, which extracts the least compressed and most informative images from the video sources.

  1. Download the IC9600 weight (ck.pth) from https://drive.google.com/drive/folders/1N3FSS91e7FkJWUKqT96y_zcsG9CRuIJw and place it in the "pretrained/" folder (alternatively, pass a different --IC9600_pretrained_weight_path to collect.py in the next step)

  2. With a folder with video sources, you can execute the following to get a basic dataset (with ffmpeg installed):

    python dataset_curation_pipeline/collect.py --video_folder_dir XXX --save_dir XXX
  3. Once you have an image dataset with various aspect ratios and resolutions, you can run the following script

    Double-check full_patch_source, degrade_hr_dataset_path, and train_hr_dataset_path (these paths are used in the opt.py settings during the training stage)

    To decrease memory utilization and increase training efficiency, we pre-process all of the time-consuming pseudo-GT (train_hr_dataset_path) at the dataset preparation stage.

    However, to create natural input for prediction-oriented compression, the degradation in every epoch starts from the uncropped GT (full_patch_source), and the synthetic LR images are stored concurrently. The cropped HR GT dataset (degrade_hr_dataset_path) and the cropped pseudo-GT (train_hr_dataset_path) are fixed at the dataset preparation stage and are not modified during training.

    bash scripts/prepare_datasets.sh
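
The collection step in the list above can also be scripted. The sketch below is an illustration rather than repository code: `ffmpeg_available` and `build_collect_cmd` are hypothetical helpers, and the only flags used are the ones named above (`--video_folder_dir`, `--save_dir`, and the optional `--IC9600_pretrained_weight_path`).

```python
import shutil
import sys


def ffmpeg_available():
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def build_collect_cmd(video_folder_dir, save_dir, ic9600_weight=None):
    """Build the argument list for dataset_curation_pipeline/collect.py.

    ic9600_weight is only passed when the weight is not in the default
    "pretrained/" location described in step 1.
    """
    cmd = [
        sys.executable, "dataset_curation_pipeline/collect.py",
        "--video_folder_dir", str(video_folder_dir),
        "--save_dir", str(save_dir),
    ]
    if ic9600_weight is not None:
        cmd += ["--IC9600_pretrained_weight_path", str(ic9600_weight)]
    return cmd


if __name__ == "__main__":
    if not ffmpeg_available():
        print("ffmpeg not found on PATH; install it before running collect.py")
    # The folder names here are placeholders, not paths from the repository.
    print(" ".join(build_collect_cmd("my_videos", "my_dataset")))
```

Checking for ffmpeg up front avoids discovering the missing dependency partway through a long video folder.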

Train 💻

The whole training process can be done on a single RTX 3090/4090!

  1. Prepare a dataset (AVC/API) by following steps 2 and 3 in Dataset Curation

    In total, you will have 3 folders prepared before executing the following commands:

    --> full_patch_source: uncropped GT

    --> degrade_hr_dataset_path: cropped GT

    --> train_hr_dataset_path: cropped Pseudo-GT

  2. Train: Please check opt.py carefully to set up the parameters you want (modifying Frequently Changed Setting is usually enough)

    Step1 (Net L1 loss training): Run

    python train_code/train.py 

    The trained model weights will be inside the folder 'saved_models' (the same goes for checkpoints)

    Step2 (GAN Adversarial Training):

    1. Change opt['architecture'] in opt.py to "GRLGAN" and adjust the batch size if needed. BTW, I don't think personal training needs 300K iterations for the GAN stage. I used 300K to follow the same setting as AnimeSR and VQDSR, but 100K ~ 130K should give a decent visual result.

    2. Following previous works, GAN training should start from the L1-loss pre-trained network, so please pass a --pretrained_path (the default path below should be fine)

    python train_code/train.py --pretrained_path saved_models/grl_best_generator.pth 
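
Before launching either training step, it can help to confirm that the three dataset folders listed in step 1 exist and are not empty. The sketch below assumes, hypothetically, that the paths are available in a plain dict keyed by `full_patch_source`, `degrade_hr_dataset_path`, and `train_hr_dataset_path`; how opt.py actually stores them should be checked in the repository. `find_dataset_problems` is an illustrative helper, not repository code.

```python
from pathlib import Path

# Folder keys named in step 1 above; treating them as dict keys is an assumption.
REQUIRED_KEYS = (
    "full_patch_source",        # uncropped GT
    "degrade_hr_dataset_path",  # cropped GT
    "train_hr_dataset_path",    # cropped pseudo-GT
)


def find_dataset_problems(opt):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for key in REQUIRED_KEYS:
        value = opt.get(key)
        if not value:
            problems.append(f"{key}: not set")
            continue
        folder = Path(value)
        if not folder.is_dir():
            problems.append(f"{key}: {folder} is not a directory")
        elif not any(folder.iterdir()):
            problems.append(f"{key}: {folder} is empty")
    return problems
```

The same kind of check applies to the file passed as `--pretrained_path` before Step2, since GAN training starts from the L1-loss weights.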

Related Projects

  1. Fast Anime SR acceleration: https://github.com/Kiteretsu77/FAST_Anime_VSR
  2. My previous paper (VCISR - WACV2024) as the baseline method: https://github.com/Kiteretsu77/VCISR-official

Citation

Please cite us if our work is useful for your research.

@article{wang2024apisr,
  title={APISR: Anime Production Inspired Real-World Anime Super-Resolution},
  author={Wang, Boyang and Yang, Fengyu and Yu, Xihang and Zhang, Chao and Zhao, Hanbin},
  journal={arXiv preprint arXiv:2403.01598},
  year={2024}
}

Disclaimer

This project is released for academic use only. We disclaim responsibility for the distribution of the dataset. Users are solely liable for their actions. The project contributors are not legally affiliated with, nor accountable for, users' behaviors.

License

This project is released under the GPL 3.0 license.

Contact

If you have any questions, please feel free to contact me at hikaridawn412316@gmail.com or boyangwa@umich.edu.
