TSSR-GAN

Increasing Video Spatio‑Temporal Resolution with Temporally Coherent GANs


Table of Contents

  1. Setup
  2. Inference
  3. Training

This repository contains code and material for the TSSR-GAN project, a GAN for temporal and spatial super-resolution. Check out this PDF for implementation details and comparisons with other state-of-the-art approaches. Shoutout to the authors of DAIN and TecoGAN: TSSR-GAN reuses many modules from these two networks.

Check out a 1080p version of this 4x spatial and 4x temporal upscaling example here, as well as many more frame-by-frame comparisons in the aforementioned PDF.

[Video: Comparison720.mp4]

Included in this repository:

  • Setup guide
  • Scripts for data download and preparation
  • Scripts to train TSSR-GAN
  • Scripts for inference and benchmarking
  • Google Colab Notebook

Everything is implemented in Python 3 and tested with PyTorch 1.4 on CUDA 10.0 and CUDA 10.1. Newer PyTorch versions (>= 1.4) should also work, but are untested. Python 2.x is not supported.


Setup

  1. Clone the repository: git clone https://github.com/cestcedric/TSSR-GAN.git
  2. Create a Python environment and install the packages listed in requirements.txt. The list also includes packages for metric computation, and probably a few that are no longer necessary. Any PyTorch version >= 1.4 should work, but the CUDA version used by PyTorch must match your NVCC version.
  3. Compile PWCNet and DAIN modules (make sure that your PyTorch CUDA version and NVCC version match): scripts/setup.sh
  4. Download model weights: scripts/getModelWeights.sh

As an alternative to step 3, you can install precompiled modules for PyTorch 1.4.0 with CUDA 10.0 on Python 3.6: scripts/setup_precompiled.sh. Adapting to other Python versions should be fairly easy: just change the version in the script.
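Since a mismatch between PyTorch's CUDA version and the local NVCC is the most common reason step 3 fails, it is worth confirming the two versions up front. A minimal check using only standard PyTorch and the CUDA toolkit (the script itself is not part of the repository):

# check_cuda.py -- verify that PyTorch's CUDA build matches the local NVCC.
import subprocess

import torch

# CUDA version PyTorch was built against, e.g. "10.0" or "10.1"
print("PyTorch CUDA:", torch.version.cuda)

# CUDA version of the compiler that will build the PWCNet/DAIN modules
out = subprocess.run(["nvcc", "--version"],
                     stdout=subprocess.PIPE,
                     universal_newlines=True).stdout
for line in out.splitlines():
    if "release" in line:
        print("NVCC:", line.strip())  # e.g. "release 10.0, V10.0.130"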


Inference

Put the start and end frames of each sequence into this structure:

main_dir
    sequence_1
        start.png, end.png
    sequence_2
        start.png, end.png

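If your frame pairs live somewhere else, a short script can assemble this layout. A minimal sketch: the source paths and pair list below are made-up examples, and only the resulting main_dir structure matters to interpolate.py:

# make_sequences.py -- arrange frame pairs into the layout shown above.
# Sketch only: source paths are hypothetical; not part of the repository.
import shutil
from pathlib import Path

main_dir = Path("main_dir")

# (start frame, end frame) pairs from wherever your data lives
pairs = [
    ("frames/clip_a/0001.png", "frames/clip_a/0002.png"),
    ("frames/clip_b/0001.png", "frames/clip_b/0002.png"),
]

for i, (start, end) in enumerate(pairs, 1):
    seq = main_dir / "sequence_{}".format(i)
    seq.mkdir(parents=True, exist_ok=True)
    shutil.copy(start, seq / "start.png")
    shutil.copy(end, seq / "end.png")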
Then start the super-resolution with:

python interpolate.py \
--input_dir /path/to/main_dir/ \
--output_dir /path/to/results \
--weights model_weights/pretrained/best_GEN.pth \
--timestep 0.5

A time step of 0.5 means one intermediate frame, so start <-0.5-> intermediate <-0.5-> end. For 9 intermediate frames use time step = 1/(intermediate frames + 1) = 1/(9+1) = 0.1.
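The same relation as a tiny helper (not part of the repository), if you prefer to compute the flag value in code:

def timestep(n_intermediate):
    """Time step for interpolate.py given the desired number of intermediate frames."""
    return 1.0 / (n_intermediate + 1)

assert timestep(1) == 0.5   # start <-0.5-> intermediate <-0.5-> end
assert timestep(9) == 0.1   # 9 intermediate frames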


Training

Don't train on Google Colab unless you have a paid version or something: with the free tier, it takes way too long to get results before the automatic disconnection.

Anyway, there are loads of parameters that can be set for training; see utils/my_args.py or run python3 train.py --help.

The training data can be stored however you like: randomly dispersed, all in one directory, whatever. BUT you have to provide a directory with trainlist.txt and validlist.txt, which look something like this for sequences with 3 frames (a generation sketch follows the example):

seq1/frame1.png;seq1/frame2.png;seq1/frame3.png
seq2/frame1.png;seq2/frame2.png;seq2/frame3.png
seq3/frame1.png;seq3/frame2.png;seq3/frame3.png
seq4/frame1.png;seq4/frame2.png;seq4/frame3.png
seq5/frame1.png;seq5/frame2.png;seq5/frame3.png
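Rather than writing these lists by hand, you can generate them. A sketch, assuming one subdirectory per sequence and frame filenames whose lexicographic order equals their temporal order; the 90/10 split is an arbitrary example, and the script is not part of the repository:

# make_lists.py -- write trainlist.txt / validlist.txt for the loader.
from pathlib import Path

dataset = Path("path/to/dataset")
sequences = sorted(d for d in dataset.iterdir() if d.is_dir())

lines = []
for seq in sequences:
    frames = sorted(seq.glob("*.png"))
    # paths are relative to the dataset root, joined by semicolons
    lines.append(";".join(str(f.relative_to(dataset)) for f in frames))

split = int(0.9 * len(lines))  # arbitrary 90/10 train/validation split
(dataset / "trainlist.txt").write_text("\n".join(lines[:split]) + "\n")
(dataset / "validlist.txt").write_text("\n".join(lines[split:]) + "\n")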

Start training using:

python3 train.py \
--pretrained_generator path/to/pretrained.pth \
--numEpoch 20 \
--lengthEpoch 5000 \
--datasetPath path/to/dataset \
--max_img_height 512 \
--max_img_width 512 \
--interpolationsteps 5 \
--improved_estimation 1 \
--gan \
--depth \
--flow \
--spatial \
--temporal \
--loss_pingpong \
--loss_layer \
--loss_perceptual \
--debug_output \
--tb_experiment training \
--debug_output_dir training_output/training \
--warmup_discriminator \
--warmup_boost 4 \
--no_cudnn_benchmark

--interpolationsteps denotes the number of newly generated intermediate frames, which means interpolationsteps = sequence length - 2.
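So the sequence length in your list files has to match --interpolationsteps. A quick sanity check, as a minimal sketch with the same placeholder dataset path as above:

# Verify that trainlist.txt is consistent with --interpolationsteps.
interpolationsteps = 5

with open("path/to/dataset/trainlist.txt") as f:
    for n, line in enumerate(f, 1):
        frames = line.strip().split(";")
        # sequence length = interpolationsteps + 2 (start and end frames)
        assert len(frames) == interpolationsteps + 2, \
            "line {}: expected {} frames, got {}".format(
                n, interpolationsteps + 2, len(frames))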


ENJOY!
