captain-pool / image-correction-gan

GAN built as part of interview for applied computer vision research engineering intern @ rephrase.ai

Image Correction GAN

This GAN was built for an internship hiring problem at rephrase.ai. Given a set of images degraded by an unknown function, the task was to build a GAN that restores the images and brings them as close to the ground truth as possible.

Solution Implemented

To tackle this problem, I used a ResNet-like architecture for the generator and a VGG-like architecture for the discriminator. The model was trained on a compound loss function consisting of a pixelwise loss, a feature loss computed with a pretrained VGG16 model, and an adversarial loss.

Generator Architecture

The generator has a ResNet-like architecture with skip connections of length 3. Each residual block consists of 3 convolution blocks activated with LeakyReLU.
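
The residual block described above can be sketched in plain NumPy, so that it does not assume any particular deep learning framework. This is a minimal illustration rather than the repository's actual code: the 3x3 kernel size, the LeakyReLU slope of 0.2, and the helper names `leaky_relu`, `conv3x3` and `residual_block` are assumptions, and the scaling of the residual branch by 0.1 is taken from the `residual_scaling` entry in the hyperparameter table.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    """LeakyReLU activation (the slope value is an assumption)."""
    return np.where(x > 0, x, slope * x)

def conv3x3(x, w):
    """Same-padded 3x3 convolution (cross-correlation, no bias).

    x: feature map of shape (H, W, C_in)
    w: kernel of shape (3, 3, C_in, C_out)
    """
    h, wd, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, wd, w.shape[-1]))
    for i in range(3):
        for j in range(3):
            # Shifted window times the (C_in, C_out) slice of the kernel.
            out += padded[i:i + h, j:j + wd, :] @ w[i, j]
    return out

def residual_block(x, weights, residual_scaling=0.1):
    """Three conv + LeakyReLU blocks bridged by one skip connection.

    The convolutions must preserve the channel count so that the
    branch output can be added back to the input.
    """
    y = x
    for w in weights:
        y = leaky_relu(conv3x3(y, w))
    return x + residual_scaling * y

# Hypothetical usage with random weights on an 8x8 map with 4 channels.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))
weights = [rng.normal(size=(3, 3, 4, 4)) * 0.1 for _ in range(3)]
out = residual_block(x, weights)
```

Scaling the residual branch by a small factor keeps each block close to the identity mapping, which is a common way to stabilise training of deep residual generators.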

Discriminator Architecture

The discriminator network has a VGG-like architecture built from convolutions and batch normalization, activated with LeakyReLU.

Loss Function Used

The loss function of the generator is a linear combination of 3 losses:

  • Pixelwise MSE with respect to the ground truth
  • L1 perceptual loss on features obtained from the last convolution of the VGG16 model
  • Adversarial component of the generator equation
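
The combination of the three terms above might look like the following sketch. Several things here are assumptions, not details from the source: the weight names `w_pixel`, `w_feature` and `w_adv` are hypothetical (the source does not say how `eta` and `gamma` map onto the terms), the adversarial term is written as the common non-saturating form `-log D(G(x))`, and the VGG16 features are taken as precomputed arrays instead of being extracted inside the function.

```python
import numpy as np

def generator_loss(fake, real, fake_feats, real_feats, d_fake_prob,
                   w_pixel=1.0, w_feature=1.0, w_adv=1.0):
    """Linear combination of pixel, perceptual and adversarial losses.

    fake, real:             reconstructed and ground-truth images
    fake_feats, real_feats: precomputed feature maps of both images
    d_fake_prob:            discriminator probabilities that the
                            reconstructions are real, in (0, 1]
    The weights are hypothetical placeholders.
    """
    pixel = np.mean((fake - real) ** 2)                  # pixelwise MSE
    feature = np.mean(np.abs(fake_feats - real_feats))   # L1 on features
    adv = -np.mean(np.log(d_fake_prob + 1e-12))          # assumed GAN term
    return w_pixel * pixel + w_feature * feature + w_adv * adv
```

With a perfect reconstruction and a fully fooled discriminator all three terms vanish, so the loss approaches zero.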

Evaluation Metric Used: Peak Signal-to-Noise Ratio (PSNR)
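
For reference, PSNR is defined as 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value. A minimal NumPy sketch follows; the function name `psnr` and the default `max_value=1.0` (images scaled to [0, 1]) are assumptions, since the source does not state the pixel range it used.

```python
import numpy as np

def psnr(reference, reconstruction, max_value=1.0):
    """Peak Signal-to-Noise Ratio between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

For 8-bit images stored as integers in [0, 255], `max_value=255` would be passed instead.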

Hyperparameters Used

With the following set of hyperparameters, a mean PSNR of 36.166 was achieved by the end of 10,000 steps.

Parameter                              Value
Image Patch Size                       128 x 128
residual_scaling (Residual Scaling)    0.1
lrG                                    0.0001
lrD                                    0.0004
eta                                    eta_value
gamma                                  0.7

Sample

From left to right: Corrupted, Reconstructed, Original sample
