attention-mechanism encoder-decoder imagecolorization machine-learning python recurrent-neural-networks tensorflow

Image-Colorization-Using-GANs

Translating an input image into a corresponding output image is a common problem in image processing, graphics, and vision. Although the underlying task is always the same, mapping pixels to pixels, these problems are often solved with application-specific techniques. Conditional adversarial networks are a general-purpose approach that appears to work well across a wide range of them. Automatic image colorization in particular has attracted interest for applications such as restoring aged or deteriorated photos.

Dataset

The dataset contains 7129 color RGB images and 7129 grayscale images of landscapes in JPG format. The images show streets, buildings, mountains, glaciers, trees, etc.; the color images and their corresponding grayscale versions are stored in two separate folders.

Landscape color and grayscale images Dataset

Processing Images

Pre-processing includes common image functions such as loading, resizing, random cropping, and rotation, along with color-space transformation functions such as rgb2lab and its inverse.
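
As a rough illustration of the cropping and rotation steps, the sketch below applies one random crop and one random rotation to a grayscale/color pair so the two images stay pixel-aligned. It uses NumPy rather than the repository's TensorFlow code, and the function name, the crop size of 128, and the restriction to multiples of 90 degrees are all assumptions made for this example.

```python
import numpy as np

def paired_random_crop_and_rotate(gray, color, crop_size, rng):
    """Apply the same random crop and 90-degree rotation to a grayscale/color pair."""
    height, width = gray.shape[:2]
    # Pick one crop origin and reuse it for both images so pixels stay aligned.
    top = int(rng.integers(0, height - crop_size + 1))
    left = int(rng.integers(0, width - crop_size + 1))
    gray_crop = gray[top:top + crop_size, left:left + crop_size]
    color_crop = color[top:top + crop_size, left:left + crop_size]
    # Rotate both crops by the same random multiple of 90 degrees (assumption).
    k = int(rng.integers(0, 4))
    return np.rot90(gray_crop, k), np.rot90(color_crop, k)

rng = np.random.default_rng(seed=0)
color = rng.random((150, 150, 3))            # stand-in for a loaded RGB image in [0, 1]
gray = color.mean(axis=-1, keepdims=True)    # stand-in for its grayscale counterpart
gray_aug, color_aug = paired_random_crop_and_rotate(gray, color, crop_size=128, rng=rng)
print(gray_aug.shape, color_aug.shape)
```

Drawing the crop origin and rotation once and reusing them for both images is the important part: augmenting the two images independently would break the pixel-to-pixel correspondence the model learns from.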

RGB -> LAB

import tensorflow as tf
import tensorflow_io as tfio

@tf.function()
def rgb2lab(target_img, isnormalized = True, normalize_lab = False, comp = ''):
  # Converts an RGB image to LAB; pass isnormalized=False for inputs in the 0-255 range
  if not isnormalized:
    target_img = tf.cast(target_img, tf.float32) / 255.0                        # normalizes the RGB image to the 0-1 range

  # rgb_to_lab takes an RGB image in normalized form
  target_img = tfio.experimental.color.rgb_to_lab(target_img)
  if normalize_lab:
    # comp must be 'vis' (visualization) or 'net' (network input)
    tf.Assert(tf.reduce_any(tf.equal(comp, ['vis', 'net'])), data=[comp], name='Lab_Normalization_Error')
    if comp == 'vis':
      target_img = (target_img + [0., 128., 128.]) / [100., 255., 255.]         # normalizes to the 0-1 range for visualizing the image
    else:
      target_img = target_img / [50., 127.5, 127.5] + [-1., 0., 0.]             # normalizes to the -1 to 1 range, which suits neural networks better
  return target_img

LAB -> RGB

@tf.function()
def lab2rgb(lab_img, isnormalized = False, comp = ''):
  # Converts a LAB image to RGB; pass isnormalized=True with comp='vis' or 'net' to undo the normalization first
  if isnormalized:
    tf.Assert(tf.reduce_any(tf.equal(comp, ['vis', 'net'])), data=[comp], name='Lab_Normalization_Error')
    if comp == 'vis':
      lab_img = lab_img * [100., 255., 255.] + [0., -128., -128.]               # back from the 0-1 range
    else:
      lab_img = (lab_img + [1., 0., 0.]) * [50., 127.5, 127.5]                  # back from the -1 to 1 range

  # lab_to_rgb takes a LAB image in unnormalized form
  rgb = tfio.experimental.color.lab_to_rgb(lab_img)
  return rgb
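
To make the 'net' normalization easier to check by hand, here is a small NumPy sketch of just the scaling arithmetic used above and its inverse. It skips the actual color-space conversion, and the helper names lab_to_net and net_to_lab are hypothetical, introduced only for this example.

```python
import numpy as np

LAB_SCALE = np.array([50.0, 127.5, 127.5])
LAB_SHIFT = np.array([-1.0, 0.0, 0.0])

def lab_to_net(lab):
    """Scale raw LAB values (L in [0, 100], a/b roughly in [-128, 127]) to about [-1, 1]."""
    return lab / LAB_SCALE + LAB_SHIFT

def net_to_lab(net):
    """Invert lab_to_net, recovering raw LAB values."""
    return (net - LAB_SHIFT) * LAB_SCALE

# Three hand-picked LAB pixels: darkest L, brightest L, and a mid-range value.
lab = np.array([[0.0, -127.5, 127.5],
                [100.0, 0.0, 0.0],
                [50.0, 63.75, -63.75]])
net = lab_to_net(lab)
restored = net_to_lab(net)
print(net)
```

The L channel is divided by 50 and shifted by -1 because it spans 0 to 100, while the a and b channels are already roughly centered on zero and only need dividing by 127.5.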

Model Architecture

Generator

The input (a grayscale image) passes through a series of layers that progressively downsample it until it reaches a bottleneck layer, after which the process is reversed and the latent representation is upsampled into a colored output image. Skip connections are added in the shape of a "U-Net": specifically, between each layer i and layer n-i, where n is the total number of layers. Each skip connection simply concatenates all channels at layer i with those at layer n-i.
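
The data flow described above can be sketched in NumPy without any learned layers. In this illustration, average pooling and nearest-neighbor repetition are stand-ins (assumptions) for the strided and transposed convolutions a real generator would use, and the function names are hypothetical; only the pairing and concatenation of encoder and decoder activations is the point.

```python
import numpy as np

def downsample(x):
    """Halve height and width with 2x2 average pooling (stand-in for a strided conv)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x):
    """Double height and width by nearest-neighbor repetition (stand-in for a transposed conv)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def unet_forward(x, depth):
    """Run `depth` downsampling steps, then `depth` upsampling steps with skip connections."""
    skips = []
    for _ in range(depth):
        skips.append(x)              # remember the encoder activation at this resolution
        x = downsample(x)
    for skip in reversed(skips):     # deepest encoder activation is reused first
        x = upsample(x)
        # Skip connection: concatenate decoder and encoder channels at the same resolution.
        x = np.concatenate([x, skip], axis=-1)
    return x

gray = np.zeros((64, 64, 1))         # stand-in for a 64x64 grayscale input
out = unet_forward(gray, depth=3)
print(out.shape)
```

Because nothing here changes the channel count except concatenation, the output simply accumulates one extra channel per skip connection; in a trained generator the convolutions in between would set the channel counts and a final layer would map to the color channels.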

Discriminator

The discriminator tries to classify each N x N patch in an image as real or fake. The final output of D is obtained by averaging the responses over all patches.
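
A minimal sketch of the averaging step follows, assuming the discriminator emits one logit per patch and that the average is taken over per-patch probabilities; the text does not specify either detail, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Map logits to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def patch_decision(patch_logits):
    """Average per-patch real/fake probabilities into a single score for the image."""
    return float(sigmoid(patch_logits).mean())

# A toy 2x2 grid of patch logits: one patch looks real, one looks fake, two are undecided.
patch_logits = np.array([[2.0, 0.0],
                         [0.0, -2.0]])
score = patch_decision(patch_logits)
print(score)
```

Scoring patches rather than the whole image keeps the discriminator focused on local texture and color consistency, and lets the same network be applied to images of different sizes.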

Custom GAN Model

This model combines the generator and the discriminator, calculates losses and metrics, performs gradient descent steps, and displays results.

Gist Link: GAN Architecture
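
The exact losses live in the linked gist and are not reproduced here. As a hedged illustration of how the two losses in a conditional GAN of this kind are commonly formed, the NumPy sketch below assumes a binary cross-entropy adversarial term plus an L1 reconstruction term for the generator; the function names and the l1_weight of 100 are assumptions, not values taken from this repository.

```python
import numpy as np

def bce_with_logits(logits, target):
    """Numerically stable binary cross-entropy between logits and a constant target (0 or 1)."""
    per_element = np.maximum(logits, 0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    return float(np.mean(per_element))

def discriminator_loss(real_logits, fake_logits):
    """Push real patches toward label 1 and generated patches toward label 0."""
    return bce_with_logits(real_logits, 1.0) + bce_with_logits(fake_logits, 0.0)

def generator_loss(fake_logits, generated, target, l1_weight=100.0):
    """Fool the discriminator while staying close to the ground-truth colors (assumed L1 term)."""
    adversarial = bce_with_logits(fake_logits, 1.0)
    l1 = float(np.mean(np.abs(target - generated)))
    return adversarial + l1_weight * l1

undecided = np.zeros((4, 4))                 # logits of 0 mean the discriminator is unsure
generated = np.full((8, 8, 2), 0.25)         # stand-in for predicted color channels
target = generated + 0.01                    # stand-in for ground truth, off by 0.01 everywhere
d_loss = discriminator_loss(undecided, undecided)
g_loss = generator_loss(undecided, generated, target)
print(d_loss, g_loss)
```

In a training step, each loss would be differentiated with respect to its own network's weights only, so the generator and discriminator are updated by separate gradient descent steps.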

Results

About

Image colorization using GANs: translating an input grayscale image into a colored one.

MIT License

Languages

Language: Jupyter Notebook 100.0%