nima7920/image-blending

{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "## Abstract\n",
    "In this project we implement two methods for image blending: Poisson blending and multiresolution\n",
    "blending with a Laplacian stack. The main functions for these methods are implemented in\n",
    "`poisson_blend.py` and `multiresolution_blend.py`, and the code for testing them on\n",
    "sample images is in `main.py`."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "### Poisson Blending\n",
    "The target and source images both have $n$ pixels, and we are also given a binary mask. We create a vector $b$\n",
    "with $n$ elements, each corresponding to one pixel of the target.\n",
    "If a pixel is inside the mask, its element in $b$ is the Laplacian of the\n",
    "source at that pixel. Otherwise, it is the value of that pixel in the target.\n",
    "After creating $b$, we also create an $n \\times n$ sparse matrix $A$,\n",
    "which is the coefficient matrix of the equations that determine the final image.\n",
    "Since $A$ is sparse, we store it with the sparse matrix data structures in `scipy.sparse`.\n",
    "Finally, we solve the linear system $Af=b$ with a sparse solver. The resulting $n \\times 1$ vector\n",
    "$f$ contains the value of each pixel in the blended image.\n"
   ],
   "metadata": {
    "collapsed": false
   }
  },
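  {
   "cell_type": "markdown",
   "source": [
    "As a sketch of the system being solved (the index notation is an assumption made here for illustration: pixels are numbered row by row, $w$ is the image width, and $s$, $t$ are the flattened source and target), each pixel $i$ contributes one row of $Af=b$:\n",
    "\n",
    "$$f_i = t_i \\quad \\text{for } i \\text{ outside the mask}$$\n",
    "\n",
    "$$4f_i - f_{i-1} - f_{i+1} - f_{i-w} - f_{i+w} = 4s_i - s_{i-1} - s_{i+1} - s_{i-w} - s_{i+w} \\quad \\text{for } i \\text{ inside the mask}$$\n",
    "\n",
    "Inside the mask the blended image therefore has the same Laplacian as the source, while outside it equals the target, which acts as the boundary condition."
   ],
   "metadata": {
    "collapsed": false
   }
  },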
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import cv2\n",
    "import scipy.sparse.linalg as sparse_la\n",
    "from scipy import sparse\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
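  {
   "cell_type": "markdown",
   "source": [
    "The first function computes the Laplacian of an image by filtering it with the 4-neighbour kernel (positive at the centre, so this is the negated discrete Laplacian):"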
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def get_laplacian(img):\n",
    "    kernel = np.asarray([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype='float32')\n",
    "    result = cv2.filter2D(img, ddepth=-1, kernel=kernel)\n",
    "    return result\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The function `generate_matrix_b` creates the vector $b$ as described at the beginning:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def generate_matrix_b(source, target, mask):\n",
    "    # flatten everything row by row so that pixel (y, x) has index y * w + x\n",
    "    source_laplacian_flatten = get_laplacian(source).flatten('C')\n",
    "    target_flatten = target.flatten('C')\n",
    "    mask_flatten = mask.flatten('C')\n",
    "    # inside the mask: Laplacian of the source; outside: target pixel value\n",
    "    b = mask_flatten * source_laplacian_flatten + (1 - mask_flatten) * target_flatten\n",
    "    return b\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "`generate_matrix_A` builds the coefficient matrix $A$ as lists of values, column indices and row indices:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def generate_matrix_A(mask):\n",
    "    # Assumes a binary mask that does not touch the image border, so every\n",
    "    # masked pixel has all four neighbours inside the image.\n",
    "    data, cols, rows = [], [], []\n",
    "    w = mask.shape[1]\n",
    "    mask_flatten = mask.flatten('C')\n",
    "    outside = np.where(mask_flatten == 0)[0]\n",
    "    inside = np.where(mask_flatten == 1)[0]\n",
    "\n",
    "    # pixels outside the mask: identity rows (a single 1 on the diagonal)\n",
    "    data.extend([1.0] * outside.size)\n",
    "    rows.extend(outside.tolist())\n",
    "    cols.extend(outside.tolist())\n",
    "\n",
    "    # pixels inside the mask: 4 on the diagonal\n",
    "    data.extend([4.0] * inside.size)\n",
    "    rows.extend(inside.tolist())\n",
    "    cols.extend(inside.tolist())\n",
    "\n",
    "    # and -1 for the left, right, upper and lower neighbours\n",
    "    for offset in (-1, 1, -w, w):\n",
    "        data.extend([-1.0] * inside.size)\n",
    "        rows.extend(inside.tolist())\n",
    "        cols.extend((inside + offset).tolist())\n",
    "    return data, cols, rows"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The next function solves the sparse linear system $Af=b$ and returns $f$ reshaped to the image size:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def solve_sparse_linear_equation(data, cols, rows, b, h, w):\n",
    "    sparse_matrix = sparse.csc_matrix((data, (rows, cols)), shape=(h * w, h * w), dtype='float32')\n",
    "    f = sparse_la.spsolve(sparse_matrix, b)\n",
    "    f = np.reshape(f, (h, w)).astype('float32')\n",
    "    return f"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
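  {
   "cell_type": "markdown",
   "source": [
    "A minimal sketch of the same triplet-based assembly on a toy problem, assuming a hypothetical $3 \\times 3$ image whose centre pixel is the only masked one and whose source Laplacian is zero there. The names `center` and `border` are illustrative and not part of `poisson_blend.py`.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from scipy import sparse\n",
    "from scipy.sparse.linalg import spsolve\n",
    "\n",
    "h, w = 3, 3\n",
    "center = 1 * w + 1  # flat index of the only masked pixel\n",
    "border = [i for i in range(h * w) if i != center]\n",
    "\n",
    "# identity rows for the known (unmasked) pixels\n",
    "rows = list(border)\n",
    "cols = list(border)\n",
    "data = [1.0] * len(border)\n",
    "\n",
    "# Laplacian row for the masked pixel: 4 on the diagonal, -1 per neighbour\n",
    "rows += [center] * 5\n",
    "cols += [center, center - 1, center + 1, center - w, center + w]\n",
    "data += [4.0, -1.0, -1.0, -1.0, -1.0]\n",
    "\n",
    "A = sparse.csc_matrix((data, (rows, cols)), shape=(h * w, h * w))\n",
    "\n",
    "# right-hand side: target values outside the mask, source Laplacian (0 here) inside\n",
    "b = np.arange(h * w, dtype='float64')\n",
    "b[center] = 0.0\n",
    "\n",
    "f = spsolve(A, b).reshape(h, w)\n",
    "```\n",
    "\n",
    "With a zero right-hand side the centre pixel becomes the average of its four neighbours, so here `f[1, 1]` is `(1 + 3 + 5 + 7) / 4 = 4`."
   ],
   "metadata": {
    "collapsed": false
   }
  },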
  {
   "cell_type": "markdown",
   "source": [
    "The final function `blend_images` takes the source, target and mask as input and blends each colour channel separately, using the functions above:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def blend_images(source, target, mask):\n",
    "    h, w = source.shape[0], source.shape[1]\n",
    "    source_b, source_g, source_r = cv2.split(source)\n",
    "    target_b, target_g, target_r = cv2.split(target)\n",
    "    data, cols, rows = generate_matrix_A(mask)\n",
    "    b_b = generate_matrix_b(source_b, target_b, mask)\n",
    "    b_g = generate_matrix_b(source_g, target_g, mask)\n",
    "    b_r = generate_matrix_b(source_r, target_r, mask)\n",
    "    blended_b = solve_sparse_linear_equation(data, cols, rows, b_b, h, w)\n",
    "    blended_g = solve_sparse_linear_equation(data, cols, rows, b_g, h, w)\n",
    "    blended_r = solve_sparse_linear_equation(data, cols, rows, b_r, h, w)\n",
    "    result = cv2.merge((blended_b, blended_g, blended_r))\n",
    "    return result"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "### Multiresolution Blending and Feathering\n",
    "We blend two given images using Laplacian stacks and feathering.\n",
    "The algorithm is simple: first we create a Gaussian stack for each image, then derive a\n",
    "Laplacian stack from each Gaussian stack. We also create a stack of masks for blending.\n",
    "Then we blend the images at each level of the stack using the corresponding mask to get a blended stack. Finally,\n",
    "we sum all images in the blended stack to get the final result.\n"
   ],
   "metadata": {
    "collapsed": false
   }
  },
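  {
   "cell_type": "markdown",
   "source": [
    "In symbols (notation assumed here for illustration), let $L^{(1)}_k$ and $L^{(2)}_k$ be level $k$ of the two Laplacian stacks and $M_k$ the mask at level $k$, for $k = 0, \\dots, N-1$. The blended result $R$ is then:\n",
    "\n",
    "$$R = \\sum_{k=0}^{N-1} \\left[ M_k L^{(1)}_k + (1 - M_k) L^{(2)}_k \\right]$$\n",
    "\n",
    "All products are element-wise, so each level is a per-pixel weighted mix of the two images."
   ],
   "metadata": {
    "collapsed": false
   }
  },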
  {
   "cell_type": "markdown",
   "source": [
    "The first function uses `cv2.GaussianBlur` from OpenCV to blur a given image:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import cv2\n",
    "import matplotlib.pyplot as plt\n",
    "import scipy.signal as signal\n",
    "\n",
    "\n",
    "def blur_img(img, kernel_size, sigma):\n",
    "    result = cv2.GaussianBlur(img, (kernel_size, kernel_size), sigma)\n",
    "    return result"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The next two functions generate the Gaussian and Laplacian stacks for a given image:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def generate_gaussian_stack(img, kernel_size, sigma, stack_size):\n",
    "    result = []\n",
    "    result.append(img)\n",
    "    temp_img = img.copy()\n",
    "    for i in range(1, stack_size):\n",
    "        temp_img = blur_img(temp_img, kernel_size, sigma)\n",
    "        result.append(temp_img)\n",
    "    return result\n",
    "\n",
    "def generate_laplacian_stack(img, kernel_size, sigma, stack_size):\n",
    "    gaussian_pyramid = generate_gaussian_stack(img, kernel_size, sigma, stack_size)\n",
    "    result = []\n",
    "    for i in range(0, stack_size - 1):\n",
    "        temp = gaussian_pyramid[i].copy() - gaussian_pyramid[i + 1].copy()\n",
    "        result.append(temp)\n",
    "    result.append(gaussian_pyramid[-1].copy())\n",
    "    return result\n",
    "\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
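  {
   "cell_type": "markdown",
   "source": [
    "A small sketch of why summing a Laplacian stack recovers the original image: the differences telescope. It assumes `scipy.ndimage.gaussian_filter` as a stand-in for `blur_img`, a random test image, and illustrative values for the number of levels and sigma.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from scipy.ndimage import gaussian_filter\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "img = rng.random((32, 32)).astype('float32')\n",
    "\n",
    "# Gaussian stack: repeated blurring at full resolution (no downsampling)\n",
    "gaussian_stack = [img]\n",
    "for _ in range(4):\n",
    "    gaussian_stack.append(gaussian_filter(gaussian_stack[-1], sigma=2))\n",
    "\n",
    "# Laplacian stack: differences of consecutive levels, then the last blurred level\n",
    "laplacian_stack = [g0 - g1 for g0, g1 in zip(gaussian_stack[:-1], gaussian_stack[1:])]\n",
    "laplacian_stack.append(gaussian_stack[-1])\n",
    "\n",
    "# summing all levels cancels every intermediate blur\n",
    "reconstructed = np.sum(laplacian_stack, axis=0)\n",
    "max_error = float(np.abs(reconstructed - img).max())\n",
    "```\n",
    "\n",
    "Here `max_error` is only floating-point rounding noise, which is why the blended stack can simply be summed to form the final image."
   ],
   "metadata": {
    "collapsed": false
   }
  },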
  {
   "cell_type": "markdown",
   "source": [
    "The function `generate_masks` creates a stack of masks. Each element of the stack is the result of\n",
    "blurring the previous element, so higher levels have stronger feathering:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def generate_masks(mask, kernel_size, sigma, size):\n",
    "    result = []\n",
    "    new_mask = blur_img(mask, kernel_size, sigma)\n",
    "    result.append(new_mask)\n",
    "    for i in range(size):\n",
    "        new_mask = blur_img(new_mask, kernel_size, sigma)\n",
    "        # plt.imshow(new_mask)\n",
    "        # plt.show()\n",
    "        result.append(new_mask)\n",
    "\n",
    "    return result\n",
    "\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The next function creates a stack of blended images, as described\n",
    "at the beginning:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def generate_blended_stack(img1, img2, masks, kernel_size, sigma, stack_size):\n",
    "    result = []\n",
    "    laplacian_stack1 = generate_laplacian_stack(img1, kernel_size, sigma, stack_size)\n",
    "    laplacian_stack2 = generate_laplacian_stack(img2, kernel_size, sigma, stack_size)\n",
    "    for i in range(stack_size):\n",
    "        blended_img = laplacian_stack1[i] * masks[i] + laplacian_stack2[i] * (1 - masks[i])\n",
    "        result.append(blended_img)\n",
    "    return result"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The final function `blend_images` takes two images, a stack of masks, a kernel size and sigma for blurring, and the stack size\n",
    "as input, and returns the blended image:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def blend_images(img1, img2, masks, kernel_size, sigma, stack_size):\n",
    "    blended_stack = generate_blended_stack(img1, img2, masks, kernel_size, sigma, stack_size)\n",
    "    result = blended_stack[0].copy()\n",
    "    for i in range(1, stack_size):\n",
    "        result = result + blended_stack[i]\n",
    "    return result\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The function `convert_from_float32_to_uint8` linearly rescales an image so that its minimum maps to 0\n",
    "and its maximum maps to 255, then converts the pixels to 8-bit integers:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "def convert_from_float32_to_uint8(img):\n",
    "    max_value, min_value = np.max(img), np.min(img)\n",
    "    # linear map a * img + b that sends min_value to 0 and max_value to 255\n",
    "    a = 255 / (max_value - min_value)\n",
    "    b = 255 - a * max_value\n",
    "    result = (a * img + b).astype(np.uint8)\n",
    "    return result"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
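  {
   "cell_type": "markdown",
   "source": [
    "To check the linear map used above, write $v_{\\min}$ and $v_{\\max}$ for the smallest and largest pixel values (symbols introduced here for illustration). With $a = 255 / (v_{\\max} - v_{\\min})$ and $b = 255 - a\\,v_{\\max}$:\n",
    "\n",
    "$$a\\,v_{\\min} + b = 255 - a\\,(v_{\\max} - v_{\\min}) = 0, \\qquad a\\,v_{\\max} + b = 255$$\n",
    "\n",
    "So the darkest pixel maps to 0 and the brightest to 255 whatever the input range is. A constant image ($v_{\\max} = v_{\\min}$) would cause a division by zero."
   ],
   "metadata": {
    "collapsed": false
   }
  },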
  {
   "cell_type": "markdown",
   "source": [
    "### main.py\n",
    "In the main file, after reading the images, we scale them to the range [0,1], apply the functions above, and rescale the result\n",
    "back to the range [0,255]. This is done for the sample images of both Poisson blending\n",
    "and multiresolution blending."
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "Test code for Poisson blending:\n"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import cv2\n",
    "import poisson_blend\n",
    "import multiresolution_blend\n",
    "\n",
    "''' applying poisson blending '''\n",
    "# reading images and scaling them to [0,1]\n",
    "img1 = cv2.imread(\"inputs/cards.jpg\")\n",
    "img2 = cv2.imread(\"inputs/desk.jpg\")\n",
    "\n",
    "img1 = (img1 / 255).astype('float32')\n",
    "img2 = (img2 / 255).astype('float32')\n",
    "\n",
    "source = img1[195:485, 355:715, :].astype('float32')\n",
    "target = img2[195:485, 355:715, :].astype('float32')\n",
    "\n",
    "mask = np.zeros((290, 360), dtype='float32')\n",
    "mask[5:-5, 5:-5] = 1\n",
    "\n",
    "result = poisson_blend.blend_images(source, target, mask) * 255\n",
    "# clamp to the valid intensity range before converting to 8-bit\n",
    "result = np.clip(result, 0, 255).astype('uint8')\n",
    "final_result = (img2.copy() * 255).astype('uint8')\n",
    "final_result[195:485, 355:715, :] = result\n",
    "cv2.imwrite(\"outputs/cards-on-desk.jpg\", final_result)"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "Test code for multiresolution blending:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "''' applying multiresolution blending '''\n",
    "img1 = cv2.imread(\"inputs/tree_spring.jpg\")\n",
    "img2 = cv2.imread(\"inputs/tree_fall.jpg\")\n",
    "\n",
    "img_float1 = (img1 / 255).astype('float32')\n",
    "img_float2 = (img2 / 255).astype('float32')\n",
    "\n",
    "kernel_size, sigma, n = 151, 30, 10\n",
    "# use the float image here: subtracting uint8 arrays in the Laplacian stack would wrap around\n",
    "img_list = multiresolution_blend.generate_gaussian_stack(img_float1, kernel_size, sigma, n)\n",
    "lap_list = multiresolution_blend.generate_laplacian_stack(img_float1, kernel_size, sigma, n)\n",
    "\n",
    "''' creating masks '''\n",
    "h, w = img1.shape[0], img1.shape[1]\n",
    "mask = np.zeros(img1.shape, dtype='float32')\n",
    "mask[:, 0:w//2] = 1\n",
    "masks = multiresolution_blend.generate_masks(mask, 151, 30, n)\n",
    "\n",
    "result = multiresolution_blend.blend_images(img_float1, img_float2, masks, kernel_size, sigma, n)\n",
    "result = multiresolution_blend.convert_from_float32_to_uint8(result)\n",
    "cv2.imwrite(\"outputs/tree-spring-fall.jpg\", result)"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The results of the code above are written to the `outputs` folder."
   ],
   "metadata": {
    "collapsed": false
   }
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}