AshkanGanj / Instance-segmentation-with-MaskRCNN-on-Custom-dataset

Training a Mask R-CNN model for instance segmentation and object detection with TensorFlow on a custom dataset selected from Open Images.

License: MIT | Made with Python | Made with Jupyter

Instance-segmentation-with-MaskRCNN-on-Custom-dataset

In this project, I trained Mask R-CNN, a convolutional neural network architecture published in 2017. The model is well suited to instance and semantic segmentation. Pre-trained weights are available, but I went a step further and trained my own model on one of the 600 classes in the Google Open Images dataset. I chose cats as the segmentation target, because I love my cat :).

Dataset

image

A dataset of ~9 million varied images with rich annotations

The images are very diverse and often contain complex scenes with several objects (8.4 per image on average). The dataset contains image-level labels, object bounding boxes, object segmentations, visual relationships, localized narratives, and more. To download images I use the Python library openimages, which can download a particular class of the dataset. I download the annotations in XML format, so I have to convert them to bitmap masks before the Mask R-CNN load_mask() function can use them.
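The XML-to-bitmap conversion described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes the XML files follow a Pascal VOC style layout (`size`, `object`, `bndbox` elements), and the names `SAMPLE_XML`, `read_boxes`, and `boxes_to_masks` are hypothetical. It uses only the standard library and fills each bounding box to produce one binary mask per object, which is coarser than a true per-pixel segmentation mask.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC style; the real files may differ.
SAMPLE_XML = """
<annotation>
  <size><width>8</width><height>6</height></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>2</xmin><ymin>1</ymin><xmax>5</xmax><ymax>4</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return (width, height, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text.strip())
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    boxes = []
    for obj in root.iter("object"):
        coords = [
            int(float(obj.findtext(f"bndbox/{tag}")))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        ]
        boxes.append((obj.findtext("name"), *coords))
    return width, height, boxes


def boxes_to_masks(width, height, boxes):
    """Build one binary mask (rows of 0/1) per box, filled inside the box."""
    masks = []
    for _label, xmin, ymin, xmax, ymax in boxes:
        mask = [
            [1 if (xmin <= x < xmax and ymin <= y < ymax) else 0
             for x in range(width)]
            for y in range(height)
        ]
        masks.append(mask)
    return masks


width, height, boxes = read_boxes(SAMPLE_XML)
masks = boxes_to_masks(width, height, boxes)
for row in masks[0]:
    print("".join(str(v) for v in row))
```

If the project uses a Matterport-style Mask R-CNN implementation (an assumption), load_mask() would typically stack such masks into a boolean array of shape [height, width, instance count] and return it together with an array of class IDs.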

What is Image Segmentation

Image segmentation is the computer vision task of partitioning a digital image into multiple segments (sets of pixels, also known as image objects). Segmentation is used to locate objects and boundaries (lines, curves, etc.).

Two main types of image segmentation are relevant to Mask R-CNN:

  • Semantic Segmentation
  • Instance Segmentation

Semantic Segmentation

Semantic segmentation classifies each pixel into a fixed set of categories without differentiating object instances. In other words, semantic segmentation deals with the identification/classification of similar objects as a single class from the pixel level.

image

As shown in the image above, all objects were classified as a single entity (person). Semantic segmentation is otherwise known as background segmentation because it separates the subjects of the image from the background.

Instance Segmentation

Instance Segmentation, or Instance Recognition, deals with the correct detection of all objects in an image while also precisely segmenting each instance. It is, therefore, the combination of object detection, object localization, and object classification. In other words, this type of segmentation goes further and clearly distinguishes each individual object, even among objects of the same class.

As shown in the example image above, for Instance Segmentation, all objects are persons, but this segmentation process separates each person as a single entity. Instance segmentation is otherwise known as foreground segmentation because it accentuates the subjects of the image instead of the background.
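The difference between the two output formats can be shown on a toy grid. In the sketch below, a semantic mask marks every "person" pixel with 1, while an instance map gives each separate object its own ID. The instances are separated here with a simple connected-component flood fill, which is only an illustration of the output format and not how Mask R-CNN separates instances; the function name `label_instances` is hypothetical.

```python
from collections import deque


def label_instances(semantic):
    """Assign a distinct ID to each 4-connected blob of foreground pixels."""
    height, width = len(semantic), len(semantic[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if semantic[r][c] and not labels[r][c]:
                # New, unvisited object: flood-fill it with a fresh ID.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and semantic[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Semantic mask: both objects share the same class value (1).
semantic_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Instance map: same pixels, but each object has its own ID.
instance_map, num_instances = label_instances(semantic_mask)
for row in instance_map:
    print(row)
```

In the semantic mask the two blobs are indistinguishable; in the instance map the left blob carries one ID and the right blob another.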

Mask RCNN

Mask R-CNN is a Convolutional Neural Network (CNN) that was state of the art for image segmentation when it was introduced. This deep neural network detects objects in an image and generates a high-quality segmentation mask for each instance.

image

Mask R-CNN builds on Faster R-CNN. While Faster R-CNN has 2 outputs for each candidate object, a class label and a bounding-box offset, Mask R-CNN adds a third branch that outputs the object mask. The additional mask output is distinct from the class and box outputs, requiring the extraction of a much finer spatial layout of an object.

image

Mask R-CNN is an extension of Faster R-CNN and works by adding a branch that predicts an object mask for each Region of Interest (RoI), in parallel with the existing branch for bounding box recognition.
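To make the three per-RoI outputs concrete, here is a small sketch of how they might be represented and post-processed. It is an illustration under stated assumptions, not the code of any particular implementation: the names `RoIOutput`, `decode_box`, and `binarize` are hypothetical, the box decoding follows the commonly used R-CNN offset parameterization (center shift scaled by box size, log-scale width and height), and the 0.5 mask threshold is a typical default. Real implementations may order coordinates differently or rescale the offsets.

```python
import math
from dataclasses import dataclass


@dataclass
class RoIOutput:
    """The three outputs for one candidate object (hypothetical container)."""
    class_id: int        # class branch
    box_offset: tuple    # box branch: (dx, dy, dw, dh)
    mask_probs: list     # mask branch: 2D grid of per-pixel probabilities


def decode_box(anchor, offset):
    """Apply (dx, dy, dw, dh) to an anchor given as (cx, cy, w, h).

    Returns the refined box as corner coordinates (x1, y1, x2, y2).
    """
    cx, cy, w, h = anchor
    dx, dy, dw, dh = offset
    new_cx = cx + dx * w
    new_cy = cy + dy * h
    new_w = w * math.exp(dw)
    new_h = h * math.exp(dh)
    return (new_cx - new_w / 2, new_cy - new_h / 2,
            new_cx + new_w / 2, new_cy + new_h / 2)


def binarize(mask_probs, threshold=0.5):
    """Turn per-pixel probabilities into a boolean mask."""
    return [[p >= threshold for p in row] for row in mask_probs]


roi = RoIOutput(
    class_id=1,
    box_offset=(0.5, 0.0, 0.0, 0.0),
    mask_probs=[[0.2, 0.7], [0.5, 0.1]],
)
print(decode_box((10.0, 10.0, 4.0, 4.0), roi.box_offset))
print(binarize(roi.mask_probs))
```

The class and box outputs are a handful of numbers per RoI, whereas the mask output is a whole grid, which is why the mask branch needs a finer spatial layout than the other two.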

Results

Some results after training for 1 epoch

The model makes some errors, which is expected because I trained it for only 1 epoch.

(Result images 1 to 4)
