krishnacharya / OCR-devnagri

A convolutional neural network (CNN) based OCR for the Devanagari script

About

An Optical Character Recognition (OCR) system for the Devanagari script, built with Convolutional Neural Networks (CNNs).

Requirements

An input image may contain more than one character, so this is a multi-label, multi-class classification problem.
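To make the multi-label setup concrete, here is a minimal sketch of how a target vector for one image might be encoded. The class count of 128 comes from the output layer described in this README; the helper name `to_multi_hot` and the class ids used are made up for illustration and are not taken from the repository.

```python
# Class count taken from the README's 128-node output layer.
NUM_CLASSES = 128


def to_multi_hot(class_indices, num_classes=NUM_CLASSES):
    """Return a 0/1 vector with a 1 at every class present in the image."""
    target = [0] * num_classes
    for idx in class_indices:
        if not 0 <= idx < num_classes:
            raise ValueError(f"class index {idx} out of range")
        target[idx] = 1
    return target


# An image containing two characters with (hypothetical) class ids 3 and 17.
label = to_multi_hot([3, 17])
print(sum(label), label[3], label[17])  # 2 1 1
```

Unlike a one-hot vector, several positions can be 1 at once, which is what lets a single image carry more than one character label.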

Neural-Net Architecture

The output layer of the network contains 128 nodes, one for each Devanagari character.
The input layer is a convolutional layer that takes images of dimension 64*64.
The hidden layers consist of convolutional and max-pooling layers with ReLU activation functions.
A sigmoid activation function is used in the last layer, with a threshold of 0.5 per character. Softmax is not suited to multi-label classification because its outputs sum to 1, so the classes compete with each other instead of being scored independently.
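The difference between the two output activations can be seen on a toy example. The sketch below uses plain Python and a made-up 4-class logit vector rather than the real 128-node layer. It assumes a score counts as "present" when it is greater than or equal to 0.5, since the README does not say whether the comparison is strict.

```python
import math


def sigmoid(z):
    """Independent per-class probability."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Probabilities that compete with each other and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_labels(logits, threshold=0.5):
    """Indices whose sigmoid probability reaches the threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]


# Made-up logits: classes 0 and 2 are both strongly present.
logits = [3.0, -2.0, 3.0, -4.0]

print(predict_labels(logits))  # [0, 2]

# With softmax the two present classes split the probability mass,
# so neither reaches 0.5 and the same threshold finds nothing.
print([i for i, p in enumerate(softmax(logits)) if p >= 0.5])  # []
```

With sigmoid, each character is decided on its own, so any number of characters can pass the threshold in the same image.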

Preprocessing

A Gaussian blur filter is used for noise removal.
Binarization: images are converted to grayscale, then Otsu's method is used to convert the grayscale image to a binary one.
The images are resized to 64*64.
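As an illustration of the binarization step only, here is a minimal pure-Python sketch of Otsu's method, which picks the threshold that maximizes the between-class variance of the grayscale histogram. It is not the repository's code: a library routine (for example OpenCV's `cv2.threshold` with the `THRESH_OTSU` flag) would normally be used, and the blur and resize steps are left out. The function names and the choice of mapping bright pixels to 255 are assumptions.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    `pixels` is a flat sequence of integers in the range 0-255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # intensity sum of the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray_rows):
    """Map a 2-D grayscale image (list of rows) to 0/255 values."""
    flat = [p for row in gray_rows for p in row]
    t = otsu_threshold(flat)
    return [[255 if p > t else 0 for p in row] for row in gray_rows]


# Tiny made-up image with a dark group and a bright group of pixels.
gray = [[10, 12, 200],
        [11, 210, 205]]
print(otsu_threshold([p for row in gray for p in row]))  # 12
print(binarize(gray))  # [[0, 0, 255], [0, 255, 255]]
```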

Data augmentation

The Keras ImageDataGenerator is used to augment the training data on the fly with random rotations and shifts.
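The idea of on-the-fly augmentation can be sketched without Keras. The example below is a stand-in, not the Keras generator: it covers only the shifting transformation (no rotation), works on plain nested lists, and uses hypothetical names (`shift_image`, `random_shifts`, `max_shift`). It shows how a generator can yield a fresh randomly shifted copy each time the training loop asks for one, so no augmented images need to be stored.

```python
import random


def shift_image(rows, dx, dy, fill=0):
    """Shift a 2-D image by dx columns and dy rows, padding with `fill`."""
    h, w = len(rows), len(rows[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = rows[y][x]
    return out


def random_shifts(rows, max_shift=1, seed=None):
    """Yield an endless stream of randomly shifted copies of one image."""
    rng = random.Random(seed)
    while True:
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        yield shift_image(rows, dx, dy)


# Made-up 3*3 image with a short vertical stroke in the middle column.
img = [[0, 255, 0],
       [0, 255, 0],
       [0, 0, 0]]
print(shift_image(img, 1, 0))  # [[0, 0, 255], [0, 0, 255], [0, 0, 0]]

augmented = random_shifts(img, max_shift=1, seed=0)
sample = next(augmented)  # a new shifted copy on every call
```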

Results

The model achieved an accuracy of 82%.
