An Optical Character Recognition (OCR) system for the Devanagari script, built with Convolutional Neural Networks (CNNs).
An image may contain more than one character, so this is a multi-label, multi-class classification problem.
- The output layer of the network contains 128 nodes, one for each Devanagari character.
- The input layer is a convolutional layer that takes images of dimension 64x64.
- The hidden layers consist of convolutional and max-pooling layers with ReLU activation.
- A sigmoid activation function is used in the last layer, with a threshold of 0.5 per character. Softmax is not suited to multi-label classification because its outputs sum to 1, so it cannot mark several characters as present independently.
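
The per-character thresholding described above can be illustrated with a minimal sketch in plain Python. The logits below are made-up values for 4 classes rather than the network's 128 outputs, and `decode_multilabel` is a hypothetical helper name, not part of this project's code.

```python
import math

def sigmoid(x):
    """Logistic function: maps any real-valued logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits):
    """Softmax over a list of logits; the outputs always sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_multilabel(logits, threshold=0.5):
    """Return the index of every class whose sigmoid score exceeds the threshold."""
    return [i for i, v in enumerate(logits) if sigmoid(v) > threshold]

# Hypothetical logits: classes 0 and 2 are both strongly indicated.
logits = [3.0, -2.0, 2.5, -4.0]

# Sigmoid scores each class independently, so both classes pass the threshold.
present = decode_multilabel(logits)

# Softmax makes the classes compete for a total probability of 1,
# so at most one class can exceed 0.5.
probs = softmax(logits)
softmax_present = [i for i, p in enumerate(probs) if p > 0.5]
```

With these values the sigmoid decoder reports both indicated classes, while thresholding the softmax output at 0.5 keeps only the stronger one.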
- A Gaussian blur filter is used for noise removal.
- Binarization: images are converted to grayscale, then Otsu's method is used to convert the grayscale image to a binary one.
- The images are resized to 64x64 pixels.
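
To make the binarization step concrete, here is a minimal pure-Python sketch of Otsu's method, which picks the threshold that maximizes the between-class variance of the grayscale histogram. It is an illustration only: the project most likely relies on an image library for this step (an assumption, since the source does not name one), and `otsu_threshold`, `binarize`, and the tiny sample image are hypothetical.

```python
def otsu_threshold(gray):
    """Return the Otsu threshold for a 2-D list of 8-bit grayscale values."""
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0  # pixel count and intensity sum of the class <= t
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is at or below t; no further split possible
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(gray, t):
    """Map pixels above the threshold to 255 and all others to 0."""
    return [[255 if v > t else 0 for v in row] for row in gray]

# Hypothetical 2x3 grayscale image with dark and bright pixels.
gray = [[10, 20, 200], [220, 15, 210]]
t = otsu_threshold(gray)
binary = binarize(gray, t)
```

In a real pipeline the Gaussian blur would run before this step and the resize to 64x64 after it, as listed above.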
The Keras ImageDataGenerator is used to augment the training data with random rotations and shifts applied on the fly.
The model achieved an accuracy of 82%.
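
The idea of on-the-fly augmentation can be sketched without Keras as a Python generator that produces a freshly shifted copy each time one is requested, so no augmented images are stored on disk. This is not the Keras API and it omits rotation; `shift_image`, `augmented_stream`, and the `max_shift` parameter are hypothetical names chosen for illustration.

```python
import random

def shift_image(img, dx, dy, fill=0):
    """Shift a 2-D list image by (dx, dy) pixels, filling exposed pixels with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out

def augmented_stream(images, max_shift=1, seed=None):
    """Yield randomly shifted copies indefinitely, creating each one on demand."""
    rng = random.Random(seed)
    while True:
        img = rng.choice(images)
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        yield shift_image(img, dx, dy)

# Hypothetical 3x3 image with a single bright pixel in the center.
img = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
shifted = shift_image(img, 1, 0)  # move one pixel to the right

stream = augmented_stream([img], max_shift=1, seed=0)
sample = next(stream)  # one augmented image, generated only when requested
```

Because a shift does not change which characters appear in the image, the multi-label target for each augmented sample stays the same as for its source image.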