aleronarjun / Rektext

Android app that uses deep CNNs to recognize numerical digits.

Rektext: Recognise text on Android

An Android app that uses a deep CNN trained on the SVHN dataset. It takes input from the camera and (for now) recognises numerical digits 0 through 9.

Python 3 is used for the deep learning model.

Min Android SDK version: 15

Target Android SDK version: 26

Overview

Currently, Rektext recognises digits from 0 to 9; printed, real-world digits are the best use case. I wanted to build a model and then apply it in the real world without too much hassle, hence the idea of making an app.

The MNIST dataset seemed too mainstream and was also a poor fit for this use case, so SVHN is used instead.

SVHN is the next step from MNIST towards applying CNNs to real-world problems. Because its digits come from real-world photographs of house numbers, it seemed best suited for this app.

Rektext works best on real-world printed digits and performs better when the digits cover most of the image.

Demo images: [images omitted]

Printed digits classification from SVHN on Android with TensorFlow.

Dataset used (prerequisite)

The very popular SVHN dataset from Stanford is used to train the CNN model. The dataset is available from the SVHN project page.

The download labelled "Format 2" is used. It consists of 32x32 images of cropped digits, stored in 3 ".mat" files.

These files are then used to make float32 arrays and eventually train the model.
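
The conversion to float32 arrays can be sketched roughly as below. This is a minimal illustration, not the notebook's actual code: the function names `prepare_svhn_arrays` and `load_svhn_mat` and the example path are hypothetical. It relies on the published SVHN "Format 2" layout, where each ".mat" file holds `X` with shape (32, 32, 3, N) and `y` with shape (N, 1), and the digit 0 is stored as label 10.

```python
import numpy as np


def prepare_svhn_arrays(X, y):
    """Turn raw SVHN "Format 2" arrays into model-ready arrays.

    X: uint8 array of shape (32, 32, 3, N), as stored in the .mat files.
    y: integer array of shape (N, 1); SVHN stores the digit 0 as label 10.
    """
    # Move the sample axis to the front: (32, 32, 3, N) -> (N, 32, 32, 3).
    images = np.transpose(X, (3, 0, 1, 2)).astype(np.float32)
    # Flatten labels to shape (N,) and map label 10 back to digit 0.
    labels = y.reshape(-1).astype(np.int64)
    labels[labels == 10] = 0
    return images, labels


def load_svhn_mat(path):
    """Load one .mat file, e.g. a hypothetical "data/train_32x32.mat"."""
    from scipy.io import loadmat  # imported lazily; only needed for real data

    mat = loadmat(path)
    return prepare_svhn_arrays(mat["X"], mat["y"])


if __name__ == "__main__":
    # Synthetic stand-in for the real dataset so the sketch runs on its own.
    rng = np.random.default_rng(0)
    fake_X = rng.integers(0, 256, size=(32, 32, 3, 5), dtype=np.uint8)
    fake_y = np.array([[1], [10], [3], [10], [9]], dtype=np.uint8)
    images, labels = prepare_svhn_arrays(fake_X, fake_y)
    print(images.shape, images.dtype, labels.tolist())
```

With the real files, `load_svhn_mat` would be called once per ".mat" file; the directory it reads from is whatever the Python code expects.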

Important: Make sure the datasets are downloaded into the specified directory, otherwise the Python code will not work.

Model overview

The images are first converted to greyscale to reduce training time and GPU load. The structure for the deep learning model is:

INPUT -> [CONV -> RELU -> CONV -> RELU -> POOL] -> DROPOUT -> [FC -> RELU] -> FC
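
To make the two ideas above more concrete, here is a small sketch of the greyscale conversion and of how tensor shapes might flow through the layer stack. Everything specific in it is an assumption for illustration: the luma weights (the common ITU-R BT.601 values), the filter counts (32 and 64), "same" padding with stride 1, the 2x2 pooling, and the 256-unit fully connected layer are not taken from the notebook, and `to_greyscale` and `trace_shapes` are hypothetical helper names.

```python
import numpy as np

# Common ITU-R BT.601 luma weights (assumed; the notebook may use others).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def to_greyscale(images):
    """(N, 32, 32, 3) RGB -> (N, 32, 32, 1) greyscale, keeping a channel axis."""
    grey = np.dot(images.astype(np.float32), LUMA_WEIGHTS)
    return grey[..., np.newaxis]


def trace_shapes(input_hw=32, conv_filters=(32, 64), pool=2,
                 fc_units=256, classes=10):
    """Follow per-image tensor shapes through
    INPUT -> [CONV -> RELU -> CONV -> RELU -> POOL] -> DROPOUT -> [FC -> RELU] -> FC.
    """
    shapes = [("INPUT", (input_hw, input_hw, 1))]
    hw = input_hw
    for i, filters in enumerate(conv_filters, 1):
        # Assumed "same" padding and stride 1, so height and width are kept.
        shapes.append((f"CONV{i}+RELU", (hw, hw, filters)))
    hw //= pool  # pooling halves height and width when pool == 2
    shapes.append(("POOL", (hw, hw, conv_filters[-1])))
    shapes.append(("DROPOUT", (hw, hw, conv_filters[-1])))  # shape unchanged
    shapes.append(("FC1+RELU", (fc_units,)))
    shapes.append(("FC2 (logits)", (classes,)))
    return shapes


if __name__ == "__main__":
    batch = np.zeros((4, 32, 32, 3), dtype=np.uint8)
    print(to_greyscale(batch).shape)
    for name, shape in trace_shapes():
        print(f"{name:14s} {shape}")
```

Dropping from three channels to one cuts the input size by a factor of three, which is where the saving in training time and GPU load comes from. The final layer has 10 outputs, one per digit class.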

It took about 3 minutes to run 50,000 iterations on a GTX 1070. This model can be easily tuned to recognise strings of digits as well. The output from the Python notebook is a Protocol Buffer (pb) file, which is then imported into the assets directory of the Android app.

Dependencies

All dependencies, including the TensorFlow dependency for Android, are declared in the build.gradle file.

Usage

A ready-to-run project is in the Android directory. Open it with Android Studio.

Interacting with TensorFlow

To interact with TensorFlow you need an instance of TensorFlowInferenceInterface, the inference class provided by the TensorFlow Android library.

Credits

Several sources helped me develop this app.

  • First and foremost, this repo by Thomas is the core of the deep learning model.

  • Secondly, this video by Siraj on YouTube is how I learnt to use TF on Android, export pb files and use them in Android. Big thanks to him; check out his videos if you're into ML/DL.

Any improvements/suggestions/queries are more than welcome.

Languages

Jupyter Notebook 98.6%, Java 1.4%