chaitanya-basava / Letters-and-Digits-classification

digits and letters classification (EMNIST dataset) using Keras


Letters-and-Digits-classification

This repository contains three .ipynb files, one for each of the following tasks:

  1. OFL_Assign_1.ipynb contains the code for a basic binary classifier that predicts whether the input image contains a letter or a digit.
  2. OFL_Assign_2.ipynb contains the code for a four-class classifier whose classes are even, odd, vowel, and consonant. The classifier has to predict, end-to-end, which of these the input image contains (see the label-mapping sketch below).
  3. OFL_Assign_3.ipynb contains a classifier that identifies which letter or digit is present in the input image.

The dataset used to train these classifiers is the EMNIST Balanced dataset, which contains 47 classes (10 digits, 26 upper-case letters, and 11 lower-case letters). The train set contains a total of 112,800 images (2,400 per class) and the test set contains 18,800 images (400 per class). The dataset can be downloaded from Here. Alternatively, you can directly download the processed .npy files from Here and place them in the npy_files directory.
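Since each EMNIST label already identifies one of the 47 characters, the coarser labels for the first two tasks can be derived from it. Here is a minimal sketch, assuming the standard EMNIST Balanced label order (0-9 digits, 10-35 upper-case A-Z, 36-46 the lower-case letters a, b, d, e, f, g, h, n, q, r, t); the class indices (0 = even, 1 = odd, 2 = vowel, 3 = consonant) and helper names are illustrative, not taken from the notebooks:

```python
import numpy as np

# Characters for the 47 EMNIST Balanced classes (assumed standard order).
BALANCED_CHARS = (
    [str(d) for d in range(10)]
    + [chr(c) for c in range(ord("A"), ord("Z") + 1)]
    + list("abdefghnqrt")
)

def to_binary_label(y):
    """Task 1: 0 = digit, 1 = letter (index choice is illustrative)."""
    return (np.asarray(y) >= 10).astype("int64")

def to_four_class_label(y):
    """Task 2: 0 = even, 1 = odd, 2 = vowel, 3 = consonant (illustrative)."""
    y = np.asarray(y)
    out = np.empty_like(y)
    for i, cls in enumerate(y):
        ch = BALANCED_CHARS[cls]
        if ch.isdigit():
            out[i] = int(ch) % 2                    # even -> 0, odd -> 1
        else:
            out[i] = 2 if ch.lower() in "aeiou" else 3
    return out
```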

The pre_processing.py file can be used to create the processed .npy files that were used to train and test the models.
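If you already have the processed files, loading them is straightforward; the file names below are assumptions, not taken from the repo:

```python
import numpy as np

x_train = np.load("./npy_files/train_images.npy")  # assumed file name
y_train = np.load("./npy_files/train_labels.npy")  # assumed file name
x_test = np.load("./npy_files/test_images.npy")    # assumed file name
y_test = np.load("./npy_files/test_labels.npy")    # assumed file name

print(x_train.shape, y_train.shape)  # e.g. (112800, 28, 28) and (112800,)
```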


The accuracies obtained for each task are as follows:

| Task        | Train accuracy (%) | Test accuracy (%) |
|-------------|--------------------|-------------------|
| First task  | 94.61              | 93.53             |
| Second task | 93.63              | 92.79             |
| Third task  | 92.10              | 89.71             |

The classwise_results_ques3.txt file contains the class-wise results (i.e., precision, recall, and F1-score for each class) for the third task.
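Per-class precision/recall/F1 numbers like these can be generated with scikit-learn; whether the notebooks use this exact call is an assumption, and the file names are illustrative:

```python
import numpy as np
from tensorflow import keras
from sklearn.metrics import classification_report

model = keras.models.load_model("./models/model_name.h5")               # third task's model
x_test = np.load("./npy_files/test_images.npy").reshape(-1, 28, 28, 1)  # assumed names/shape
y_test = np.load("./npy_files/test_labels.npy")

y_pred = model.predict(x_test).argmax(axis=1)  # most probable of the 47 classes
print(classification_report(y_test, y_pred, digits=4))
```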


The pre-trained models can be loaded from the models directory, which contains an HDF5 (.h5) file for each of the 3 tasks. Simply use the following line, replacing "model_name" with the model you wish to load:

keras.models.load_model("./models/model_name.h5")


NOTE: There is no need to write any code for the model; this line will directly load the trained model with its pre-trained weights.
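For example (assuming a TensorFlow 2.x environment; with standalone Keras, use import keras instead):

```python
from tensorflow import keras

# Load one of the pre-trained models; "model_name" is a placeholder.
model = keras.models.load_model("./models/model_name.h5")
model.summary()  # prints the loaded architecture and parameter counts
```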

All 3 models used for the individual tasks are fairly similar, consisting of convolutional, batch-normalization, Leaky-ReLU, max-pooling, fully connected, and softmax layers. The accuracies shown in the table above were obtained by hyper-parameter tuning to best fit the data.
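An illustrative sketch of that layer pattern is shown below; the filter counts, kernel sizes, and depth are assumptions, not the repo's exact values:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(num_classes):
    # Conv -> BatchNorm -> LeakyReLU -> MaxPool blocks, then dense + softmax,
    # matching the layer types listed above (sizes are illustrative).
    model = keras.Sequential([
        layers.Conv2D(32, 3, padding="same", input_shape=(28, 28, 1)),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same"),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128),
        layers.LeakyReLU(),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model(num_classes=47)  # 2, 4, or 47 depending on the task
```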


The last cell in each .ipynb file contains code to inspect the results by manually selecting an image from the test set and checking the model's prediction. There is still a lot of room to improve each of these models, especially the first and second, since the data for those tasks is not equally distributed: there are more letter images (upper and lower case combined) than digit images.
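A minimal sketch of what such an inspection cell might look like (file names and preprocessing are assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras

model = keras.models.load_model("./models/model_name.h5")               # the task's model
x_test = np.load("./npy_files/test_images.npy").reshape(-1, 28, 28, 1)  # assumed names/shape
y_test = np.load("./npy_files/test_labels.npy")

idx = np.random.randint(len(x_test))        # pick a test image at random
probs = model.predict(x_test[idx:idx + 1])  # shape (1, num_classes)
print("predicted:", probs.argmax(), "actual:", y_test[idx])
plt.imshow(x_test[idx].squeeze(), cmap="gray")
plt.show()
```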
