koushal95 / Classification-on-OCR-dataset

Evaluation of different classification techniques on OCR data

Classification on OCR dataset - Handwritten Letter Recognition

My effort to explore implementations of classification techniques.

The Dataset:

The dataset contains images of handwritten letters, each normalized to 16 × 8 pixels, together with a label feature that denotes the letter in the image. The remaining features (next_id, word_id, position, fold) are not used in the classification implemented here. They are not relevant because this dataset is derived from another dataset that contains images of whole words rather than individual letters.
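As a rough illustration of that preprocessing, the sketch below drops the four unused columns and separates the 128 pixel values from the label. The label column name `letter`, the pixel column names `p_0` to `p_127`, and the helper functions are assumptions made for this example, and the toy frame stands in for the real file, whose exact layout is not described here.

```python
import numpy as np
import pandas as pd

# Column names are assumptions for illustration; adjust them to the actual file.
LABEL_COLUMN = "letter"
UNUSED_COLUMNS = ["next_id", "word_id", "position", "fold"]
IMAGE_SHAPE = (16, 8)  # rows x columns, 128 pixels per letter


def split_features_and_labels(frame):
    """Drop the word-level columns and separate pixel values from the label."""
    pixels = frame.drop(columns=UNUSED_COLUMNS + [LABEL_COLUMN])
    features = pixels.to_numpy(dtype=np.float64)
    labels = frame[LABEL_COLUMN].to_numpy()
    return features, labels


def make_toy_frame(n_rows=5, seed=0):
    """Build a small random frame with the assumed layout (hypothetical data)."""
    rng = np.random.default_rng(seed)
    n_pixels = IMAGE_SHAPE[0] * IMAGE_SHAPE[1]
    data = {f"p_{i}": rng.integers(0, 2, size=n_rows) for i in range(n_pixels)}
    frame = pd.DataFrame(data)
    frame[LABEL_COLUMN] = list("abcde")[:n_rows]
    for name in UNUSED_COLUMNS:
        frame[name] = 0
    return frame


toy = make_toy_frame()
X, y = split_features_and_labels(toy)

# Each row of X can be viewed again as a 16 x 8 image.
first_image = X[0].reshape(IMAGE_SHAPE)
```

Keeping the unused column names in one list makes it easy to change them if the real file uses different headers.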

The Classifiers:

Support Vector Machine

Logistic Regression

Naive Bayes

Decision Tree (Classification And Regression Trees)
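A comparison of these four classifiers could be sketched as follows. This assumes scikit-learn, default hyperparameters, a Gaussian variant of Naive Bayes, and a simple hold-out split, none of which is stated above; the data is synthetic (128 features to mirror the 16 × 8 images), so the printed accuracies say nothing about the real OCR data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the letter data: 128 features, as in 16 x 8 images.
X, y = make_classification(
    n_samples=600,
    n_features=128,
    n_informative=20,
    n_classes=4,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Estimator choices and settings are assumptions, not the repository's own.
classifiers = {
    "Support Vector Machine": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Decision Tree (CART)": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training split and record its test accuracy.
scores = {}
for name, model in classifiers.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Holding the split fixed across models keeps the comparison like for like; cross-validation would give a steadier estimate at the cost of more training runs.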

Languages

Jupyter Notebook 99.0%, Python 1.0%