xiaoerlaigeid/digit-detection-recognition

Digit Detection & Recognition

What is it?

Digit detection and recognition with AdaBoost and SVM.

How it works

Train a cascade classifier for detection. The cascade classifier in classifier/cascade.xml was trained with 7000 positive samples and 9000 negative samples in 10 stages.
Train an SVM on the MNIST database.
Detect the digits in the image.
Scale each detected region to the same size as the MNIST samples, then use the trained SVM to recognize (classify) the digit. For better results, deskew the images using their moments first, then use their HOG descriptors as the features for classification.
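The deskewing step mentioned above can be sketched as follows. This is a minimal illustration that assumes the usual moment-based approach: the skew is estimated as mu11 / mu02 from the central image moments and then removed with a horizontal shear. It is written in plain Python on a list-of-rows grayscale image so that it runs without OpenCV. The function name deskew and the nearest-neighbour sampling are assumptions made for this example and may differ from what recognize.py actually does.

```python
def deskew(img):
    """Straighten a slanted digit using its central image moments.

    img is a grayscale image given as a list of equally long rows of numbers.
    Returns a new image of the same size (hypothetical helper, not project code).
    """
    h, w = len(img), len(img[0])
    m00 = float(sum(sum(row) for row in img))
    if m00 == 0:
        return [list(row) for row in img]  # empty image: nothing to deskew

    # Intensity-weighted centroid.
    cx = sum(x * v for row in img for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(img) for v in row) / m00

    # Second-order central moments.
    mu11 = sum((x - cx) * (y - cy) * v
               for y, row in enumerate(img) for x, v in enumerate(row))
    mu02 = sum((y - cy) ** 2 * v for y, row in enumerate(img) for v in row)
    if abs(mu02) < 1e-2:
        return [list(row) for row in img]  # no vertical extent: skew undefined

    skew = mu11 / mu02
    out = []
    for y in range(h):
        # Horizontal shear about the middle row; each output pixel samples
        # the source pixel at x + skew * (y - h / 2).
        shift = skew * (y - h / 2.0)
        new_row = []
        for x in range(w):
            sx = int(round(x + shift))
            new_row.append(img[y][sx] if 0 <= sx < w else 0)
        out.append(new_row)
    return out
```

For example, an 8×8 image containing a diagonal stroke (one bright pixel per row, shifted one column per row) has a skew of 1 and comes out as a vertical stroke, while an already vertical stroke is returned unchanged.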

Dependencies

These scripts need Python 2.7+ and the following libraries to work:

pillow (~2.8.1)
numpy (~1.9.0)
python-opencv (~2.4.11)
scikit-learn (~0.15.2)

The simplest way to install all of them is to install Python(x,y).

If you can't install Python(x,y), you can install Python, numpy and python-opencv separately, then install pip, pillow and scikit-learn.

Install Python. Use the installer from Python's website.
Install numpy. Use the installer from scipy's website. (You don't need scipy to run this project, so you can install numpy alone.)
Install python-opencv. Download the release from its SourceForge site. (Choose the release for your operating system, then choose version 2.4.11.) The executable is just an archive. Extract the files, then copy cv2.pyd to the lib/site-packages folder under your Python installation path.
Install pip. Download the script for installing pip, open cmd (or a terminal if you are using Linux/Mac OS X), go to the directory containing the downloaded script, and run python get-pip.py.
Install pillow. Run pip install pillow.
Install scikit-learn. Run pip install scikit-learn.
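To confirm that the installation steps above worked, a small check along these lines may help. It is only a sketch: the helper name check_dependencies is made up for this example, and the mapping from package names to import names (pillow is imported as PIL, python-opencv as cv2, scikit-learn as sklearn) is the one assumption it relies on.

```python
import importlib

# Package name as listed in this README -> module name used in "import ...".
REQUIRED = {
    "pillow": "PIL",
    "numpy": "numpy",
    "python-opencv": "cv2",
    "scikit-learn": "sklearn",
}


def check_dependencies(required=REQUIRED):
    """Return {package: version string, or None if the import fails}."""
    status = {}
    for package, module_name in required.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            status[package] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            status[package] = str(getattr(module, "__version__", "unknown"))
    return status
```

Calling check_dependencies() with no arguments checks the four libraries listed above; any package reported as None still needs to be installed.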

If you are running the code under Linux/Mac OS X and the scripts throw AttributeError: __float__, make sure your pillow installation has JPEG support (consult Pillow's documentation), e.g. try:

sudo apt-get install libjpeg-dev
sudo pip uninstall pillow
sudo pip install pillow

If you have any problems installing the dependencies, contact the author.

How to generate the results

Enter the src directory and run:

python main.py

It uses the images (.jpg only) in the test directory to produce the results, which are written to the results directory. Results generated with OpenCV have -cv in their filenames, and results generated with sklearn have -sk in their filenames.
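The input and output naming described above can be illustrated with a short sketch. The helper names select_jpgs and result_path are hypothetical, and two details are assumptions rather than documented behaviour: that the suffix is placed directly before the file extension, and that the .jpg match is case-insensitive. main.py may do either differently.

```python
import os


def select_jpgs(names):
    """Keep only .jpg file names, sorted (case-insensitive match is assumed)."""
    return sorted(name for name in names if name.lower().endswith(".jpg"))


def result_path(image_name, backend, results_dir="results"):
    """Build the output path for one test image.

    backend is "cv" for the OpenCV classifier or "sk" for scikit-learn.
    Placing the suffix right before the extension is an assumption.
    """
    if backend not in ("cv", "sk"):
        raise ValueError("backend must be 'cv' or 'sk'")
    stem, ext = os.path.splitext(os.path.basename(image_name))
    return os.path.join(results_dir, "%s-%s%s" % (stem, backend, ext))
```

Under these assumptions, select_jpgs(os.listdir("test")) gives the images that would be processed, and an image named photo.jpg would produce photo-cv.jpg and photo-sk.jpg in the results directory.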

Directory structure

.
├─ README.md
├─ doc (documentation, reports)
│   └── ...
├─ classifier (OpenCV cascade classifier)
│   ├── cascade.xml (the classifier parameter file)
│   └── ...
├─ MNIST (The MNIST database)
│   ├── train-images.idx3-ubyte
│   └── train-labels.idx1-ubyte
├─ test (test images)
│   └── ...
├─ results (the results)
│   └── ...
└─ src (the python source code)
    ├── detect.py (detection code)
    ├── load_labels.py (script to load MNIST data)
    ├── recognize.py (recognition code)
    └── main.py (generate the results)
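The two files in the MNIST directory use the IDX binary format: a big-endian header (magic number 2049 and an item count for label files; magic number 2051, an item count, row count and column count for image files) followed by one unsigned byte per label or pixel. The sketch below shows one way to parse such data with only the standard library. The function names are made up for this example and are not taken from load_labels.py, whose actual implementation may differ.

```python
import struct


def parse_idx_labels(data):
    """Parse the bytes of an IDX1 label file into a list of ints."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError("not an IDX1 label file (magic %d)" % magic)
    labels = list(bytearray(data[8:8 + count]))
    if len(labels) != count:
        raise ValueError("file is truncated")
    return labels


def parse_idx_images(data):
    """Parse the bytes of an IDX3 image file.

    Returns (images, rows, cols), where each image is a flat list of
    rows * cols pixel values in the range 0..255.
    """
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError("not an IDX3 image file (magic %d)" % magic)
    size = rows * cols
    pixels = bytearray(data[16:16 + count * size])
    if len(pixels) != count * size:
        raise ValueError("file is truncated")
    images = [list(pixels[i * size:(i + 1) * size]) for i in range(count)]
    return images, rows, cols
```

To use it on the real files, read them in binary mode, for example open("MNIST/train-labels.idx1-ubyte", "rb").read(), and pass the resulting bytes to the matching function.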

About

GitHub repository
Author: Qiuyi Zhang
Time: Jul. 2015
