Digit detection and recognition with AdaBoost and SVM.
- Train a cascade classifier for detection. The cascade classifier in classifier/cascade.xml is trained with 7000 positive samples and 9000 negative samples in 10 stages.
- Train an SVM with the MNIST database.
- Detect the digits in the image.
- For each detected region, scale it to the same size as the samples in MNIST, then use the trained SVM to recognize (classify) the digit. For better results, we can first deskew the images using their moments, then use their HOG descriptors for testing.
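The deskewing step mentioned above can be sketched as follows. This is a minimal NumPy-only illustration, not the code in recognize.py: the function names `skew_from_moments` and `deskew` are hypothetical, and the row-wise shear with linear interpolation is a stand-in for what would more commonly be done with OpenCV's `cv2.moments` and `cv2.warpAffine`. It assumes a single-channel image with a bright digit on a dark background.

```python
import numpy as np


def skew_from_moments(img):
    """Estimate the shear of a grayscale image as mu11 / mu02.

    mu11 and mu02 are second-order central moments: mu11 couples the
    x and y offsets from the centroid, mu02 is the spread along y.
    """
    img = np.asarray(img, dtype=np.float64)
    total = img.sum()
    if total == 0:
        return 0.0  # empty image: nothing to measure
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    x_mean = (xs * img).sum() / total
    y_mean = (ys * img).sum() / total
    mu11 = ((xs - x_mean) * (ys - y_mean) * img).sum()
    mu02 = ((ys - y_mean) ** 2 * img).sum()
    if abs(mu02) < 1e-2:
        return 0.0  # no vertical extent, so the skew is undefined
    return mu11 / mu02


def deskew(img):
    """Shear each row horizontally so the estimated skew is cancelled."""
    img = np.asarray(img, dtype=np.float64)
    height, width = img.shape
    skew = skew_from_moments(img)
    cols = np.arange(width)
    out = np.empty_like(img)
    for y in range(height):
        # Output pixel (x, y) samples the input at x + skew * (y - height / 2),
        # so rows far from the vertical middle are shifted the most.
        src_x = cols + skew * (y - 0.5 * height)
        out[y] = np.interp(src_x, cols, img[y], left=0.0, right=0.0)
    return out
```

Calling `deskew` on a 28x28 MNIST-sized sample returns an array of the same shape. Pixels that would be sampled from outside the image are filled with zeros, which matches the dark background assumed here.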
These scripts need python 2.7+ and the following libraries to work:
- pillow (~2.8.1)
- numpy (~1.9.0)
- python-opencv (~2.4.11)
- scikit-learn (~0.15.2)
The simplest way to install all of them is to install python(x,y).
If you can't install python(x,y), you can install python, numpy and python-opencv separately, then install pip, pillow and scikit-learn.
- Install python. Just use the installer from python's website.
- Install numpy. Just use the installer from scipy's website. (You don't need scipy to run this project, so you can just install numpy alone).
- Install python-opencv. Download the release from its sourceforge site. (Choose the release based on your operating system, then choose version 2.4.11). The executable is just an archive. Extract the files, then copy cv2.pyd to the lib/site-packages folder on your python installation path.
- Install pip. Download the script for installing pip, open cmd (or terminal if you are using Linux/Mac OS X), go to the path where the downloaded script resides, and run
python get-pip.py
- Install pillow. Run
pip install pillow
- Install scikit-learn. Run
pip install scikit-learn
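After installing, a quick way to confirm that the libraries are importable is a small check like the one below. It is only a sketch: `installed_versions` and `REQUIRED_MODULES` are hypothetical names that are not part of this project, and older releases of some libraries may not expose a `__version__` attribute, in which case the sketch reports "unknown".

```python
import importlib

# Import names of the libraries listed above. The import name can differ
# from the package name (pillow -> PIL, python-opencv -> cv2,
# scikit-learn -> sklearn).
REQUIRED_MODULES = ["PIL", "numpy", "cv2", "sklearn"]


def installed_versions(module_names):
    """Map each module name to its version string, or None if not importable."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Not every release exposes __version__, so fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in sorted(installed_versions(REQUIRED_MODULES).items()):
        status = version if version is not None else "NOT INSTALLED"
        print("%-8s %s" % (name, status))
```

Any module reported as NOT INSTALLED points to the install step that needs to be repeated.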
If you are running the code under Linux/Mac OS X and the scripts throw AttributeError: __float__, make sure your pillow has jpeg support (consult Pillow's documentation), e.g. try:
sudo apt-get install libjpeg-dev
sudo pip uninstall pillow
sudo pip install pillow
If you have any problem installing the dependencies, contact the author.
Enter the src directory and run
python main.py
It will use the images (.jpg only) under the test directory to produce the results. The results will show up in the results directory. Results generated with OpenCV will have -cv in their filenames, and results generated with sklearn will have -sk in their filenames.
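The input and output naming described above can be sketched as follows. The helpers `list_test_images` and `result_paths` are hypothetical and are not taken from main.py. In particular, placing the -cv or -sk suffix directly before the file extension is an assumption; the text only says that the suffix appears in the filename.

```python
import os


def list_test_images(test_dir="test"):
    """Return the .jpg files in test_dir, sorted by name."""
    return sorted(
        os.path.join(test_dir, name)
        for name in os.listdir(test_dir)
        if name.endswith(".jpg")  # only .jpg images are used
    )


def result_paths(image_path, results_dir="results"):
    """Map one test image to its two result files.

    "cv" is the result generated with OpenCV, "sk" the one generated
    with sklearn. Assumed layout: <results_dir>/<name>-<tag><ext>.
    """
    stem, ext = os.path.splitext(os.path.basename(image_path))
    return {
        tag: os.path.join(results_dir, "%s-%s%s" % (stem, tag, ext))
        for tag in ("cv", "sk")
    }
```

Under these assumptions, an image named photo.jpg in test would map to photo-cv.jpg and photo-sk.jpg in results.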
.
├─ README.md
├─ doc (documentations, reports)
│ └── ...
├─ classifier (OpenCV cascade classifier)
│ ├── cascade.xml (the classifier parameter file)
│ └── ...
├─ MNIST (The MNIST database)
│ ├── train-images.idx3-ubyte
│ └── train-labels.idx1-ubyte
├─ test (test images)
│ └── ...
├─ results (the results)
│ └── ...
└─ src (the python source code)
  ├── detect.py (detection code)
  ├── load_labels.py (script to load MNIST data)
  ├── recognize.py (recognition code)
  └── main.py (generate the results)
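The two files in the MNIST directory use the IDX binary format: a big-endian header followed by raw unsigned bytes. The image file starts with the magic number 2051, the image count, the row count and the column count; the label file starts with the magic number 2049 and the label count. A minimal standard-library sketch of reading that layout is shown below. The function names `parse_idx_images` and `parse_idx_labels` are hypothetical and this is not the code in load_labels.py.

```python
import struct


def parse_idx_images(data):
    """Parse the bytes of an idx3-ubyte file.

    Returns (images, rows, cols), where each image is a bytearray of
    rows * cols pixel values in row-major order.
    """
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError("not an IDX image file (magic %d)" % magic)
    size = rows * cols
    pixels = bytearray(data[16:16 + count * size])
    if len(pixels) != count * size:
        raise ValueError("file is shorter than its header claims")
    images = [pixels[i * size:(i + 1) * size] for i in range(count)]
    return images, rows, cols


def parse_idx_labels(data):
    """Parse the bytes of an idx1-ubyte file into a list of integer labels."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError("not an IDX label file (magic %d)" % magic)
    labels = list(bytearray(data[8:8 + count]))
    if len(labels) != count:
        raise ValueError("file is shorter than its header claims")
    return labels
```

To use it, open train-images.idx3-ubyte or train-labels.idx1-ubyte in binary mode ("rb") and pass the result of read() to the matching function.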
- Github repository
- Author: Qiuyi Zhang
- Time: Jul. 2015