A seven-segment digits OCR
We tried 2 different approaches:
The "digit-per-digit" approach
- Image processing: identify the screen, crop the picture, grayscale, thresholding, localize digits and crop them.
- Learning phase: learn a "MNIST" model that predicts each digit individually.
- Inference phase: pass each cropped digit to the "MNIST" model, and append the results.
The "end-to-end" approach
- Image processing: identify the screen, crop the picture, grayscale and thresholding.
- Learning phase: learn a model that predicts all digits at once. We based our model on the "Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks" paper by Goodfellow & al. The idea is to build a ConvNet that simultaneously learns (i) the digits and (ii) where to look for them.
How to reproduce our work and run our models
#1 Preprocessing python frame_extractor.py
#2 Preprocessing python digits_cut.py
#3 Model python main.py