I just built a simple OCR tool that helps digitize your citizenship certificate (the back side of the Nepali citizenship certificate, since that side is in English). I used Tesseract OCR to build it.
- We need to convert the image to grayscale, since most OCR engines work on black-and-white images
- I added a threshold of 50: any pixel below 50 becomes 0 and every other pixel becomes 255. This helps capture the words clearly
- Use filters to separate the characters from the background
- Apply contour detection to locate the filtered characters
- Use image classification to identify the characters
- The OCR returns raw text, so we have to extract the useful information from it.
- I used regex and string slicing to extract the information I needed.
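
Here is a minimal sketch of two of the steps above: the threshold of 50 and the regex extraction. It is not the actual project code. It uses only the Python standard library so it runs anywhere, and the image is a plain list of pixel rows standing in for a real grayscale image. The names `binarize`, `extract_fields`, and `FIELD_PATTERNS`, the field labels such as "Full Name", and the sample text are all assumptions made up for illustration; the real labels and OCR output may differ.

```python
import re

THRESHOLD = 50  # cutoff mentioned in the post


def binarize(gray, threshold=THRESHOLD):
    """Set every pixel below the threshold to 0 and every other pixel to 255.

    `gray` is a list of rows of 0-255 grayscale values (a stand-in for an image).
    """
    return [[0 if px < threshold else 255 for px in row] for row in gray]


# Hypothetical field labels; adjust them to whatever the OCR really returns.
FIELD_PATTERNS = {
    "certificate_no": re.compile(
        r"Citizenship Certificate No\.?\s*[:\-]?\s*([\d\-/]+)", re.IGNORECASE
    ),
    "full_name": re.compile(r"Full Name\s*[:\-]?\s*([A-Za-z ]+)", re.IGNORECASE),
    "sex": re.compile(r"Sex\s*[:\-]?\s*(Male|Female|Other)", re.IGNORECASE),
}


def extract_fields(raw_text):
    """Pull each known field out of the raw OCR text, or None if not found."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(raw_text)
        fields[name] = match.group(1).strip() if match else None
    return fields


if __name__ == "__main__":
    # Pixels 0 and 49 are below the cutoff; 50 and 200 are not.
    print(binarize([[0, 49, 50, 200]]))  # [[0, 0, 255, 255]]

    # Made-up sample text, not real OCR output.
    sample = (
        "Citizenship Certificate No.: 12-34-56-7890\n"
        "Full Name: RAM BAHADUR THAPA\n"
        "Sex: Male\n"
    )
    print(extract_fields(sample))
```

Keeping one pattern per field in a dictionary means a field the OCR misreads simply comes back as `None` instead of breaking the whole extraction, which is handy because raw OCR text is usually noisy.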