Praneet9 / Docify

A service for extracting text from Indian ID cards such as the Aadhaar Card, PAN Card and Driving Licence. Take a picture of the card, send it to the API, and get back a JSON response with your details. It was built using Flask, Deep Learning and Image Processing. It also uses the open source Connectionist Text Proposal Network (CTPN) along with Tesseract for text extraction.

Docify

A Deep Learning based Flask API that extracts details from Indian ID cards such as the Aadhaar Card, PAN Card and Driving Licence.

Tech

Docify uses a number of open source projects to work properly, including Flask, Tesseract and the Connectionist Text Proposal Network (CTPN).

Installation

Install Linux Dependencies

$ sudo apt install cmake
$ sudo apt install tesseract-ocr
$ sudo apt install mongodb
$ sudo apt install libsm6 libxext6
$ sudo apt install supervisor
$ sudo systemctl start mongodb
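
Once the service has been started, it can be useful to confirm that MongoDB is actually accepting connections before launching the API. The following is a minimal sketch using only the Python standard library. It assumes MongoDB is listening on its default port 27017 on the local machine; the helper name `is_port_open` is hypothetical and not part of Docify.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 27017, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 27017 is MongoDB's default port; adjust if your configuration differs
    print("MongoDB reachable:", is_port_open())
```

This only checks that something is listening on the port. It does not verify that the listener is MongoDB or that Docify's database is set up.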

Download Tesseract Models [ENG+HIN+MAR]

https://github.com/tesseract-ocr/tessdata_best
https://github.com/BigPino67/Tesseract-MICR-OCR
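
After downloading the models, a quick check that the English, Hindi and Marathi files are in place can save debugging time later. The sketch below assumes the models are stored as `eng.traineddata`, `hin.traineddata` and `mar.traineddata` in a single tessdata directory of your choosing. The function names are hypothetical, and the way Docify itself invokes Tesseract may differ.

```python
from pathlib import Path

# Language codes for the ENG+HIN+MAR models
REQUIRED_LANGS = ("eng", "hin", "mar")


def missing_models(tessdata_dir, langs=REQUIRED_LANGS):
    """Return the language codes whose .traineddata file is absent from tessdata_dir."""
    tessdata = Path(tessdata_dir)
    return [lang for lang in langs if not (tessdata / f"{lang}.traineddata").is_file()]


def lang_argument(langs=REQUIRED_LANGS):
    """Join language codes with '+', the form Tesseract expects for multiple languages."""
    return "+".join(langs)


def extract_text(image_path, tessdata_dir):
    """Run Tesseract on one image with all three languages (illustrative only)."""
    # Imported lazily so the helpers above work without the OCR stack installed
    import pytesseract
    from PIL import Image

    config = f'--tessdata-dir "{tessdata_dir}"'
    return pytesseract.image_to_string(
        Image.open(image_path), lang=lang_argument(), config=config
    )
```

For example, `missing_models("/path/to/tessdata")` returns an empty list when all three models are present.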

Install Python Dependencies

$ pip3 install opencv-python easydict flask face_recognition gunicorn tensorflow keras pytesseract dlib imutils opencv-contrib-python pymongo PyYAML scikit-image scikit-learn
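
Several of these packages are imported under a different name than the one passed to pip (for example `opencv-python` is imported as `cv2`). The sketch below, which is not part of Docify, reports which packages from the command above cannot be found in the current environment. The helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec

# pip package name -> top-level module name used in `import`
# (opencv-contrib-python also provides the cv2 module)
IMPORT_NAMES = {
    "opencv-python": "cv2",
    "easydict": "easydict",
    "flask": "flask",
    "face_recognition": "face_recognition",
    "gunicorn": "gunicorn",
    "tensorflow": "tensorflow",
    "keras": "keras",
    "pytesseract": "pytesseract",
    "dlib": "dlib",
    "imutils": "imutils",
    "pymongo": "pymongo",
    "PyYAML": "yaml",
    "scikit-image": "skimage",
    "scikit-learn": "sklearn",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return pip names whose module cannot be located (nothing is imported)."""
    return [
        pip_name
        for pip_name, module in import_names.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```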

Start Python API

$ python3 server.py
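
With the server running, a client sends a picture of the card and receives JSON back. The sketch below shows one way to do that using only the standard library. The URL, port, endpoint path and form field name are all assumptions made for illustration; check `server.py` for the actual route and field the API expects.

```python
import json
import urllib.request
import uuid

API_URL = "http://localhost:5000/extract"  # hypothetical URL, port and route
FILE_FIELD = "image"                       # hypothetical form field name


def build_multipart(field, filename, data, content_type="image/jpeg"):
    """Encode a single file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def send_card(image_path, url=API_URL):
    """POST the card image and return the parsed JSON response."""
    with open(image_path, "rb") as fh:
        body, content_type = build_multipart(FILE_FIELD, "card.jpg", fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send_card("card.jpg")` would then return the extracted details as a Python dictionary, provided the assumed route and field name match the server.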

License: MIT License

Languages

Jupyter Notebook 55.1%, Python 43.5%, Cuda 0.9%, HTML 0.4%, C++ 0.0%, Shell 0.0%