riteshrajput / voter-id-text-extraction-ocr-pytesseract

Extracts text from a Voter ID and automatically fetches the matching details from the electoral website.

voter-id-text-extraction

Getting Started

  • Run "TextExtractVoterId.py" to extract the text from the Voter ID photo.

  • Run "TextProcessing.py" to parse the Voter ID information from the text file and produce a JSON file.

  • Running the two programs above produces "TextExtract.txt" and "Result.json".

  • Before running the next script, edit the paths to tesseract and chromedriver to match your system.

  • Run "ScrapeVoterDetails.py" to scrape the data from the website automatically.

  • If you receive the error "TesseractNotFoundError: tesseract is not installed or it's not in your path":

1) Download and install tesseract. The Windows version is available here: "https://github.com/UB-Mannheim/tesseract/wiki"
2) Copy the path of the tesseract installation and set it in the code exactly as below.
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
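If you would rather not hard-code the path, the lookup can be sketched as below. This is only an illustration, not code from this repository: the helper names `find_tesseract` and `configure_pytesseract` are hypothetical, and the default Windows path is simply the one shown above.

```python
import os
import shutil

# The install path shown above; adjust or extend it for your system.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(candidates=(DEFAULT_WINDOWS_PATH,)):
    """Return a usable tesseract path, or None if nothing is found."""
    # Prefer a tesseract binary that is already on PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the explicit install locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates=(DEFAULT_WINDOWS_PATH,)):
    """Point pytesseract at the located binary; raise if none is found."""
    import pytesseract  # imported lazily so find_tesseract works without it

    found = find_tesseract(candidates)
    if found is None:
        raise FileNotFoundError("tesseract not found on PATH or in candidates")
    pytesseract.pytesseract.tesseract_cmd = found
    return found
```

Calling `configure_pytesseract()` once, before any OCR call, should have the same effect as the manual assignment above.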

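The text-file-to-JSON step in Getting Started could look roughly like the sketch below. It is not the actual "TextProcessing.py". The field names, the `Name :` line layout, and the ID pattern (three letters followed by seven digits) are all assumptions made for illustration, so adapt them to what your OCR output really contains.

```python
import json
import re

# Assumption: the voter ID number is three capital letters plus seven digits.
EPIC_PATTERN = re.compile(r"\b[A-Z]{3}[0-9]{7}\b")
# Assumption: the OCR text has a line such as "Name : SOME NAME".
NAME_PATTERN = re.compile(
    r"^\s*Name\s*[:\-]\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE
)


def parse_voter_text(text):
    """Pull a few fields out of raw OCR text; missing fields become None."""
    epic = EPIC_PATTERN.search(text)
    name = NAME_PATTERN.search(text)
    return {
        "epic_number": epic.group(0) if epic else None,
        "name": name.group(1) if name else None,
    }


def text_file_to_json(txt_path="TextExtract.txt", json_path="Result.json"):
    """Read the OCR text file and write the parsed fields as JSON."""
    with open(txt_path, encoding="utf-8") as handle:
        fields = parse_voter_text(handle.read())
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(fields, handle, indent=2)
    return fields
```

Returning None for a missing field, rather than raising, keeps the JSON file usable when the OCR misreads part of the card.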
Installation

Use the package manager pip to install the required libraries.

pip install numpy
pip install Pillow
pip install selenium
pip install pytesseract
pip install beautifulsoup4
pip install opencv-python
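To confirm the installation worked, a small check like the one below may help. It is a sketch, not part of this repository, and the helper name `missing_packages` is hypothetical. Note that three of the libraries are imported under a different name than the one passed to pip.

```python
import importlib.util

# pip package name -> module name used in "import ..." statements.
REQUIRED = {
    "numpy": "numpy",
    "Pillow": "PIL",
    "selenium": "selenium",
    "pytesseract": "pytesseract",
    "beautifulsoup4": "bs4",
    "opencv-python": "cv2",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the Python modules can be found. It does not check that the tesseract binary or chromedriver is installed.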

Environment

  • Python 3.6

Captcha Solver

Contributing

Please open an issue if you have any trouble or to discuss what you would like to change.

Authors

contact-info

Feel free to contact me to discuss any issues, questions, or comments.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.
