HemingwayLee / ocr-box-editor-v2

ocr-box-editor-v2

  • This project is modified from tesseract-web-box-editor and adds support for the WordStr box file format
  • It is a web application that generates training data for tesseract in the following steps
    • Upload images
    • Edit the labels (text and bounding box coordinates) for the uploaded images
    • Save the images and their corresponding labels to the backend
  • Once training data has been collected, it can be used to retrain tesseract
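As a rough illustration of the WordStr box file format mentioned above, the following Python sketch writes and reads one text line. The layout used here (a `WordStr` line carrying the coordinates, page number, and `#`-prefixed text, followed by a tab-prefixed end-of-line marker) follows the tesseract training documentation as best understood; the `LineBox` class and both helper functions are hypothetical names and are not taken from this project's code.

```python
from dataclasses import dataclass


@dataclass
class LineBox:
    """One text line in a box file (hypothetical helper, not part of this project)."""
    left: int
    bottom: int
    right: int
    top: int
    text: str
    page: int = 0


def to_wordstr(box: LineBox) -> str:
    """Format a line as a WordStr entry plus the tab-prefixed end-of-line marker.

    Assumption: coordinates use tesseract's bottom-left origin, and the
    end-of-line marker is a thin box placed just right of the line.
    """
    main = f"WordStr {box.left} {box.bottom} {box.right} {box.top} {box.page} #{box.text}"
    eol = f"\t {box.right + 1} {box.bottom} {box.right + 5} {box.top} {box.page}"
    return main + "\n" + eol


def parse_wordstr(line: str) -> LineBox:
    """Parse a single 'WordStr ...' line back into a LineBox."""
    # Everything after the first '#' is the transcription, spaces included.
    head, _, text = line.partition("#")
    parts = head.split()
    if len(parts) != 6 or parts[0] != "WordStr":
        raise ValueError(f"not a WordStr line: {line!r}")
    left, bottom, right, top, page = map(int, parts[1:])
    return LineBox(left, bottom, right, top, text, page)


if __name__ == "__main__":
    entry = to_wordstr(LineBox(10, 20, 110, 50, "hello world"))
    print(entry)
    print(parse_wordstr(entry.splitlines()[0]))
```

Keeping the whole transcription after the `#` means a label may contain spaces, which is the main difference from the per-character box format.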

Prerequisites

  • Install tesseract
  • Install python3 and virtualenv

How to install

virtualenv venv
source venv/bin/activate
pip3 install -r requirements.txt 

How to run

python3 manage.py migrate
python3 manage.py runserver

db.sqlite3 will be created. The application can then be accessed at http://127.0.0.1:8000

Editor

  • Upload images and click the Process Image button to generate default labels
  • Edit the labels (text and bounding box coordinates) for the uploaded images
  • Save the images and their corresponding labels (with the .box file extension) to the backend
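When bounding boxes are edited in a browser and saved as .box files, the coordinate origin usually has to be converted: browsers measure from the top-left corner, while tesseract box files measure from the bottom-left. The sketch below shows that conversion in Python. It assumes the editor stores a box as `x`, `y`, `width`, `height` in image pixels; the function names are hypothetical and this project's actual code may handle the conversion differently.

```python
def canvas_to_box(x, y, width, height, image_height):
    """Convert a top-left-origin rectangle to (left, bottom, right, top).

    Assumption: x, y, width, height are in image pixels with the origin at
    the top-left corner, as in a browser canvas.
    """
    left = x
    right = x + width
    top = image_height - y                # distance of the upper edge from the image bottom
    bottom = image_height - (y + height)  # distance of the lower edge from the image bottom
    return left, bottom, right, top


def box_to_canvas(left, bottom, right, top, image_height):
    """Inverse conversion: box file coordinates back to x, y, width, height."""
    return left, image_height - top, right - left, top - bottom


if __name__ == "__main__":
    box = canvas_to_box(10, 20, 100, 30, image_height=200)
    print(box)
    print(box_to_canvas(*box, image_height=200))
```

Converting in both directions with the same image height returns the original rectangle, which is a quick check when labels appear vertically flipped.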

Data Viewer

  • All uploaded images stored in the backend can be viewed by clicking the Data tab

Languages

JavaScript 50.2%, Python 25.8%, HTML 22.1%, CSS 1.3%, Shell 0.6%