3cham / VietOCR3

Java GUI frontend for the Tesseract OCR engine

VietOCR

A Java GUI frontend for the Tesseract OCR engine. It supports optical character recognition for Vietnamese and for the other languages that Tesseract supports.

VietOCR is released and distributed under the Apache License, v2.0.

Features

  • Multi-platform
  • PDF, TIFF, JPEG, GIF, PNG, BMP image formats
  • Multi-page TIFF images
  • Screenshots
  • Selection box
  • File drag-and-drop
  • Paste image from clipboard
  • Text search and replace
  • Postprocessing for Vietnamese to boost accuracy rate
  • Vietnamese input methods
  • Localized user interface for many languages (Localization project)
  • Integrated scanning support
  • Watch folder monitor for support of batch processing
  • Custom text replacement in postprocessing
  • Spellcheck with Hunspell
  • Support for downloading and installing language data packs and appropriate spell dictionaries

Instructions

To launch the program from the command line:

java -jar VietOCR.jar

On Linux or Unix-based systems, you may need to set the proper locale first:

export LC_ALL=C
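
The two steps above can also be combined in a small wrapper. The following is a minimal sketch, not part of VietOCR: the class name `VietOcrLauncher` and the method `buildLauncher` are hypothetical, and it assumes `java` is on the PATH and that `VietOCR.jar` is in the working directory unless another path is passed as the first argument. It sets `LC_ALL=C` only for the child process, and only on non-Windows systems.

```java
import java.io.IOException;

public class VietOcrLauncher {

    // Hypothetical helper: builds (but does not start) the equivalent of
    // "java -jar <jarPath>", adding LC_ALL=C on non-Windows systems.
    public static ProcessBuilder buildLauncher(String jarPath) {
        ProcessBuilder pb = new ProcessBuilder("java", "-jar", jarPath);
        String os = System.getProperty("os.name").toLowerCase();
        if (!os.startsWith("windows")) {
            // Affects only the child process, not the current shell.
            pb.environment().put("LC_ALL", "C");
        }
        // Forward the child's stdin/stdout/stderr to this console.
        pb.inheritIO();
        return pb;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumption: the jar sits in the working directory by default.
        String jar = args.length > 0 ? args[0] : "VietOCR.jar";
        int exitCode = buildLauncher(jar).start().waitFor();
        System.exit(exitCode);
    }
}
```

Because the locale is set on the child process only, the calling shell's environment is left unchanged, unlike with `export`.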

Languages

  • Java 79.6%
  • HTML 20.1%
  • Groovy 0.2%
  • Shell 0.1%
  • Batchfile 0.0%