MohamedWael / BasicArabicOCR

A very basic Arabic OCR based on tesseract OCR engine written in Java.

Home Page: http://mohamedwael.github.io/BasicArabicOCR/


BasicArabicOCR

A very basic Arabic OCR based on tesseract OCR engine written in Java.

How to run

First, download the following files and extract them:

  • Arabic OCR
  • Tess4J-2.0-src_2.zip
  • tesseract-ocr-3.02.ara.tar.gz

Second, open the "Arabic OCR" project in the NetBeans IDE, right-click the Libraries directory, and choose "Add JAR/Folder". Browse to the lib directory of the Tess4J project and add the following items:

  • ghost4j-0.5.1.jar
  • jai_imageio.jar
  • jna.jar
  • win32-x86-64

Repeat the previous step to add the "tess4j.jar" file located in the Tess4J\dist directory.

Finally, open the class "ProcessImage.java", find the call to "instance.setDatapath" (Ctrl+F), and paste in the path of the tessdata directory located at tesseract-ocr\tessdata.
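
A wrong data path is an easy mistake in this last step, so it may help to check the directory before pasting it into the code. The sketch below is not part of the project and uses only the Java standard library. It assumes that the extracted Arabic language pack puts a file named "ara.traineddata" inside the tessdata directory; the class name TessdataCheck and the default path are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TessdataCheck {

    // Returns true if the directory contains "<lang>.traineddata".
    // Assumption: Tesseract language packs name their main file this way.
    static boolean hasLanguage(Path tessdataDir, String lang) {
        return Files.isRegularFile(tessdataDir.resolve(lang + ".traineddata"));
    }

    public static void main(String[] args) {
        // Hypothetical default; pass your own tessdata path as the first argument.
        Path tessdata = Paths.get(args.length > 0 ? args[0] : "tesseract-ocr/tessdata");

        if (hasLanguage(tessdata, "ara")) {
            // This is the value to paste into instance.setDatapath(...) in ProcessImage.java.
            System.out.println("Found Arabic data, use this path: " + tessdata.toAbsolutePath());
        } else {
            System.out.println("ara.traineddata not found in: " + tessdata.toAbsolutePath());
        }
    }
}
```

Run it with the extracted tessdata path as the only argument. If the file is reported missing, the archive was probably extracted to a different location than the one given.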



Languages

Language:Java 100.0%