renard314 / textfairy

Android OCR App

Adding Santali OCR

Prasanta-Hembram opened this issue

I recently used Tesseract with sat.traineddata, and I think it may work with Tesseract 3.0. Can you add it to textfairy?


Thanks, I will update the language list in my next or next next update!

@renard314, with some effort I have tried to create Santali tessdata. Please have a look.

Will it work with Tess4? Have you tried the Devanagari script data to see if it performs any better?

Actually, Santali is written in Ol Chiki; in fact, on Santali Wikipedia we use Ol Chiki instead of Devanagari. I also tried Devanagari, but it doesn't produce Ol Chiki output. The link I sent you earlier only supported 3.0. I have since tried this tessdata link in the third-party tool gImageReader and got excellent results; it also works in 4.0. I think you can use this link.
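For anyone who wants to repeat this kind of check from the command line rather than through gImageReader, here is a minimal sketch. It only builds the argument list for the `tesseract` CLI using its `-l` and `--tessdata-dir` options. The helper name `build_tesseract_command`, the file names, and the directory layout are assumptions for illustration, not something taken from this thread; the directory is assumed to contain `sat.traineddata` and Tesseract 4.x is assumed to be installed.

```python
from pathlib import Path
from typing import List


def build_tesseract_command(
    image: Path,
    output_base: Path,
    tessdata_dir: Path,
    lang: str = "sat",
) -> List[str]:
    """Build the argv list for a tesseract CLI run (hypothetical helper).

    tessdata_dir is assumed to hold <lang>.traineddata, for example
    sat.traineddata for Santali.
    """
    return [
        "tesseract",
        str(image),          # input image
        str(output_base),    # tesseract appends .txt to this base name
        "--tessdata-dir", str(tessdata_dir),
        "-l", lang,
    ]


# Example paths are placeholders, not paths from the thread.
command = build_tesseract_command(Path("page.png"), Path("out"), Path("tessdata"))
print(" ".join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would run the recognition, provided the `tesseract` binary is on the `PATH`.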

@renard314 Here is the result after using it. It is @rkvsraman's tessdata, which works perfectly for detecting Santali text in images.

My test image is below:

Hey, it may be a silly thing to do, but I tried it in your app by swapping the sat.tessdata data file in for eng.tessdata and renaming sat.tessdata. It works nicely; in fact, it works fantastically. I have made a video on my YouTube channel explaining how to do it :-) If you want, I can remove that video ;-).
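The file swap described above can be sketched roughly as follows. This is only an illustration: the function name `swap_language_data` is hypothetical, the `.traineddata` file names in the usage note are assumptions, and the location where the app keeps its language data is not stated in this thread, so the directory is passed in by the caller. The sketch keeps a backup of the file being overwritten so the change can be undone.

```python
import shutil
from pathlib import Path


def swap_language_data(tessdata_dir: Path, source_name: str, target_name: str) -> Path:
    """Copy source_name over target_name inside tessdata_dir (hypothetical helper).

    The original target is saved next to it with a .bak suffix the first
    time this runs. Returns the backup path.
    """
    source = tessdata_dir / source_name
    target = tessdata_dir / target_name
    if not source.is_file():
        raise FileNotFoundError(source)

    backup = target.with_name(target.name + ".bak")
    # Keep the first backup only, so repeated swaps never overwrite
    # the genuine original data file.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)

    # The source file now stands in for the target language.
    shutil.copy2(source, target)
    return backup
```

A call such as `swap_language_data(Path("tessdata"), "sat.traineddata", "eng.traineddata")` would make the Santali data load whenever English is selected; copying the `.bak` file back restores the original.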

Thanks @renard314 for adding Santali Language.