rmtheis / android-ocr

Experimental optical character recognition app

Can not download tessdata

longnguyencmg opened this issue

Please help me check this download link for tessdata. The URL is not found :(

http://tesseract-ocr.googlecode.com/files/

This project doesn't use that as a download location; it uses tesseract-ocr.googlecode.com/files/.

Oops, I misread your comment before. You're saying the trained data download links are broken in the app, right? Do you know of an alternative download location?

Hello, I'm not sure whether https://github.com/tesseract-ocr/tessdata is enough information.
I'm currently looking at your OcrInitAsyncTask.java: you need .traineddata and osd.traineddata, but I can only find .traineddata at the link above.

Hmm, OK. I think using Firebase or S3 would be better than hotlinking to Github.

Maybe we can include the English/OSD data in the application assets and link to Firebase for the rest.
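A rough sketch of that split, not the app's actual code. The class name `TrainedDataLocator`, the `tessdata/` asset folder, and the `https://example.com/tessdata/` host are all placeholders chosen for illustration; substitute whatever bucket you end up hosting.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: decides where a language's trained data comes from. */
final class TrainedDataLocator {
    // Assumption: these two files ship inside the APK under assets/tessdata/.
    private static final Set<String> BUNDLED =
            new HashSet<>(Arrays.asList("eng", "osd"));

    // Placeholder host, not a real download location.
    static final String REMOTE_BASE = "https://example.com/tessdata/";

    static boolean isBundled(String languageCode) {
        return BUNDLED.contains(languageCode);
    }

    static String fileName(String languageCode) {
        return languageCode + ".traineddata";
    }

    /** Returns an asset path for bundled languages, a download URL otherwise. */
    static String locate(String languageCode) {
        String name = fileName(languageCode);
        return isBundled(languageCode) ? "tessdata/" + name : REMOTE_BASE + name;
    }
}
```

With this shape, only languages outside the bundled set ever touch the network, so English and OSD keep working even if the hosted files move again.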

Actually, I fixed it by downloading the .traineddata files from GitHub directly, without unzipping, etc. But I'm not sure if it's a good solution. I didn't try S3 or Firebase. Waiting for your solution 🎯 . Cheers
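For anyone trying the same thing, here is a minimal sketch of a direct download with no unzip step. It is not the code from this project: `TrainedDataInstaller` and the `baseUrl` parameter are hypothetical, and it assumes `java.nio.file` is available (on Android that means API 26 or later). The file is written under a `.part` name first and renamed afterwards, so an interrupted download does not leave a truncated `.traineddata` behind.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: stores an uncompressed .traineddata file as-is. */
final class TrainedDataInstaller {

    /** Copies an already-open stream into tessdataDir/<lang>.traineddata. */
    static Path install(InputStream in, Path tessdataDir, String languageCode)
            throws IOException {
        Files.createDirectories(tessdataDir);
        Path target = tessdataDir.resolve(languageCode + ".traineddata");
        Path partial = tessdataDir.resolve(languageCode + ".traineddata.part");
        // Write to a temporary name, then rename once the copy has finished.
        Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
        Files.move(partial, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    /** Fetches baseUrl + <lang>.traineddata; baseUrl is whatever host you choose. */
    static Path download(String baseUrl, Path tessdataDir, String languageCode)
            throws IOException {
        URI uri = URI.create(baseUrl + languageCode + ".traineddata");
        try (InputStream in = uri.toURL().openStream()) {
            return install(in, tessdataDir, languageCode);
        }
    }
}
```

Keeping `install` separate from `download` means the same copy routine could also take a stream opened from the app's bundled assets.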

Is there any other way to do this? Please, it's urgent.

Please help me check this download link for tessdata. The URL is not found :(

http://tesseract-ocr.googlecode.com/files/

Or could you explain how to package the appropriate training data files in the app?

The base URL http://tesseract-ocr.googlecode.com now redirects to a page that says:

_tesseract-ocr has Moved!_

_This project has moved to a new location on the internet. Its new home is at:_

https://github.com/tesseract-ocr

Source files in the program could (should) be changed to reflect this, IMO.

I don't think hotlinking to Github is a good idea. I suggest packaging the data files in the app (like I've done with the English training data) or hosting the download yourself using Firebase.