githubharald / SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

Home Page: https://towardsdatascience.com/2326a3487cd5


Just asking: are non-English chars allowed?

muhammedcanpirincci-sudo opened this issue:

Can I add my language's characters to charList.txt and train with a dataset in my language?
Thanks.

You have to rewrite the data loader for your dataset and then train the model. The chars file is created automatically.
For some hints on how to do this, see: https://towardsdatascience.com/faq-build-a-handwritten-text-recognition-system-using-tensorflow-27648fb18519
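
To illustrate why no manual edit of the chars file should be needed: a character list can be derived from the ground-truth texts themselves, so non-English characters are picked up as long as they occur in the labels. The sketch below is only an illustration of that idea, not the repository's code; `build_char_list` and the Turkish sample labels are hypothetical.

```python
def build_char_list(transcriptions):
    """Collect every distinct character that occurs in the label texts."""
    chars = set()
    for text in transcriptions:
        chars.update(text)
    # Sorting keeps the character-to-index mapping stable between runs.
    return sorted(chars)


# Hypothetical labels, used only for illustration.
labels = ["çay", "şeker", "gül"]
char_list = build_char_list(labels)
print("".join(char_list))
```

If the list is written to disk, opening the file with `encoding="utf-8"` avoids platform-dependent defaults mangling characters such as `ç` or `ş`.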

So you basically have to rewrite this class so that it loads your dataset but still has the same interface as it has now:
https://github.com/githubharald/SimpleHTR/blob/master/src/dataloader_iam.py
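
A rough sketch of what such a replacement class might look like, assuming the dataset is available as (image path, text) pairs. Every name here (`CustomDataLoader`, `train_set`, `validation_set`, `has_next`, `get_next`, `char_list`, the `Sample` and `Batch` tuples) is an assumption for illustration; the real method names and return types have to be copied from dataloader_iam.py. Image decoding is left to a pluggable `load_image` callable so the example runs without any image library.

```python
import random
from collections import namedtuple

# Hypothetical containers; match them to whatever the training code expects.
Sample = namedtuple("Sample", "gt_text, file_path")
Batch = namedtuple("Batch", "imgs, gt_texts, batch_size")


class CustomDataLoader:
    """Illustrative loader for a custom dataset of (path, text) pairs."""

    def __init__(self, samples, batch_size, data_split=0.95,
                 load_image=lambda path: path):
        self.samples = [Sample(text, path) for path, text in samples]
        self.batch_size = batch_size
        self.load_image = load_image  # assumption: real code decodes the image here

        # The first part is used for training, the rest for validation.
        split_idx = int(data_split * len(self.samples))
        self.train_samples = self.samples[:split_idx]
        self.validation_samples = self.samples[split_idx:]

        # The character list is derived from the labels, so any script works.
        self.char_list = sorted(set("".join(s.gt_text for s in self.samples)))
        self.train_set()

    def train_set(self):
        """Switch to the (shuffled) training samples."""
        self.curr_idx = 0
        self.curr_set = list(self.train_samples)
        random.shuffle(self.curr_set)

    def validation_set(self):
        """Switch to the validation samples, in their original order."""
        self.curr_idx = 0
        self.curr_set = list(self.validation_samples)

    def has_next(self):
        return self.curr_idx < len(self.curr_set)

    def get_next(self):
        """Return the next batch; the last one may be smaller than batch_size."""
        chunk = self.curr_set[self.curr_idx:self.curr_idx + self.batch_size]
        self.curr_idx += len(chunk)
        imgs = [self.load_image(s.file_path) for s in chunk]
        gt_texts = [s.gt_text for s in chunk]
        return Batch(imgs, gt_texts, len(chunk))
```

The point of keeping the method names identical to the original class is that the training script can then use the new loader without further changes.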

So this part: "1.2 Create IAM-compatible dataset and train model". I saw it, but I just wanted to be sure. Thank you so much for your time, sir.

Yes, either make your dataset look like the IAM dataset and use the original data loader,
or write a new data loader that loads your dataset and exposes the same interface as the original one.

Whatever you prefer.