Calamari-OCR / calamari

Line based ATR Engine based on OCRopy

options used for the command train of calamari OCR

Tailor2019 opened this issue

Hello!
@ChWick @andbue
There are many options for this command (https://calamari-ocr.readthedocs.io/en/latest/doc.command-line-usage.html#calamari-train). Can all of them be used both when training from scratch and when training from a pretrained model?
Which of these options are preferable in each of the two cases (from scratch, from a pretrained model) in order to get the best recognition results?
Thanks in advance!

The default parameters are already optimized for best results. You might be able to achieve a slightly lower CER by using larger networks, but only at the cost of longer training. For some insight into successful training procedures, have a look at this or that paper.

Apart from the parameters mentioned here, most parameters work both for training from scratch and for warm starting. Parameters that set the network architecture have no effect when starting from a pretrained model, obviously, because the architecture is already fixed by that model.

In everyday use I tend to set --n_augmentations=5 and train a set of 5 models using calamari-cross-fold-train.
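
As a rough illustration of that setup, here is a minimal Python sketch that only assembles the command line and prints it, without running anything. Only `--n_augmentations=5` and the `calamari-cross-fold-train` command name come from the reply above. The data flag (`--files`), the `--best_models_dir` flag, the helper name `build_cross_fold_command`, and the paths are assumptions for illustration; flag names differ between Calamari versions, so check them against `calamari-cross-fold-train --help` for your installation.

```python
import shlex


def build_cross_fold_command(files_glob, best_models_dir,
                             n_augmentations=5, data_flag="--files"):
    """Assemble an argument list for calamari-cross-fold-train.

    Hypothetical helper: `data_flag` and `--best_models_dir` are assumed
    flag names and may need adjusting for your Calamari version.
    """
    return [
        "calamari-cross-fold-train",
        data_flag, files_glob,                   # training line images (assumed flag)
        "--best_models_dir", best_models_dir,    # output directory (assumed flag)
        f"--n_augmentations={n_augmentations}",  # value suggested in the reply
    ]


# Placeholder paths; replace them with your own data and output directory.
cmd = build_cross_fold_command("train/*.png", "best_models")

# Print a shell-quoted command for inspection instead of executing it.
print(shlex.join(cmd))
```

Printing the command rather than executing it makes it easy to compare against the tool's help output before starting a long training run.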