ngnquan / pytorchOCR

OCR algorithm library based on PyTorch, including PSENet, PAN, DBNet, SAST, and CRNN

OCR library based on pytorch

If you are new to OCR, this link is a good starting point: CSDN Blog


Recently updated:

  • 2020.12.22 Added CRNN + CTCLoss + CenterLoss training
  • 2020.09.18 Updated the text detection documentation
  • 2020.09.12 Updated training and test code and pre-trained models for DB, PSE, PAN, SAST, and CRNN

Currently completed:


Planned next:

  • Convert models to ONNX and test them
  • Model compression (pruning)
  • Model compression (quantization)
  • Model distillation
  • Deploy with TensorRT
  • Train a general-purpose OCR model
  • Deploy with chinese_lite
  • Mobile deployment

Detection model results (experimental)

The models were trained only on the public ICDAR2015 text detection dataset. The results are as follows:

| Model | Backbone Network | Precision | Recall | Hmean | Download Link |
|-------|------------------|-----------|--------|-------|---------------|
| DB | ResNet50_7*7 | 85.88% | 79.10% | 82.35% | Download link(code:fxw6) |
| DB | ResNet50_3*3 | 86.51% | 80.59% | 83.44% | Download link(code:fxw6) |
| DB | MobileNetV3 | 82.89% | 75.83% | 79.20% | Download link(code:fxw6) |
| SAST | ResNet50_7*7 | 85.72% | 78.38% | 81.89% | Download link(code:fxw6) |
| SAST | ResNet50_3*3 | 86.67% | 76.74% | 81.40% | Download link(code:fxw6) |
| PSE | ResNet50_7*7 | 84.10% | 80.01% | 82.01% | Download link(code:fxw6) |
| PSE | ResNet50_3*3 | 82.56% | 78.91% | 80.69% | Download link(code:fxw6) |
| PAN | ResNet18_7*7 | 81.80% | 77.08% | 79.37% | Download link(code:fxw6) |
| PAN | ResNet18_3*3 | 83.78% | 75.15% | 79.23% | Download link(code:fxw6) |

Model compression: pruning results

MobileNetV3 (mobilev3) is used as the backbone, and the models are tested on ICDAR2015. The uncompressed model is 2.4M in size.

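  1. Compress the backbone

| Model | Pruning method | Ratio | Model size (M) | Precision | Recall | Hmean |
|-------|----------------|-------|----------------|-----------|--------|-------|
| DB | no | 0 | 2.4 | 84.04% | 75.34% | 79.46% |
| DB | backbone | 0.5 | 1.9 | 83.74% | 73.18% | 78.10% |
| DB | backbone | 0.6 | 1.58 | 84.46% | 69.90% | 76.50% |

  2. Compress the entire model

| Model | Pruning method | Ratio | Model size (M) | Precision | Recall | Hmean |
|-------|----------------|-------|----------------|-----------|--------|-------|
| DB | no | 0 | 2.4 | 85.70% | 74.77% | 79.86% |
| DB | total | 0.6 | 1.42 | 82.97% | 75.10% | 78.84% |
| DB | total | 0.65 | 1.15 | 85.14% | 72.84% | 78.51% |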
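
To read the trade-off in these tables, it can help to put the size saving next to the Hmean cost. The sketch below only does arithmetic on the whole-model rows reported above; the helper name `size_reduction_pct` is hypothetical and not part of this repository.

```python
def size_reduction_pct(original_mb: float, pruned_mb: float) -> float:
    """Percentage by which the pruned model is smaller than the original."""
    return 100.0 * (original_mb - pruned_mb) / original_mb


# (pruning ratio, model size in M, Hmean in %) from the whole-model table.
baseline_size, baseline_hmean = 2.4, 79.86
rows = [(0.6, 1.42, 78.84), (0.65, 1.15, 78.51)]

for ratio, size, hmean_pct in rows:
    saved = size_reduction_pct(baseline_size, size)
    drop = baseline_hmean - hmean_pct
    print(f"ratio {ratio}: {saved:.1f}% smaller, Hmean down {drop:.2f} points")
```

Note that the two tables use slightly different unpruned baselines (79.46% and 79.86% Hmean), so compare each pruned row only with the baseline in its own table.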

Model distillation

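| Model | Teacher | Student | Model size (M) | Precision | Recall | Hmean | Improve (%) |
|-------|---------|---------|----------------|-----------|--------|-------|-------------|
| DB | no | mobilev3 | 2.4 | 85.70% | 74.77% | 79.86% | - |
| DB | resnet50 | mobilev3 | 2.4 | 86.37% | 77.22% | 81.54% | 1.68 |
| DB | no | mobilev3 | 1.42 | 82.97% | 75.10% | 78.84% | - |
| DB | resnet50 | mobilev3 | 1.42 | 85.88% | 76.16% | 80.73% | 1.89 |
| DB | no | mobilev3 | 1.15 | 85.14% | 72.84% | 78.51% | - |
| DB | resnet50 | mobilev3 | 1.15 | 85.60% | 74.72% | 79.79% | 1.28 |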
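
The Improve column matches the difference in Hmean, in percentage points, between the distilled student and the same-size student trained without a teacher. That interpretation is inferred from the numbers rather than stated in the source. A small check, with the hypothetical helper `hmean_gain`:

```python
def hmean_gain(baseline_hmean: float, distilled_hmean: float) -> float:
    """Hmean gain of the distilled student over its baseline, in percentage points."""
    return round(distilled_hmean - baseline_hmean, 2)


# (model size in M, Hmean without teacher, Hmean with resnet50 teacher)
pairs = [(2.4, 79.86, 81.54), (1.42, 78.84, 80.73), (1.15, 78.51, 79.79)]

for size, baseline, distilled in pairs:
    print(f"{size}M student: +{hmean_gain(baseline, distilled):.2f} points")
```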

Documentation tutorial


Text detection result


For questions and discussion, add me on WeChat

WeChat number: -fxwispig-

