This is an implementation of CTPN (Connectionist Text Proposal Network) on Keras and TensorFlow. The project is based on matterport/Mask_RCNN and eragonruan/text-detection-ctpn. Thanks to their authors for the hard work.
- I trained on the concatenation of two datasets (ICDAR 2019 ArT and RCTW 2017). There are 8964 images for training and 2232 images for validation.
- The best evaluation result is:

| recall | precision | hmean |
|---|---|---|
| 0.6886 | 0.8677 | 0.7678 |
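
As a quick sanity check on the table above, the sketch below recomputes the hmean column from the other two. It assumes that hmean denotes the harmonic mean of precision and recall (the usual F1 definition in ICDAR-style evaluation); the function name `hmean` is illustrative and is not taken from `eval.py`.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of hmean)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation table.
print(round(hmean(0.8677, 0.6886), 4))  # 0.7678
```
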
- A pretrained model is available here. Baidu Netdisk extraction code: ezcj
- Run `python3 inference.py` to test your image.
- First, split the annotations into small bounding boxes that are 8 px wide and save them into a txt file for each image.
- Run `python3 split.py` to split the annotations in the ICDAR 2019 ArT dataset.
- Run `python3 split_rctw.py` to split the annotations in the RCTW 2017 dataset.
- Run `python3 build_dataset.py` to save the images and annotations into HDF5 files so that data loads quickly during training.
- Run `python3 train.py` to start training.
- Run `python3 eval.py` to evaluate.
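
The annotation-splitting step can be sketched roughly as follows. This is a minimal illustration, not the code in `split.py` or `split_rctw.py`: it assumes an axis-aligned text box given as integer pixel coordinates and cuts it into strips aligned to an 8 px grid, so every strip is exactly 8 px wide. The names `split_box` and `to_txt_lines` and the comma-separated output format are hypothetical. Polygon or rotated annotations would need per-strip top and bottom edges, which is not shown here.

```python
def split_box(x_min, y_min, x_max, y_max, width=8):
    """Cut an axis-aligned box into vertical strips on a `width`-px grid.

    Assumption: strips snap to multiples of `width`, so the first and last
    strip may extend slightly past the original box.
    """
    strips = []
    # Left edge of the grid column that contains x_min.
    x = (x_min // width) * width
    while x < x_max:
        strips.append((x, y_min, x + width, y_max))
        x += width
    return strips


def to_txt_lines(strips):
    """Format strips as 'x1,y1,x2,y2' lines (hypothetical file format)."""
    return ["{},{},{},{}".format(*s) for s in strips]


# A text box spanning x = 10..30 is covered by three 8 px strips.
for line in to_txt_lines(split_box(10, 5, 30, 20)):
    print(line)
```

Snapping to a fixed grid keeps every small box the same width, which matches the fixed-width proposals that CTPN predicts; the actual scripts may handle box edges differently.
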
- data augmentation
- deep backbone network
- fpn