magesh-technovator / PubLayNet

MaskRCNN on PubLayNet datasets. Paragraph detection, table detection, figure detection,...

PubLayNet

PubLayNet is a large dataset of document images whose layout is annotated with both bounding boxes and polygonal segmentations. For more information, see the original PubLayNet project.

Example images: PMC4334925_00006.jpg, PMC538274_00004.jpg
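
To illustrate how the two annotation types relate, here is a minimal sketch that derives a bounding box from a polygonal segmentation. It assumes a COCO-style layout (a flat `[x1, y1, x2, y2, ...]` polygon and an `[x, y, width, height]` box); the function name `polygon_to_bbox` and the sample coordinates are made up for illustration and are not taken from this repository.

```python
def polygon_to_bbox(polygon):
    """Convert a flat polygon [x1, y1, x2, y2, ...] to [x, y, width, height].

    Assumes COCO-style conventions; this helper is hypothetical.
    """
    xs = polygon[0::2]  # every even index is an x coordinate
    ys = polygon[1::2]  # every odd index is a y coordinate
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


# Illustrative rectangle-shaped polygon (not real dataset values).
sample_polygon = [10, 20, 110, 20, 110, 70, 10, 70]
sample_bbox = polygon_to_bbox(sample_polygon)  # [10, 20, 100, 50]
```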

Recent updates

29/Feb/2020 - Add benchmarking for maskrcnn_resnet50_fpn.

22/Feb/2020 - Pre-trained Mask-RCNN model (PyTorch) is released.

Benchmarking

| Architecture | Iter_num (x16) | AP | AP50 | AP75 | AP Small | AP Medium | AP Large | MD5SUM |
|---|---|---|---|---|---|---|---|---|
| MaskRCNN-Resnet50-FPN | 196k | 0.91 | 0.98 | 0.96 | 0.41 | 0.76 | 0.95 | 393e6700095a673065fcecf5e8f264f7 |
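
The column names match the COCO evaluation convention, where AP50 and AP75 count a prediction as correct when its intersection over union (IoU) with a ground-truth region is at least 0.50 or 0.75. The README does not state the exact evaluation protocol, so the following is only a sketch of the box IoU test under that assumption; `box_iou` and `is_match` are hypothetical helpers, not functions from this repository.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold):
    """True when the prediction overlaps the ground truth enough to count."""
    return box_iou(pred_box, gt_box) >= threshold


# Illustrative boxes: the prediction covers the ground truth plus extra area.
pred = (0, 0, 10, 10)
gt = (0, 0, 10, 6)
counts_for_ap50 = is_match(pred, gt, 0.50)  # IoU is 0.6, so this matches
counts_for_ap75 = is_match(pred, gt, 0.75)  # 0.6 < 0.75, so this does not
```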

Demo

Download the trained weights listed in the Benchmarking section above and place them in the maskrcnn directory.
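
Since the Benchmarking table publishes an MD5SUM for the weights, the download can be checked before running inference. The snippet below is a minimal sketch using only the Python standard library; the helper names are hypothetical, and the path you pass in depends on what the downloaded file is called, which this README does not specify.

```python
import hashlib

# Checksum copied from the MD5SUM column of the Benchmarking table.
EXPECTED_MD5 = "393e6700095a673065fcecf5e8f264f7"


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected checksum."""
    return md5sum(path) == expected.lower()
```

Calling `verify_weights` with the path of the downloaded file inside the maskrcnn directory returns `False` if the download is corrupted or incomplete.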

Run

python infer.py <path_to_image>

Average precision during validation (via TensorBoard)


Languages

Language:Python 100.0%