marianna13 / OCR-Doc-parser

OCR-Doc-parser

How it works

image

Usage

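import layoutparser as lp

# Layout detection model (PubLayNet Faster R-CNN config); keep detections scoring at least 0.8.
model = lp.models.Detectron2LayoutModel(
    'lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
)

# `files`, `output_dir`, and `pdf_dir` must be defined before this loop runs.
langs = ['en']
for f in files:
    doc_parser_tool(
        f,
        output_dir,
        pdf_dir,
        langs=langs,
    )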
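The loop above expects `files`, `output_dir`, and `pdf_dir` to exist already. The following is a minimal sketch of one way to prepare them, assuming the inputs are files sitting in a single directory. The helpers `collect_inputs` and `ensure_dirs`, the suffix list, and any directory names are hypothetical and not part of this repository.

```python
from pathlib import Path


def collect_inputs(input_dir, suffixes=(".pdf", ".png", ".jpg")):
    """Return sorted paths (as strings) of files in input_dir with a matching suffix.

    Hypothetical helper: the suffix list is an assumption about accepted inputs.
    """
    root = Path(input_dir)
    return sorted(
        str(p)
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )


def ensure_dirs(*dirs):
    """Create each directory if it does not exist yet and return them as strings."""
    for d in dirs:
        Path(d).mkdir(parents=True, exist_ok=True)
    return [str(d) for d in dirs]
```

With these in place, `files = collect_inputs("inputs")` and `output_dir, pdf_dir = ensure_dirs("output", "pdfs")` would supply the names used in the loop; the directory names here are placeholders.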

Dockerfile

The repository includes a Dockerfile for running the doc parser in a container.

Languages

Python 59.0%, Dockerfile 41.0%