There are 0 repositories under the page-xml topic.
OCR engine for all the languages
Document Layout Analysis resource repository for development with PdfPig.
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Conversions between various OCR formats
An OCR evaluation tool
A powerful CLI tool for visualization and encoding of PAGE-XML files
Dataset and models for catalogs' Layout analysis and HTR
Automatically re-order lines, words and glyphs to become textually consistent with their parents.
The repo gt_structure_1_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
The repo gt_structure_1_4 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
OCR-D guidelines for Ground Truth production
XSLT and shell scripts for analyzing and creating GitHub pages of a ground truth repository. These are centrally managed and can be used by all repositories created with gt-repo-template (https://github.com/OCR-D/gt-repo-template).
The repo gt_structure_1_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
The repo gt_structure_1_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
The GBN Dataset consists of German-Brazilian historical newspapers, along with their digital and binarized images and ground truth files.
OCR-D wrapper for page-xml-draw