Tool that performs layout analysis and/or text recognition using Tesseract and outputs the result in Page XML format
Simple app for visual editing of Page XML files
A template for creating a ground truth repo with various functions and features, such as metadata creation, data analysis, and presentation.
This repo provides a collection of ground truth data. The collection was compiled according to different criteria (layout complexity and font usage). Each item is also described by metadata based on the labeling scheme of OCR-D/PrimaLab.
Extract and convert PubLayNet data to PageXML format
Small collection of HTR/PageXML related scripts used at the ZPD Würzburg
Toolset for Tesseract training with PageXML Ground-Truth
This module provides access to Transkribus PageXML files via XQuery functions. It is designed to be used in the context of a BaseX XML database, but should work with other XML databases as well.