There are 0 repositories under the alto-xml topic.
OCR engine for all the languages
Resources repository for Document Layout Analysis development with PdfPig.
Conversions between various OCR formats
An OCR evaluation tool
Text Overlay plugin for Mirador 3
Python tools for performing various operations on ALTO XML files
Kitodo.Presentation is a feature-rich framework for building a METS- or IIIF-based digital library. It is part of the Kitodo Digital Library Suite.
Image Retrieval in Digital Libraries - A Multicollection Experimentation of Machine Learning techniques
Data Mining Historical Newspaper Metadata (METS/ALTO formats)
Command Line Interface (CLI) to export METS/ALTO documents to other formats.
Convert ALTO XML to plain text + minimal metadata
Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis
Helper functions and web app for METS/ALTO archive viewing.
Create a searchable PDF with ALTO-XML and JP2 files.
A bunch of scripts to manipulate ALTO and XML/TEI
Dataset and models for layout analysis and HTR of catalogs
ALTO XML coordinate-highlighting application for validating coordinate values
XSL stylesheets to convert between ALTO and other formats (hOCR, plain text, ...)
Scripts I wrote at my job which could be helpful to others
TIFF images converted into OCR XML using Tesseract
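
Several of the entries above convert ALTO XML to plain text. As a rough illustration of what that involves, here is a minimal sketch using only the Python standard library. It is not taken from any of the listed repositories. The function name `alto_to_text`, the helper `local_name`, and the `SAMPLE` document are hypothetical, and the sketch assumes the usual ALTO layout in which each `TextLine` holds `String` elements whose `CONTENT` attribute carries the recognized word. Hyphenation (`HYP`) and reading order are ignored.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(xml_string):
    """Return one line of text per ALTO TextLine (hypothetical helper)."""
    root = ET.fromstring(xml_string)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Collect the CONTENT attribute of each direct String child;
        # SP (space) and HYP elements are skipped in this sketch.
        words = [
            child.get("CONTENT", "")
            for child in elem
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return "\n".join(lines)


# Tiny made-up ALTO fragment used only for illustration.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="Hello"/><SP/><String CONTENT="world"/></TextLine>
    <TextLine><String CONTENT="Second"/><SP/><String CONTENT="line"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

if __name__ == "__main__":
    print(alto_to_text(SAMPLE))
```

Matching on the local tag name rather than a fixed namespace is a deliberate simplification: ALTO files in the wild declare different schema versions, and a real converter would likely also want to handle hyphenated words and page metadata.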