IDM 4 Data Science's repositories
sbb_textline_detection
Detect textlines in document images
sbb_binarization
Document Image Binarization
dinglehopper
An OCR evaluation tool
sbb_images
Image Annotation Tool and Image Search
sbb_ocr_postcorrection
Two-Step Approach to OCR Post-Correction
mods4pandas
Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis
sbb_pixelwise_segmentation
Obsolete repo, merged into eynollah
ocrd-galley
A Dockerized test environment for OCR-D processors 🚢
ocrd_repair_inconsistencies
Automatically re-order lines, words and glyphs to become textually consistent with their parents
ocrd_trocr
OCR-D processor for TrOCR
publications
Qurator-SPK team publications
ocrd_calamari
Recognize text using Calamari OCR and the OCR-D framework
PyTorch-YOLOv3
Minimal PyTorch implementation of YOLOv3
sbb_knowledge-base
Wikidata + Wikipedia knowledge-base extraction for entity linking (EL) purposes
sbb_web-integration
Visualization of NER + EL + Topic Modelling + Image Search
setuptools_ocrd
Manage your package version through ocrd-tool.json
abbyy-to-alto
Converts FineReader abbyy.xml to alto.xml
download-gitter.im-chat
Tiny tool to download gitter.im chats
sbb_ner_hf
SBB NER fine-tuning with Hugging Face
sbb_topic-modelling
Topic Modelling