Roderic Page's starred repositories
google-api-php-client
A PHP client library for accessing Google APIs
layout-model-training
Scripts for training Detectron2-based layout models on popular layout analysis datasets
ocr-fileformat
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
inat_comp_2018
CNN training code for the iNaturalist 2018 image classification competition
Layout2Graph
An official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"
hOCR-to-ALTO
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
RACplusplus
A high performance implementation of Reciprocal Agglomerative Clustering in C++
CNKICrawler
A crawler for CNKI. It collects data for NLP and other ML/DL experiments.
archive-hocr-tools
Efficient hOCR tooling
universal-citekey-js
JavaScript implementation of universal cite key
docxToJats
DOCX to JATS XML Converter
off-the-shelf-insect-identification
The main contribution of this repo is a thorough evaluation of an off-the-shelf approach to image classification, based on feature extraction with a single feed-forward pass through a pretrained VGG16.
TensorflowLite_Image_Classification_Training
Using a Google Colab notebook to train an image classification model with a custom dataset.
jekyll-rdf
📃 A Jekyll plugin to include RDF data in your static site or build a complete site for your RDF graph
react-miller-columns
Miller columns for React
gpt-prompts
A repo for GPT prompts
side-by-side-browser
Proxy a website to preview its redirections before it is transitioned to GOV.UK
iHDTpp-src
iHDT++ source code