UK Web Archive's repositories
webarchive-discovery
WARC and ARC indexing and discovery tools.
docker-pdf2htmlex
Run pdf2htmlEX in a Docker container.
ukwa-manage
Shepherding our web archives from crawl to access.
ukwa-heritrix
The UKWA Heritrix3 custom modules and Docker builder.
acid-crawl
An acid test suite for crawlers.
ukwa-services
Deployment configuration for all UKWA service stacks.
webrender-puppeteer
Web page rendering service based on Google's Puppeteer
docker-airflow
Apache Airflow with a few additional dependencies
docker-hadoop
Hadoop running in a container.
python-w3act
Python clients for W3ACT and Heritrix3
crawl-log-viewer
A simple web service for viewing crawl logs.
crawl-streams
Tools for working with UKWA crawler event streams
docker-clamd
ClamD in a container
docker-robot-framework
A Dockerised Robot Framework execution environment.
docker-superset
Dockerized Apache Superset, including a Solr module
npld-access-stack
Service deployment setup for the Reading Room NPLD Access service
npld-player
Secured browser for accessing NPLD content in Legal Deposit Library reading rooms.
ukwa-monitor
Dashboard and monitoring system for the UK Web Archive
ukwa-notebook-apps
UKWA web apps for working with internal APIs, built on Jupyter notebooks and Voila.
ukwa-reports
Generating Reports
ukwa-ui-collections-solr
Containerised version of the Solr service used to generate the UKWA UI collections browser