UK Web Archive (ukwa)

Location: United Kingdom

Home Page: http://www.webarchive.org.uk/

UK Web Archive's repositories

webarchive-test-suite

A set of test files for web archiving.

Language: Arc | Stargazers: 8 | Issues: 0 | Issues: 0

flashfreeze

A rapid web page analyser and archiver.

Language: Python | Stargazers: 6 | Issues: 0 | Issues: 0

halflife

Tracking the fortunes of our archived URLs.

Language: Jupyter Notebook | Stargazers: 5 | Issues: 0 | Issues: 0

warc

Python library for reading and writing WARC files

Language: Python | License: GPL-2.0 | Stargazers: 3 | Issues: 0 | Issues: 0

webarchive-wat-mining

WAT (web archive transform) metadata mining

Language: Shell | Stargazers: 3 | Issues: 15 | Issues: 0

javaswf

Mavenised version of the JavaSWF codebase, in order to resolve the dependencies for Heritrix3.

Language: Java | Stargazers: 2 | Issues: 0 | Issues: 0

python-warcwriterpool

Aims to offset some of the difficulties of writing to WARCs (multiple open files, size limits, etc.).

Language: Python | License: Apache-2.0 | Stargazers: 2 | Issues: 5 | Issues: 0

SentimentalJ

A sentiment analysis module for Node.js

Language: Java | Stargazers: 2 | Issues: 0 | Issues: 0

webarchive-fuse

Use FUSE-J to mount web archive files as filesystems.

Language: Java | Stargazers: 2 | Issues: 0 | Issues: 0

file-archive-recordreader

File Archive RecordReader

Language: Java | Stargazers: 1 | Issues: 0 | Issues: 0

language-detection

Experimenting with https://code.google.com/p/language-detection/

Language: PHP | Stargazers: 1 | Issues: 0 | Issues: 0

python-webhdfs

Python wrapper around Hadoop's WebHDFS interface.

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 5 | Issues: 0

ukwa.github.com

UK Web Archive GitHub Homepage

Language: CSS | Stargazers: 1 | Issues: 0 | Issues: 0

boards

Meta-project

Stargazers: 0 | Issues: 0 | Issues: 0

bootstrap

HTML, CSS, and JS toolkit from Twitter

Language: CSS | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

docker-warcprox-squid

A squid setup suitable for scaling out warcprox

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

jruby-whois

Ruby's Whois gem wrapped up as a Maven dependency for Java code, via JRuby.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

lucene-solr

Mirror of Apache Lucene & Solr

Language: Java | Stargazers: 0 | Issues: 0 | Issues: 0

monitrix-bdt

Monitrix: Block Detection Tool

Language: Scala | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

monitrix-imaqa

Monitrix: Image-based Quality Assurance module

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

openwayback-access-control

Web access control (exclusion oracle) tools for optional use with the Wayback Machine

Language: JavaScript | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

proto-col

Prototype Collections Browser

Language: HTML | Stargazers: 0 | Issues: 0 | Issues: 0

warctools

warctools

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

webrender-har-daemon

Daemon intended to monitor a queue to which Heritrix will submit URLs. On receipt, the daemon submits the URL to a web service (currently via django-phantomjs) and stores the response, a modified HAR record, in a WARC file.

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0