helgeho / ArchiveSpark

An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.

helgeho/ArchiveSpark Stargazers