noiano / ARCInputFormat

Packages the ARCInputFormat used in Common Crawl into a small JAR file that can be used in MapReduce jobs. Implements HdfsARCSource. See the README for details.
