Packages the ARCInputFormat used in Common Crawl into a small jar file for use in MapReduce jobs. Implements HdfsARCSource. See the README for details.