BinaryBlob / CCMTM

Scripts to extract useful data from the Common Crawl data set. Created at the Machine Translation Marathon 2013.

License: BSD 3-Clause "New" or "Revised" License


Languages

Java 97.7%, Shell 2.3%