Trying out shingling, resemblance, simhash, and sketching for data deduplication.
Home Page: http://matpalm.com/resemblance/
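
To give a rough idea of what shingling and resemblance mean here, the following is a minimal Python sketch and not code from this repo. It assumes word-level shingles of size `k` and takes resemblance to be the Jaccard coefficient of the two shingle sets. The function names `shingles` and `resemblance` and the default `k=3` are made up for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than k words: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance(a, b, k=3):
    """Jaccard resemblance of the two texts' shingle sets, in [0, 1]."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        # Assumption: two empty texts count as identical.
        return 1.0
    return len(sa & sb) / len(sa | sb)
```

Pairs of records whose resemblance is above some chosen threshold could then be treated as near-duplicate candidates. Comparing every pair this way gets expensive on large datasets, which is where sketching and simhash come in as cheaper approximations.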