1e0ng / simhash

A Python Implementation of Simhash Algorithm

How to support fast calculation on a big bucket?

sunboy123 opened this issue

I have more than 10,000 texts. Generating features for them takes 8 minutes, which seems far too long. How can I solve this problem?

commented

Hi, generating features is out of the scope of this library, because it only calculates simhash based on the existing features.
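Since feature generation and hashing are separate steps, it may help to time them separately and confirm where the 8 minutes actually go. The sketch below is only an illustration under stated assumptions: `char_ngrams`, `fingerprint`, and `profile` are hypothetical names, the n-gram tokenizer is a guess at what a feature extractor might look like, and `fingerprint` is a simplified stand-in for a simhash computation, not this library's code.

```python
import hashlib
import re
import time


def char_ngrams(text, width=4):
    """Hypothetical feature extractor: overlapping character n-grams."""
    text = re.sub(r"[^\w]+", "", text.lower())
    return [text[i:i + width] for i in range(max(len(text) - width + 1, 1))]


def fingerprint(features, bits=64):
    """Simplified simhash-style fingerprint built from precomputed features."""
    counts = [0] * bits
    for feature in features:
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        h = int.from_bytes(digest, "big") & ((1 << bits) - 1)
        for i in range(bits):
            # Each feature votes +1 or -1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    value = 0
    for i in range(bits):
        if counts[i] > 0:
            value |= 1 << i
    return value


def profile(texts):
    """Time feature generation and fingerprinting as two separate phases."""
    start = time.perf_counter()
    all_features = [char_ngrams(t) for t in texts]
    feature_seconds = time.perf_counter() - start

    start = time.perf_counter()
    values = [fingerprint(f) for f in all_features]
    hash_seconds = time.perf_counter() - start
    return values, feature_seconds, hash_seconds


if __name__ == "__main__":
    sample = ["How are you? I am fine. Thanks."] * 1000
    _, feature_seconds, hash_seconds = profile(sample)
    print(f"features: {feature_seconds:.3f}s, hashing: {hash_seconds:.3f}s")
```

If the first number dominates, the slowdown is in the tokenizer or feature extractor rather than in the simhash step, and that is the part to optimize (for example by simplifying the regular expression, caching, or spreading the texts across processes).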