kayzhu / LSHash

A fast Python implementation of locality sensitive hashing.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Is it possible to query results based on threshold.

tom-riddle opened this issue · comments

Many LSH implementations use Jaccard similarity to return matching result above a certain threshold say 80% match.
Is possible to implement the same in this library.

commented

Same thoughts.
And I think zipping arrays into lower dimensions and using smaller input_dim may help, as smaller dimensions increases the probability of collision, and therefore similar vectors are more likely fall into a same slot.

commented

Same thoughts.
And I think zipping arrays into lower dimensions and using smaller input_dim may help, as smaller dimensions increases the probability of collision, and therefore similar vectors are more likely fall into a same slot.

Sorry, I mean hash_size..

Many LSH implementations use Jaccard similarity to return matching result above a certain threshold say 80% match. Is possible to implement the same in this library.

How to use threshold with this LSH implementation can you help me I have problem with this issue