There are 3 repositories under the lsh topic.
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)
Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang
Locality Sensitive Hashing using MinHash in Python/Cython to detect near-duplicate text documents
A scalable nearest neighbor search library in Apache Spark
A Java library implementing a practical nearest neighbour search algorithm for multidimensional vectors that operates in sublinear time. It implements Locality-sensitive Hashing (LSH) and multi-index hashing for Hamming space.
[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
Weighted MinHash implementation on CUDA (multi-gpu).
Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Searching with Vectors' talk from Haystack 2019 (US). Builds upon my conceptual search and semantic search work from 2015
Near-duplicate image detection using Locality Sensitive Hashing
A Clojure library for querying large data-sets on similarity
LSH index for approximate set containment search
Implementation of vector quantization algorithms; code for "Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search".
Locality Sensitive Hashing, fuzzy-hash, min-hash, simhash, aHash, pHash, dHash. Image similarity and text similarity based on hash values.
SLIDE (Sub-LInear Deep learning Engine) written in Go
Query-Aware LSH for Approximate NNS (PVLDB 2015 and VLDBJ 2017)
Accurate and Fast ALSH for Maximum Inner Product Search (KDD 2018)
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Easy-to-use Java similarity algorithms for text and numeric-series
Serverless, lightweight, and fast vector database on top of DynamoDB
A framework for index-based similarity search.
Locality Sensitive Hashing for semantic similarity (Python 3.x)
A backup suite. Supports FLZMA2, bzip3, LZ4, Zstandard, LSH i-node ordering deduplicating archiver, long range deduplication, encryption and recovery records
Query-Aware LSH for Approximate NNS (In-Memory Version of QALSH)
Locality-sensitive hashing (LSH) in Julia.
LSH Scheme based on Longest Circular Co-Substring (SIGMOD 2020)