Repositories under the lsh-algorithm topic:
[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation
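Locality Sensitive Hashing, fuzzy-hash, min-hash, simhash, aHash, pHash, dHash. Hash-based image similarity and text similarity.
A related-product recommendation implementation based on product content, built on fasttext + faiss; nginx+uwsgi+flask / gunicorn+uvicorn+fastapi provide the API query interface, with an added Spark implementation using Ansj+Word2vec+LSH+Phoenix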
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
A Query Efficient Natural Language Attack in a Black Box Setting
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation
Search your object with hash
A semantic search indexing system designed to efficiently retrieve top matching results from a database of 20 million documents. Given the embedding of a search query, it quickly identifies and returns the most relevant documents
A highly concurrent LSH algorithm using a thread pool, implemented in C++
An implementation of LSH Forest based on the following paper (http://infolab.stanford.edu/~bawa/Pub/similarity.pdf).
A content-based image retrieval system built with deep learning, applying large-scale similarity search techniques such as k-d trees, LSH, and Faiss.
A Robust Library in C# for Similarity Estimation
Recommendation system for cryptocurrencies, using data collected from users' tweets and 10-fold cross-validation. Based on the cryptocoins mentioned in each user's tweets, the program runs algorithms on the data and recommends other cryptocoins to each user. (README in Greek, soon to be translated into English.)
Lab assignments for the course ID2222-Data Mining at KTH
Approximate UniFrac via Weighted MinHash 🦀
Implementation of algorithms for big data using Python, NumPy, and pandas.
Repository for all assignments of the course COL761: Data Mining (Fall 2020), taught at IIT Delhi
Scalable mining of multidimensional time series motifs.
A Python project implementing shingling, minwise hashing, and locality-sensitive hashing (LSH) for text similarity detection, along with feature engineering and clustering analysis on real-world datasets. Includes code, visualizations, and key insights for efficient data processing and analysis.
MDLE first assignment. The project implements the A-Priori algorithm to find the most frequent itemsets in lists of conditions for a large set of patients and extracts rules that associate conditions with one another. It also implements and applies LSH to identify similar news articles in a dataset.
Nilsimsa implementation as a swift package
LSH algorithm made with C++
Scaling Up Nearest Neighbor Search: How Dataset Size and Dimensionality Affect KNN Variants
This repo aims to implement a modular engine for Locality-Sensitive Hashing (LSH).
Vectors: nearest neighbor search and clustering using LSH and Hypercube algorithms with the L2 metric (plus Lloyd's algorithm, for clustering only).
Coursera's Natural Language Processing specialization
Implementation of LSH in order to find the similarity in a large dataset
Fast Sketching for Weighted Sets
The assignment comprises two main tasks: implementing LSH to identify similar businesses based on user ratings and developing various collaborative filtering recommendation systems to predict user ratings for businesses.
Implementation of the algorithms presented in the course Analiza velikih skupova podataka (AVSP, "Analysis of Large Datasets")
Software Development for Algorithmic Problems (UoA) Assignments
Euclidean Minimum Spanning Tree approximation with a parameterless LSH index
Explored Jaccard distance, Min-Hashing, and LSH for user similarity in a movie rating dataset. Tasks involve dataset preprocessing, exact Jaccard Similarity computation, Min-Hash signatures, and LSH implementation. Results and observations are documented in code, output files, and a report
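
Several entries above describe the same pipeline: shingling, exact Jaccard similarity, MinHash signatures, and LSH banding. The following is a minimal sketch of that pipeline in Python, not the code of any listed repository. All function names, the seeded `blake2b` hash family, and the parameters (`k=3`, 64 hash functions, 16 bands) are illustrative assumptions.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-character shingles of a string."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(seed, item):
    # Deterministic 64-bit hash of (seed, item). Assumption: a seeded hash
    # stands in for the random permutation used in textbook MinHash.
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=64):
    """One minimum per seeded hash function. Assumes `items` is non-empty."""
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of agreeing positions estimates the Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidates(signatures, bands=16):
    """Split each signature into bands; keys that share any whole band
    land in the same bucket and become a candidate pair."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for key, sig in signatures.items():
        for b in range(bands):
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(key)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


if __name__ == "__main__":
    docs = {
        "a": "the quick brown fox jumps over the lazy dog",
        "b": "the quick brown fox jumped over the lazy dog",
        "c": "an entirely unrelated sentence about hashing",
    }
    sets = {key: shingles(text) for key, text in docs.items()}
    sigs = {key: minhash_signature(s) for key, s in sets.items()}
    print("exact a-b:", jaccard(sets["a"], sets["b"]))
    print("estimated a-b:", estimate_jaccard(sigs["a"], sigs["b"]))
    print("candidate pairs:", lsh_candidates(sigs))
```

With more bands (fewer rows per band), less similar pairs become candidates, at the cost of more false positives. Candidate pairs would normally be re-checked with the exact `jaccard` function.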