There are 5 repositories under the simhash topic.
Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang
Interesting (non-cryptographic) hashes implemented in pure Python.
Dynatrace hash library for Java
A fast python implementation of the SimHash algorithm.
semantic-sh is a SimHash implementation that detects and groups similar texts using word vectors and transformer-based language models (BERT).
SuperMinHash: A New Minwise Hashing Algorithm for Jaccard Similarity Estimation, Simhash and SimhashIndex
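A visualization system for homework duplicate checking, plagiarism detection, and text similarity analysis, built on Spring Boot and Google's open-source simhash algorithm. It integrates the jplag, MOSS, and singleCloud tool suites for multi-angle duplicate checking. Ref: https://github.com/ALuShu/checksystem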
Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc.
A library for cosine similarity & simhash calculation
A rewrite of Bookmate's simhash gem, which is an implementation of Moses Charikar's simhashes in Ruby.
This system evaluates a collection of mementos (archived web pages) to determine which are off topic. The collection can be part of an Archive-It collection, a single TimeMap, or stored in a WARC file.
Code plagiarism system based on Simhash and Nicad.
⌨️ User Verification based on Keystroke Dynamics / Two-factor Authentication technology based on Key-Stroke
Rust jieba
Analysis of Massive Datasets FER labs