There are 16 repositories under the document-similarity topic.
Compute Sentence Embeddings Fast!
Web application for checking the similarity between a query and a document using cosine similarity.
Document similarity algorithms experiment - Jaccard, TF-IDF, Doc2vec, USE, and BERT.
Document Search Engine Tool
A Clojure library for querying large data-sets on similarity
Document Search Engine project with TF-IDF and Google's Universal Sentence Encoder model
Compilation of Natural Language Processing (NLP) codes. BONUS: Link to Information Retrieval (IR) codes compilation. (Check out the README.)
A simple Django-based resume ranker website where recruiters post their jobs and candidates apply for their desired vacancies. The system computes the document similarity between the job description and the candidate resumes, generates similarity scores using the KNN model, and ranks or shortlists the candidate resumes.
A tool that can find any of your documents using semantic search
Document Similarity with Apache Spark using Locality Sensitive Hashing and Python
Using Jaccard-Similarity and Minhashing to determine similarity between two text documents
Survey data and Python code for the ICADL 2021 paper "A Qualitative Evaluation of User Preference for Link-based vs. Text-based Recommendations of Wikipedia Articles"
Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)
Aims to provide a job-search strategy for new graduates who are interested in data-related positions.
The Bitnation Jurisdiction Public Notary DApp
Compare sentences from input document with all sentences from reference documents - find very similar ones.
Document searching from queries using Inverted index
Simple document similarity module implemented in NodeJS
DocxMatch is a Streamlit app that analyzes the similarity between Word files.
A two-ended hiring web application built using Flask. The application uses document similarity techniques for recommendation.
Code to train an LSI model using PubMed OA medical documents and to use pre-trained PubMed models on your own corpus for document similarity.
A system for automatic tagging of metadata of theses and dissertations from Bicol University
Topic Modeling in Cython
Individual group project in Python
This repository demonstrates how to explore the spiritual world using NLP techniques such as sentiment analysis, topic modeling, information retrieval, and text summarization.
A simple MinHash implementation based on the explanation in the Mining of Massive Datasets course by Stanford
Classifying products into categories using NLP techniques
A program that checks document similarity using the Natural Language Toolkit (NLTK) and cosine similarity.
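
Several of the descriptions above mention Jaccard similarity and cosine similarity without showing what they compute. The following is a minimal, standard-library-only sketch of both measures; it is not taken from any of the listed repositories. The function names (`tokenize`, `jaccard_similarity`, `cosine_similarity`) and the simple regex tokenizer are assumptions made for illustration, and the cosine variant uses raw term counts rather than TF-IDF weights.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens.

    Illustrative tokenizer only; real projects typically use NLTK, spaCy, etc.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_similarity(doc_a, doc_b):
    """Jaccard similarity: shared distinct tokens divided by all distinct tokens."""
    set_a, set_b = set(tokenize(doc_a)), set(tokenize(doc_b))
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty documents count as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between raw term-count vectors (no IDF weighting)."""
    counts_a, counts_b = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    a = "the cat sat"
    b = "the cat ran"
    print(jaccard_similarity(a, b))  # 2 shared tokens out of 4 distinct tokens
    print(cosine_similarity(a, b))
```

Jaccard similarity ignores how often a term occurs, while cosine similarity over count vectors does not; which one fits better depends on document length and on whether repeated terms should carry weight.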