There are 0 repositories under the jaccard-distance topic.
A lightweight product recommendation system (Item Based Collaborative Filtering) developed in Haskell.
String Comparison in C#.NET
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation
Massive Sparse Data Clustering Based on Frequent Items (SIGMOD 2023)
A graph mining problem in which the task is to predict a link between given nodes. Engineered features include Jaccard distance, cosine similarity, shortest path, PageRank, Adamic-Adar index, HITS score, and Katz centrality. Non-linear models built on these features reach a final F1 score of 0.92.
Clustering similar tweets using the K-means clustering algorithm and the Jaccard distance metric
Tweets clustering K-means
Big Data Platform Final Project
Given a directed social graph, predict missing links in order to recommend users.
Sentence Similarity Approaches
String distances in Rust
A string metric that measures the proximity between two words. The metric is computed with a formula that combines three existing string metrics: Jaccard distance, edit distance, and longest common substring distance.
This repository contains various assignments that I have done as a part of the Machine Learning course.
A bash script to filter paths based on the similarity of their subdirectory and file name components using Jaccard similarity.
Document comparison web application based on the Jaccard similarity index. The uploaded file is compared to all previously uploaded ones. Built with Java/JSP.
By clustering similar tweets together, we can generate a more concise and organized representation of the raw tweets, which will be very useful for many Twitter-based applications (e.g., truth discovery, trend analysis, search ranking, etc.)
Classifying images into discrete categories based on keywords generated from the Google Cloud Vision API
How Far Would You Go for Italian Mozzarella? Exploring the impact of product cost and purchase frequency on distance traveled by Italian consumers
Asynchronous Distributed Actor-based Approach to Jaccard Similarity for Genome Comparisons
Set of tasks solved in Big Data Algorithms course
Knowledge extraction through Data Analysis, including Locality Sensitive Hashing (LSH).
This repository contains various classification, clustering and data analysis code.
In this assignment, you will learn how to cluster tweets using the Jaccard distance metric and the K-means clustering algorithm.
Clustering Amazon review data from around 6M users using the K-means and DBSCAN algorithms.
Twitter tweets clustered with common ML algorithms
Link prediction - Who are my friends?
Compression-algorithm-based kernel perceptron using Jaccard similarity
People do shopping to fulfill their needs. Regardless of the shopping styles, every shopping activity has a specific purpose or mission.
Function for calculating the Jaccard index and Jaccard distance for binary attributes
Three different spelling recommenders, each of which takes a list of misspelled words and recommends a correctly spelled word for every word in the list. Each recommender uses a different Jaccard distance metric. For every misspelled word, the recommender finds the correctly spelled word that starts with the same letter and has the shortest distance to the misspelled word, and returns that word as the recommendation.
Clustering tweets using k-means algorithm.
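
Most of the projects listed above rely on the same two quantities: the Jaccard index |A ∩ B| / |A ∪ B| and the Jaccard distance, which is one minus the index. The following is a minimal Python sketch of both, not code taken from any of the listed repositories. The function names (`jaccard_index`, `jaccard_distance`, `char_ngrams`) are hypothetical, and treating two empty sets as identical is an assumed convention.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumed convention: two empty sets count as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


def jaccard_distance(a, b):
    """Return 1 - Jaccard index, so 0.0 means identical sets."""
    return 1.0 - jaccard_index(a, b)


def char_ngrams(word, n=2):
    """Return the set of character n-grams of a word (hypothetical helper)."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


# Word-level comparison, as one might do for short texts such as tweets.
tweet_a = "the cat sat".split()
tweet_b = "the cat ran".split()
word_distance = jaccard_distance(tweet_a, tweet_b)

# Character-bigram comparison, as one might do for spelling suggestions.
bigram_distance = jaccard_distance(char_ngrams("night"), char_ngrams("nacht"))
```

Here the two tweets share two of four distinct words, so `word_distance` is 0.5, and "night" and "nacht" share one of seven distinct bigrams, so `bigram_distance` is 6/7. The choice of tokens (words, character n-grams, graph neighbours) is what distinguishes most of the applications in the list; the set arithmetic stays the same.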