There are 2 repositories under the mining-massive-datasets topic.
Data mining algorithms with Python
Analysis of Reddit Comments for Mining Massive Datasets at the Technical University of Munich
CS246: Mining Massive Data Sets Solutions
Four methods for finding similar documents, to be chosen according to the case at hand. This repository can be used to find similar documents among billions of documents.
Introduction to Recommendation Systems week for Upschool
Spearheading the integration of extraterrestrial resources with Pi Network, ExoGenesis provides a platform for developing space-based mining algorithms, satellite-based infrastructure, and interplanetary communication protocols, opening new frontiers for decentralized networks.
Project that implements the AMS streaming algorithm to estimate the number of distinct items and the kth moment of a Twitter data stream.
Python implementation of the Apriori, PCY, Multistage and Multihash algorithms
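As a rough illustration of the AMS (Alon-Matias-Szegedy) moment estimate mentioned in the listing above, here is a minimal sketch. It is not taken from any of the listed repositories. The function names `ams_moment_estimate` and `exact_moment` and the parameters `num_vars` and `seed` are hypothetical. For simplicity it assumes the stream is a finite list held in memory, whereas a real streaming version would keep only one counter per sampled position.

```python
import random
from collections import Counter


def ams_moment_estimate(stream, k=2, num_vars=100, seed=0):
    """Estimate the k-th frequency moment (k >= 1) of a finite stream.

    Sketch of the AMS idea: pick a random position, count how many times
    the item at that position occurs from there to the end of the stream
    (call it c), and use n * (c**k - (c - 1)**k) as one estimate.
    The final answer averages num_vars such estimates.
    """
    if k < 1:
        raise ValueError("this sketch only handles k >= 1")
    n = len(stream)
    if n == 0:
        return 0.0
    rng = random.Random(seed)  # seeded so that runs are repeatable
    total = 0
    for _ in range(num_vars):
        pos = rng.randrange(n)
        item = stream[pos]
        # Occurrences of the sampled item from its position onward.
        c = sum(1 for x in stream[pos:] if x == item)
        total += n * (c**k - (c - 1) ** k)
    return total / num_vars


def exact_moment(stream, k=2):
    """Exact k-th moment: the sum of (item count)**k over distinct items."""
    return sum(count**k for count in Counter(stream).values())
```

Comparing `ams_moment_estimate(stream, k=2)` with `exact_moment(stream, k=2)` on a small list gives a feel for how the error shrinks as `num_vars` grows. The estimate is randomized, so its value depends on the seed.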