- Notebook
- ResultsMetrics
- S-curves (for pairs)
- S-curves_triplets
- Time_Memory (experiments run on the server)
- Source code (Python files and a shell script run on the server)
  - DLSH.py: main source code for LSH on pairs via MinHash similarity
  - TLSH.py: main source code for LSH on triplets via MinHash similarity
  - WJS_JS.py: code for Jaccard similarity and weighted Jaccard similarity
  - LSH_SIM_MH.py: code measuring computation time and memory usage of LSH and MinHash similarity for song pairs
  - runSparkMongo.sh: shell script that runs the Python files on the server
  - README.txt: detailed description of each Python file
several notebooks with the plots of chapter 3
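
For orientation, the two similarity measures named in the WJS_JS.py entry can be sketched as below. This is a minimal illustration of the standard definitions and not the code in WJS_JS.py. The function names `jaccard` and `weighted_jaccard`, the dict-of-weights input format, and returning 0.0 for two empty inputs are all assumptions made for this sketch.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of items."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets are given similarity 0.0.
        return 0.0
    return len(a & b) / len(union)


def weighted_jaccard(x, y):
    """Weighted Jaccard: sum of per-item minima over sum of per-item maxima.

    x and y are dicts mapping an item to a non-negative weight
    (hypothetical input format); a missing item counts as weight 0.
    """
    keys = set(x) | set(y)
    numerator = sum(min(x.get(k, 0), y.get(k, 0)) for k in keys)
    denominator = sum(max(x.get(k, 0), y.get(k, 0)) for k in keys)
    if not denominator:
        # Assumption: all-zero inputs are given similarity 0.0.
        return 0.0
    return numerator / denominator
```

With all weights equal to 1, `weighted_jaccard` reduces to `jaccard` on the sets of keys, which may be a useful sanity check when comparing the two measures.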