monkeyusage / duplicates

Removing duplicates in a smart, efficient way

duplicates

The final EDA, data quality exploration, and duplicate cleaning can be found in the notebooks section.

Our duplicate detection function can be found in the scripts section.
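The repository's actual function is not reproduced in this README, so the following is only a minimal sketch of one common approach, assuming records are plain dictionaries compared on a few key fields. The names `normalize`, `find_duplicates`, `records`, and `key_fields` are hypothetical and are not taken from the scripts section. Grouping rows by a normalized key in a dictionary finds duplicates in a single pass over the data.

```python
from collections import defaultdict


def normalize(value):
    """Lower-case and collapse whitespace so trivially different strings compare equal."""
    return " ".join(str(value).lower().split())


def find_duplicates(records, key_fields):
    """Return lists of row indices whose normalized key fields are identical.

    `records` is a list of dicts; `key_fields` names the columns to compare.
    Both names are hypothetical and not taken from the repository.
    """
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(normalize(record.get(field, "")) for field in key_fields)
        groups[key].append(index)
    # Only keys seen more than once are duplicates.
    return [indices for indices in groups.values() if len(indices) > 1]


rows = [
    {"name": "Ada Lovelace", "city": "London"},
    {"name": "ada  lovelace", "city": "london"},
    {"name": "Alan Turing", "city": "London"},
]
print(find_duplicates(rows, ["name", "city"]))  # [[0, 1]]
```

Which fields make up the key, and how aggressively to normalize them, depends on the dataset; the notebooks are the place to check what the project actually does.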

To test the duplicate detection function, go to the test directory and run pytest on test.py with "python -m pytest test.py".
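For readers unfamiliar with pytest, here is a rough sketch of what a test module of this kind could look like. It is not the repository's test.py: `drop_duplicates` is a hypothetical stand-in defined inline so the example runs on its own, whereas a real test module would import the function from the scripts section. pytest collects functions whose names start with `test_` and reports any failing `assert`.

```python
# Hypothetical contents of a test module; the real test.py is not shown in this README.


def drop_duplicates(items):
    """Stand-in for the function under test: keep the first occurrence of each item."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


def test_removes_repeated_items():
    assert drop_duplicates(["a", "b", "a", "c", "b"]) == ["a", "b", "c"]


def test_leaves_unique_input_unchanged():
    assert drop_duplicates([1, 2, 3]) == [1, 2, 3]


def test_handles_empty_input():
    assert drop_duplicates([]) == []
```

Saved under a file name pytest can discover, a module like this is run with the same "python -m pytest" command given above.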

License: MIT License


Languages

Jupyter Notebook 99.3%, Python 0.5%, Shell 0.1%, Batchfile 0.1%