Data-Mining

These are projects I have done related to data mining.

Curse of Dimensionality

This project plots a curve showing how the distances between data points become less and less distinguishable as the number of dimensions increases.
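The effect can be reproduced with a small experiment. The sketch below is not the script from this repository. It assumes points drawn uniformly from the unit hypercube and measures the relative contrast `(d_max - d_min) / d_min` of Euclidean distances from one query point. The function name `relative_contrast` and its parameters are made up for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for the distances from one random
    query point to n_points random points in the unit hypercube [0, 1]^dim."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same value
    query = [rng.random() for _ in range(dim)]
    distances = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # The contrast shrinks as dim grows: the nearest and farthest points
    # end up at almost the same distance from the query.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Plotting the printed values against `dim` should give a falling curve of the kind described above.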

Movie Recommendation System

This project has two implementations of a recommendation algorithm. The first is a naive algorithm that averages the existing ratings for a movie the user has not seen. The second, "Recommendation.py", uses a model that factors in the user's age and gender to generate predictions using K Nearest Neighbours (K=100).
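The two approaches can be sketched roughly as follows. This is an illustration, not the code in "Recommendation.py". The data layout (`users` mapping a user id to `(age, gender)`, `ratings` mapping a user id to `{movie_id: rating}`), the function names, and especially the distance metric are assumptions made for the example.

```python
def naive_predict(ratings, movie_id):
    """Mean of all known ratings for movie_id, or None if nobody rated it."""
    scores = [r[movie_id] for r in ratings.values() if movie_id in r]
    return sum(scores) / len(scores) if scores else None


def knn_predict(users, ratings, user_id, movie_id, k=100):
    """Mean rating of movie_id among the k raters closest in age and gender."""
    target_age, target_gender = users[user_id]

    def distance(other_id):
        age, gender = users[other_id]
        # Assumed metric: age gap in decades plus a penalty of 1 when the
        # genders differ. The real project may weight these differently.
        return abs(age - target_age) / 10.0 + (0.0 if gender == target_gender else 1.0)

    # Only users who actually rated the movie can act as neighbours.
    candidates = [u for u in ratings if u != user_id and movie_id in ratings[u]]
    neighbours = sorted(candidates, key=distance)[:k]
    if not neighbours:
        return None
    return sum(ratings[u][movie_id] for u in neighbours) / len(neighbours)


if __name__ == "__main__":
    users = {"a": (25, "M"), "b": (27, "M"), "c": (60, "F"), "d": (24, "M")}
    ratings = {"a": {"m1": 5}, "b": {"m1": 4}, "c": {"m1": 1}, "d": {}}
    print(naive_predict(ratings, "m1"))                  # average over all raters
    print(knn_predict(users, ratings, "d", "m1", k=2))   # average over the 2 nearest raters
```

With a small `k`, the prediction for user "d" follows the ratings of users "a" and "b", who are close in age and share the same gender, instead of the overall average.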

Decision Tree Classification System

This system builds a decision tree classifier for the given data set and estimates its accuracy using 10-fold cross-validation. The file names correspond to data sets from the UC Irvine Machine Learning Repository. Performance is best for ordinal and nominal attributes, though continuous attributes still give acceptable accuracy. To use the scripts, follow the pattern given below:

e.g. cmd> python "DecisionTree.py" 'filePathForData' 'classColumnIndex' 'impurityMeasure'

Here classColumnIndex is the index of the class column (starting at 0), and impurityMeasure is 1 for GINI or 2 for Information Gain.
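For reference, the two split criteria can be written down in a few lines. This is a sketch using the textbook definitions of Gini impurity and entropy, not an excerpt from "DecisionTree.py". The function names are hypothetical.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum over classes of p_c^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum over classes of p_c * log2(p_c)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_gain(parent, children, impurity):
    """Impurity of the parent minus the size-weighted impurity of the children.
    With impurity=entropy this is the information gain of the split."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


if __name__ == "__main__":
    parent = ["yes"] * 5 + ["no"] * 5
    perfect_split = [["yes"] * 5, ["no"] * 5]
    print(split_gain(parent, perfect_split, gini))
    print(split_gain(parent, perfect_split, entropy))
```

A tree builder would evaluate `split_gain` for each candidate attribute and split on the one with the largest value.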

