This repository contains four data mining labs. Each lab introduces new concepts on the subject.
To run and extend these labs, follow these steps:
- Make sure that Weka is installed on your operating system.
- Make sure that Python is installed on your operating system (download it from https://www.python.org/downloads/). Jupyter is optional.
- Clone the repository:
git clone https://github.com/rihemebh/Data-mining-Labs.git
The first lab is an introduction to Weka (Waikato Environment for Knowledge Analysis). You will be able to:
- Discover some datasets (including the famous Iris dataset).
- Create classifiers (Decision Tree).
- Visualize and interpret data.
- Use feature filters.
The second lab is an introduction to the Weka Experimenter interface. You will be able to:
- Generate CSV files containing experiment details.
- Interpret the different test results.
- Compare different algorithms using the Weka analyzer.
The third lab is an introduction to the scikit-learn Python library. You will be able to:
- Read and manipulate datasets (Iris dataset).
- Create and use a classifier with different algorithms (Naïve Bayes, Decision Trees).
- Evaluate classifier performance (calculating errors).
- Use cross-validation to evaluate the classifier.
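
As a rough illustration of the steps above, here is a minimal sketch, not the lab's actual notebook. It assumes scikit-learn is installed; the 70/30 split, the random seed, the 10 folds, and the variable names are arbitrary choices made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Load the Iris dataset: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing (ratio and seed are assumptions).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

classifiers = {
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    # Error rate = 1 - accuracy on the held-out test set.
    test_error = 1.0 - clf.score(X_test, y_test)
    # 10-fold cross-validation accuracy on the full dataset.
    cv_scores = cross_val_score(clf, X, y, cv=10)
    results[name] = {"test_error": test_error, "cv_accuracy": cv_scores.mean()}
    print(f"{name}: test error = {test_error:.3f}, CV accuracy = {cv_scores.mean():.3f}")
```

Comparing the single hold-out error with the cross-validated accuracy shows how much an estimate can depend on one particular split.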
The fourth lab is an introduction to the concept of unsupervised learning. You will be able to:
- Read and manipulate datasets.
- Use k-means for clustering.
- Learn the utility of the silhouette coefficient.
- Use Agglomerative Hierarchical Clustering (CAH) and generate the dendrogram.
- Use Principal Component Analysis (PCA).
- Implement the DIvisive ANAlysis (DIANA) algorithm.
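
The clustering steps above could be sketched as follows. This is an illustrative example under assumptions, not the lab's own code: it uses the Iris dataset (the lab does not say which dataset it uses), Ward linkage, a range of 2 to 6 clusters, and SciPy for the dendrogram structure. DIANA is left out because scikit-learn does not ship an implementation of it.

```python
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

# Assumption: Iris is used here only as a convenient example dataset.
X, _ = load_iris(return_X_y=True)

# Run k-means for several values of k and record the silhouette coefficient.
silhouettes = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    silhouettes[k] = silhouette_score(X, labels)

# The k with the highest silhouette coefficient is a candidate cluster count.
best_k = max(silhouettes, key=silhouettes.get)
print("silhouette per k:", {k: round(s, 3) for k, s in silhouettes.items()})
print("best k by silhouette:", best_k)

# Agglomerative hierarchical clustering with Ward linkage, cut at 3 clusters.
agg_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# Build the merge tree with SciPy; no_plot=True returns the dendrogram
# layout as a dict instead of drawing it (drop it to plot with matplotlib).
Z = linkage(X, method="ward")
tree = dendrogram(Z, no_plot=True)

# Project the 4-dimensional data onto its first two principal components,
# which is convenient for a 2D scatter plot of the clusters.
X_2d = PCA(n_components=2).fit_transform(X)
print("PCA output shape:", X_2d.shape)
```

The silhouette coefficient ranges from -1 to 1, and higher values indicate more compact, better separated clusters.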