Machine Learning in Python Workshop

The workshop is based on the scikit-learn library.

Installation
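
The workshop does not prescribe an installation method. One common route, assumed here, is installing from PyPI with `pip install scikit-learn`, which also pulls in NumPy and SciPy as dependencies. The sketch below is one way to confirm that the packages can be imported before the workshop starts; the helper name `is_installed` is hypothetical and not part of any library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# scikit-learn is imported under the name "sklearn".
# The list of modules to check is an assumption, not a workshop requirement.
for name in ("sklearn", "numpy", "scipy"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Any module reported as missing can usually be installed with pip or conda, depending on how the Python environment is managed.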

The contents of the workshop:

Pre-processing & Feature Extraction
- Pre-processing and visualisation
- Feature Selection
- Feature Extraction
Classification
- Decision Trees and Random Forests
- Support Vector Machines
- Naïve Bayesian Classifier
- K-Nearest Neighbors
- Logistic Regression
Regression
- Generalized Linear Models
- Ridge Regression (Regularization)
- Bayesian Regression
Case study 1
- Student Performance Regression & Classification case study
Clustering
- Connectivity-based clustering (Hierarchical clustering)
- Centroid-based clustering (K-means clustering)
- Distribution-based clustering (Expectation-Maximization, or EM, clustering)
- Density-based clustering (DBSCAN)
Dimensionality Reduction
- Principal Component Analysis
- Feature agglomeration
Model Selection and Evaluation
- Cross-validation: evaluating estimator performance
- Tuning the hyper-parameters of an estimator
- Model evaluation: quantifying the quality of predictions
- Model Persistence
- Validation curves: plotting scores to evaluate models
Case study 2
- Individual household electric power consumption