The workshop is based on the scikit-learn library. It also uses:
- Anaconda
- plotly
The contents of the workshop:
- Pre-processing & Feature Extraction
  - Pre-processing and visualisation
  - Feature Selection
  - Feature Extraction
- Classification
  - Decision Trees and Random Forests
  - Support Vector Machines
  - Naïve Bayes Classifier
  - K-Nearest Neighbors
  - Logistic Regression
- Regression
  - Generalized Linear Models
  - Ridge Regression (Regularization)
  - Bayesian Regression
- Case study 1
  - Student Performance: Regression & Classification case study
- Clustering
  - Connectivity-based clustering (Hierarchical clustering)
  - Centroid-based clustering (K-means clustering)
  - Distribution-based clustering (Expectation-Maximization clustering, EM)
  - Density-based clustering (DBSCAN)
- Dimensionality Reduction
  - Principal Component Analysis
  - Feature agglomeration
- Model Selection and Evaluation
  - Cross-validation: evaluating estimator performance
  - Tuning the hyper-parameters of an estimator
  - Model evaluation: quantifying the quality of predictions
  - Model Persistence
  - Validation curves: plotting scores to evaluate models
- Case study 2
  - Individual household electric power consumption
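
As a rough taste of the kind of workflow the topics above lead to, here is a minimal sketch that chains a pre-processing step, a K-Nearest Neighbors classifier, and cross-validation in scikit-learn. The Iris dataset, the `n_neighbors=5` setting, and the 5-fold split are assumptions chosen for illustration only; they are not taken from the workshop material or its case studies.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative dataset only (assumption); the case studies use other data.
X, y = load_iris(return_X_y=True)

# Pre-processing (feature scaling) and classification chained in one estimator,
# so the scaler is fitted on the training folds only during cross-validation.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

# 5-fold cross-validation: one accuracy score per fold.
scores = cross_val_score(model, X, y, cv=5)
mean_accuracy = scores.mean()
print(f"mean accuracy: {mean_accuracy:.3f}")
```

Swapping `KNeighborsClassifier` for another estimator (for example a decision tree or a support vector machine) should leave the rest of the sketch unchanged, since scikit-learn estimators share the same `fit`/`predict` interface.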