# yejg2017 / The-Elements-of-Statistical-Learning-Python-Notebooks

A series of Python Jupyter notebooks that help you better understand the book "The Elements of Statistical Learning".

# "The Elements of Statistical Learning" Notebooks

Reproducing examples from "The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani and Jerome Friedman with Python and its popular libraries: numpy, math, scipy, sklearn, pandas, tensorflow, statsmodels, sympy, catboost, pyearth, mlxtend. Almost all plotting is done with matplotlib, with seaborn used occasionally.

## Examples

The documented Jupyter Notebooks are in the examples folder:

### examples/Mixture.ipynb

Classifying points from a mixture of Gaussians using linear regression, nearest-neighbor, logistic regression with natural cubic spline basis expansion, neural networks, support vector machines, flexible discriminant analysis over MARS regression, mixture discriminant analysis, K-means clustering, Gaussian mixture model and random forests.

### examples/Prostate Cancer.ipynb

Predicting prostate-specific antigen using ordinary least squares, ridge/lasso regularized linear regression, principal components regression, partial least squares and best subset regression. Model parameters are selected by K-fold cross-validation.

### examples/South African Heart Disease.ipynb

Understanding the risk factors for heart disease using logistic regression, L1 regularized logistic regression, natural cubic spline basis expansion for nonlinearities, a thin-plate spline for mutual dependency, local logistic regression, kernel density estimation and Gaussian mixture models.

### examples/Vowel.ipynb

Vowel speech recognition using regression of an indicator matrix, linear/quadratic/regularized/reduced-rank discriminant analysis and logistic regression.

### examples/Bone Mineral Density.ipynb

Comparing patterns of relative change in bone mineral density for men and women using smoothing splines.

### examples/Phoneme Recognition.ipynb

Phoneme speech recognition using reduced-flexibility logistic regression.

### examples/Galaxy.ipynb

Analysing the radial velocity of galaxy NGC7531 using local regression in multidimensional space.

### examples/Ozone.ipynb

Analysing the factors influencing ozone concentration using local regression and trellis plot.

### examples/Spam.ipynb

Detecting email spam using logistic regression, generalized additive logistic model, decision tree, multivariate adaptive regression splines, boosting and random forest.

### examples/California Housing.ipynb

Analysing the factors influencing California house prices using boosting over decision trees and partial dependence plots.

### examples/Demographics.ipynb

Predicting the occupation of shopping mall customers, and hence identifying the demographic variables that discriminate between occupational categories, using boosting and market basket analysis.

### examples/ZIP Code.ipynb

Recognizing small hand-drawn digits using LeCun's Net-1 to Net-5 neural networks. Analysing the variation of the digit three in ZIP codes using principal component and archetypal analysis.

### examples/Human Tumor Microarray Data.ipynb

Analysing microarray data using K-means clustering and hierarchical clustering.

### examples/Country Dissimilarities.ipynb

Analysing country dissimilarities using K-medoids clustering and multidimensional scaling.

### examples/Signature.ipynb

Analysing signature shapes using Procrustes transformation.

### examples/Waveform.ipynb

Recognizing wave classes using linear, quadratic, flexible (over MARS regression) and mixture discriminant analysis, and decision trees.

### examples/SRBCT Microarray.ipynb

Analysing microarray data of 2308 genes and selecting the most significant genes for cancer classification using nearest shrunken centroids.
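
To give a feel for how nearest shrunken centroids selects genes, here is a minimal NumPy sketch of the shrinkage step. It is not taken from the notebook: the function name `shrunken_centroids`, its arguments and the synthetic data are all hypothetical, and the scaling follows the usual textbook formulation (pooled within-class standard deviation plus its median as an offset). Each class centroid is pulled towards the overall centroid by soft thresholding, and a feature whose centroids all collapse onto the overall centroid no longer influences classification.

```python
import numpy as np


def shrunken_centroids(X, y, delta):
    """Soft-threshold class centroids towards the overall centroid.

    X: (n_samples, n_features) array; y: (n_samples,) class labels
    (at least two classes); delta: non-negative shrinkage threshold.
    Returns (classes, shrunken_centroids, kept), where kept[j] is True
    if feature j still separates at least one class after shrinkage.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n = X.shape[0]

    overall = X.mean(axis=0)
    centroids = np.vstack([X[y == k].mean(axis=0) for k in classes])
    counts = np.array([np.sum(y == k) for k in classes])

    # Pooled within-class standard deviation of each feature.
    within_ss = sum(
        ((X[y == k] - centroids[i]) ** 2).sum(axis=0)
        for i, k in enumerate(classes)
    )
    s = np.sqrt(within_ss / (n - len(classes)))
    s0 = np.median(s)  # offset that guards against tiny variances

    # Standardised distance of each class centroid from the overall one.
    m = np.sqrt(1.0 / counts - 1.0 / n)
    scale = m[:, None] * (s + s0)[None, :]
    d = (centroids - overall) / scale

    # Soft thresholding: move every distance towards zero by delta.
    d_shrunk = np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)

    shrunken = overall + scale * d_shrunk
    kept = np.any(d_shrunk != 0.0, axis=0)
    return classes, shrunken, kept


# Synthetic stand-in for expression data (an assumption, not the SRBCT set):
# 40 samples, 20 features, only the first two carry class signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 20))
y = np.repeat([0, 1], 20)
X[y == 1, :2] += 4.0

classes, centroids, kept = shrunken_centroids(X, y, delta=3.0)
print("features retained:", np.flatnonzero(kept))
```

In practice `delta` is a tuning parameter, typically chosen by cross-validation. scikit-learn's `NearestCentroid` exposes a related option through its `shrink_threshold` parameter, which may be more convenient than a hand-written version.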