Talk given on Sept. 8, 2018 to the KC R Users Group.
The talk is an overview of several feature selection methods, including:
- Remove Highly Correlated Variables
- Run OLS and select significant features
- Caret’s Recursive Feature Elimination (RFE)
- Feature Importance
- glmnet
- Boruta “All Relevant” Variables
- Singular Value Decomposition (SVD)
- Principal Component Analysis (PCA)
The Forensic Glass dataset from the MASS package is used in most of the examples.
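As a rough illustration of the first method in the list, the sketch below filters highly correlated predictors in the fgl data using base R only. The 0.75 cutoff and the rule for choosing which variable of a pair to drop are assumptions made for this example, not necessarily what the talk used; caret's findCorrelation function offers a packaged version of the same idea.

```r
library(MASS)  # provides the fgl (forensic glass) data frame

# Keep the numeric predictors; "type" is the class label.
predictors <- fgl[, setdiff(names(fgl), "type")]

# Absolute pairwise correlations, with the diagonal zeroed out.
corr <- abs(cor(predictors))
diag(corr) <- 0

cutoff <- 0.75  # assumed threshold, chosen only for illustration

# Each pair above the cutoff, listed once (upper triangle only).
pairs <- which(corr > cutoff & upper.tri(corr), arr.ind = TRUE)

# Assumed rule: from each pair, drop the variable whose mean absolute
# correlation with the other predictors is larger.
mean_corr <- rowMeans(corr)
drop <- character(0)
for (i in seq_len(nrow(pairs))) {
  a <- rownames(corr)[pairs[i, "row"]]
  b <- colnames(corr)[pairs[i, "col"]]
  if (a %in% drop || b %in% drop) next  # pair already broken up
  drop <- c(drop, if (mean_corr[a] >= mean_corr[b]) a else b)
}

reduced <- predictors[, setdiff(names(predictors), drop), drop = FALSE]
print(drop)
print(names(reduced))
```

Lowering the cutoff removes more variables; the right value depends on the model that will consume the reduced predictor set.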
Because the talk introduces principal components (PCs) as predictors, the last few slides explore the PCs visually in a 3D scatterplot, both interactively and as an animated GIF.
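A minimal sketch of how the first three PCs might be extracted for such a plot, using prcomp from base R on the fgl predictors. Centering and scaling the variables is an assumption made here, and the plotting step is left out because the 3D plotting package used in the slides is not named in this summary.

```r
library(MASS)  # provides the fgl (forensic glass) data frame

# PCA on the nine numeric predictors; scaling is an assumption of this sketch.
pca <- prcomp(fgl[, setdiff(names(fgl), "type")], center = TRUE, scale. = TRUE)

# Scores on the first three PCs, which would serve as the x, y, z axes.
scores <- as.data.frame(pca$x[, 1:3])
scores$type <- fgl$type  # glass type, e.g. for coloring the points

# Proportion of total variance captured by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained[1:3], 3))
head(scores)
```

The scores data frame can be passed to any 3D scatterplot function, with type mapped to point color.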
R Markdown and corresponding HTML files:
- Forensic-Glass-FILE.Rmd and Forensic-Glass-FILE.html, where FILE is one of: Boruta, Correlation, PCA, SVD
- Forensic-Glass-caret-FILE.Rmd and Forensic-Glass-caret-FILE.html, where FILE is one of: glmnet, RFE
Some additional files mentioned can be found in a talk given last year: Using R's Caret Package for Machine Learning.