PCA - Principal Component Analysis

PCA - Principal Component Analysis | Example code and own notes while taking the course "Intro to Machine Learning" on Udacity

How to determine the principal component?

Variance: a technical term in statistics, roughly the "spread" of a data distribution (similar to standard deviation).

Example:
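A minimal sketch of the idea, assuming two made-up samples (`narrow` and `wide` are hypothetical names) that share the same mean but differ in spread. It uses only Python's standard `statistics` module:

```python
import statistics

# Two hypothetical samples with the same mean (10) but different spread.
narrow = [9.0, 10.0, 11.0]
wide = [0.0, 10.0, 20.0]

for name, data in (("narrow", narrow), ("wide", wide)):
    var = statistics.pvariance(data)  # population variance: mean squared deviation from the mean
    std = statistics.pstdev(data)     # standard deviation: square root of the variance
    print(name, "variance:", var, "std:", std)
```

The sample whose points lie further from the mean has the larger variance, which is the sense of "spread" that PCA looks for.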

PCA as a General Algorithm for Feature Transformation

Eigenvectors & eigenvalues
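One common way to compute principal components is an eigendecomposition of the data's covariance matrix: the eigenvectors give the directions, and the eigenvalues give the variance along each direction. The sketch below assumes NumPy and a made-up 2-D data set; it is an illustration, not the course's own code.

```python
import numpy as np

# Hypothetical 2-D data: most of the spread lies along the diagonal x = y.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, t + 0.1 * rng.normal(size=200)])

# Covariance matrix of the data (features are columns).
cov = np.cov(X, rowvar=False)

# eigh is intended for symmetric matrices; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Reorder so the direction with the largest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print("variance along each PC:", eigenvalues)
print("first PC direction:", eigenvectors[:, 0])
```

Each column of `eigenvectors` is a unit-length direction. The first column should point roughly along the diagonal, since that is where this data varies most.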

Review / Definition of PCA

A systematic way to transform input features into principal components (PCs).
The principal components are then used as new features.
PCs are the directions in the data that maximize variance (minimize information loss) when you project/compress the data down onto them.
The more variance the data has along a PC, the higher that PC is ranked.
Most variance/most information -> first PC
Second most variance (without overlapping with the first PC, i.e. orthogonal to it) -> second PC
Max no. of PCs = no. of input features.
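The points above can be sketched with scikit-learn's `PCA` class. The data here is made up (three features with deliberately different spreads), so this is an illustration under those assumptions rather than the course's example:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 100 samples, 3 features, with very different spreads.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3)) * np.array([5.0, 1.0, 0.1])

pca = PCA(n_components=3)     # at most as many components as input features
X_new = pca.fit_transform(X)  # new features: coordinates of each sample along the PCs

# Fraction of the total variance captured by each PC, ranked from highest to lowest.
print(pca.explained_variance_ratio_)
# Direction of the first PC expressed in the original feature space.
print(pca.components_[0])
```

`explained_variance_ratio_` is sorted in decreasing order, which matches the ranking described above: the first PC carries the most variance, the second PC the next most, and so on.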

When to use PCA

To find latent features driving the patterns in the data.
Dimensionality reduction:
- Visualize high-dimensional data
- Reduce noise
- Make other algorithms (regression, classification) work better with fewer inputs
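As a small illustration of the dimensionality-reduction use case, the sketch below projects scikit-learn's bundled digits data set (64 pixel features per image) onto its first two PCs, which is the form you would need for a 2-D scatter plot. The choice of data set is an assumption made for this example.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)          # 64 pixel features per image
X_2d = PCA(n_components=2).fit_transform(X)  # keep only the first two PCs

print(X.shape, "->", X_2d.shape)
```

The two columns of `X_2d` can be plotted against each other (colored by `y`) to get a rough picture of how the classes separate.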

F1 score

With more than two classes, accuracy is a less intuitive metric than in the 2-class case. A popular alternative is the F1 score, the harmonic mean of precision and recall.

A higher F1 score is better; it ranges from 0 to 1.
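To make the metric concrete, here is a minimal sketch that computes the F1 score for a single class by hand. The function name `f1_for_class` and the label lists are hypothetical. In practice, `sklearn.metrics.f1_score` does this for you, and its `average` parameter controls how per-class scores are combined in the multi-class case.

```python
def f1_for_class(y_true, y_pred, positive=1):
    """F1 score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(f1_for_class(y_true, y_pred))
```

Here precision and recall are both 2/3, so the F1 score is also 2/3.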

Selecting a number of principal components

Train on different numbers of PCs and see how accuracy (or another metric such as the F1 score) responds. Cut off when it becomes apparent that adding more PCs doesn't buy you much more discrimination.
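A minimal sketch of that procedure, assuming scikit-learn, its bundled digits data set as a stand-in for the course's data, and an `SVC` classifier with default settings. The candidate component counts are arbitrary choices for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scores = {}
for n in (2, 5, 10, 20, 40):
    pca = PCA(n_components=n).fit(X_train)            # fit PCA on the training data only
    clf = SVC().fit(pca.transform(X_train), y_train)  # train on the reduced features
    y_pred = clf.predict(pca.transform(X_test))
    scores[n] = f1_score(y_test, y_pred, average="macro")  # unweighted mean of per-class F1
    print(n, round(scores[n], 3))
```

Reading the printed scores from top to bottom shows where the gains flatten out; that is a reasonable place to cut off the number of PCs.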
