idc9 / ya_pca

Yet another PCA package. This one focuses on rank selection.

Yet another PCA package

An sklearn-compatible Python package for principal components analysis (PCA) that includes several methods for PCA rank selection, such as random matrix theory based thresholds, Wold-style and bi-cross-validation, Minka's method, and Horn's Parallel Analysis. The repository lists the currently supported rank selection methods along with the corresponding references.
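To give a flavor of what a rank selection method does, here is a minimal NumPy-only sketch of a common variant of Horn's Parallel Analysis: it keeps the leading components whose correlation-matrix eigenvalues exceed a quantile of the eigenvalues obtained from pure-noise data of the same shape. This is an illustration only and is not the package's implementation; the function name `parallel_analysis_rank` and its parameters (`n_sims`, `quantile`, `random_state`) are assumptions made for this example.

```python
import numpy as np


def parallel_analysis_rank(X, n_sims=50, quantile=0.95, random_state=0):
    """Estimate the PCA rank with a basic form of Horn's Parallel Analysis.

    Hypothetical helper for illustration; not part of the package's API.
    """
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape

    def corr_eigvals(A):
        # Eigenvalues of the sample correlation matrix, largest first.
        return np.linalg.eigvalsh(np.corrcoef(A, rowvar=False))[::-1]

    observed = corr_eigvals(X)

    # Eigenvalues from pure-noise data of the same shape.
    null = np.empty((n_sims, n_features))
    for i in range(n_sims):
        null[i] = corr_eigvals(rng.normal(size=(n_samples, n_features)))
    reference = np.quantile(null, quantile, axis=0)

    # Keep the leading run of components that beat the noise reference.
    exceeds = observed > reference
    return n_features if exceeds.all() else int(np.argmin(exceeds))


# Toy data: 3 factors, each driving its own block of features.
rng = np.random.default_rng(1)
loadings = np.zeros((3, 20))
loadings[0, :7] = loadings[1, 7:14] = loadings[2, 14:] = 3.0
X_toy = rng.normal(size=(300, 3)) @ loadings + rng.normal(size=(300, 20))
print(parallel_analysis_rank(X_toy))
```

Using a quantile rather than the mean of the simulated eigenvalues is a design choice that makes the rule slightly more conservative; Horn's original proposal compares against the mean.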

Installation

git clone https://github.com/idc9/pca.git
cd pca
python setup.py install

Example

from pca.PCA import PCA
from pca.toy_data import rand_factor_model

# sample data from a factor model with 10 PCA components
X = rand_factor_model(n_samples=200, n_features=100,
                      rank=10, m=2, random_state=1)[0]

# fit PCA and select the rank by thresholding
# the singular values using the Marchenko-Pastur distribution
pca = PCA(n_components='rmt_threshold',
          rank_sel_kws={'thresh_method': 'mpe'})
pca.fit(X)
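The idea behind a random matrix theory threshold can be sketched with NumPy alone. For an n x d matrix of independent noise entries with standard deviation sigma, the largest singular value concentrates near sigma * (sqrt(n) + sqrt(d)), the upper edge implied by the Marchenko-Pastur law, so singular values above that edge are attributed to signal. The sketch below assumes the noise level is known, which is rarely true in practice, and it is not the package's 'mpe' method; the names `mp_edge_rank`, `noise_std`, and `margin` are hypothetical.

```python
import numpy as np


def mp_edge_rank(X, noise_std, margin=1.0):
    """Count singular values above the Marchenko-Pastur noise edge.

    Hypothetical helper for illustration; assumes noise_std is known.
    margin > 1 inflates the threshold to guard against finite-sample
    fluctuations around the asymptotic edge.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape

    # Center the columns, as PCA does, then take the singular values.
    Xc = X - X.mean(axis=0)
    svals = np.linalg.svd(Xc, compute_uv=False)

    # Asymptotic largest singular value of a pure-noise matrix.
    thresh = margin * noise_std * (np.sqrt(n_samples) + np.sqrt(n_features))
    return int(np.sum(svals > thresh)), thresh


# Toy data: a rank-10 signal plus unit-variance Gaussian noise.
rng = np.random.default_rng(0)
signal = 3.0 * rng.normal(size=(200, 10)) @ rng.normal(size=(10, 100))
X_demo = signal + rng.normal(size=(200, 100))

rank_est, thresh = mp_edge_rank(X_demo, noise_std=1.0)
print(rank_est, thresh)
```

When the noise level is unknown it has to be estimated from the data, which is where the choice of threshold method matters.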

Help and support

Additional documentation, examples and code revisions are coming soon. For questions, issues or feature requests please reach out to Iain: idc9@cornell.edu.

Contributing

We welcome contributions that make this a stronger package: data examples, bug fixes, spelling corrections, new features, etc.

Citation

DOI

License: MIT License

