atmguille / archetypal-analysis

Implementation of Archetypal Analysis algorithms

Archetypal Analysis

This repository implements three algorithms for computing archetypes: the original algorithm, PCHA, and an adaptation of the Frank-Wolfe algorithm.

I developed this code as part of my Mathematics Undergraduate Thesis on Archetypal Analysis at UAM. Find the original Thesis in Spanish here, and an English translation here.
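
For readers new to the method, the following is a minimal sketch of the quantity that archetypal analysis minimizes: each data point is approximated by a convex combination of the archetypes, and the fit is measured by the residual sum of squares. The function names `reconstruct` and `rss` and the toy data are illustrative assumptions, not code from this repository. The sketch only evaluates the objective; it does not enforce the additional constraint that the archetypes are themselves convex combinations of the data points, and the toy archetypes are chosen by hand.

```python
def reconstruct(weights, archetypes):
    """Convex combination of the archetypes for each row of weights."""
    dim = len(archetypes[0])
    return [
        [sum(w * z[i] for w, z in zip(row, archetypes)) for i in range(dim)]
        for row in weights
    ]


def rss(data, weights, archetypes):
    """Residual sum of squares between the data and its reconstruction."""
    approx = reconstruct(weights, archetypes)
    return sum(
        (x_i - y_i) ** 2
        for x, y in zip(data, approx)
        for x_i, y_i in zip(x, y)
    )


# Toy example (hypothetical data): three archetypes at the corners of a triangle
archetypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
data = [(0.5, 0.5), (0.2, 0.2)]
# Each row of weights is non-negative and sums to 1
weights = [(0.0, 0.5, 0.5), (0.6, 0.2, 0.2)]
print(rss(data, weights, archetypes))
```

An archetypal analysis algorithm searches for the archetypes and weights that make this value as small as possible.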

Python implementation

The Python implementation directory contains Python implementations of the three algorithms. It also includes time_comparison.py, a script that compares their performance.

R implementation

Of the three algorithms, the first two (the original algorithm and PCHA) were already available as R packages. They can be installed by running the following commands:

# Original implementation
install.packages("archetypes")
library("archetypes")
# PCHA implementation (install_version is provided by the remotes package)
install.packages("remotes")
remotes::install_version("archetypal", version = "1.1.1", repos = "http://cran.us.r-project.org", dependencies = TRUE)
library("archetypal")

The adaptation of the Frank-Wolfe algorithm is implemented in the R implementation directory.
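
To give an idea of how a Frank-Wolfe iteration works in this setting, here is a minimal sketch of one subproblem: finding the convex weights that best reconstruct a single point from fixed archetypes. It is written in plain Python for brevity and is not the code in the R implementation directory; the function name `frank_wolfe_weights`, the step size 2 / (t + 2), and the iteration count are assumptions made for illustration.

```python
def frank_wolfe_weights(x, archetypes, n_iter=2000):
    """Approximate convex weights a minimizing ||x - sum_j a[j] * archetypes[j]||^2."""
    k = len(archetypes)
    dim = len(x)
    a = [1.0 / k] * k  # start at the centre of the simplex
    for t in range(n_iter):
        # Current reconstruction and its residual with respect to x
        recon = [sum(a[j] * archetypes[j][i] for j in range(k)) for i in range(dim)]
        resid = [recon[i] - x[i] for i in range(dim)]
        # Gradient with respect to a[j] is 2 * <archetypes[j], resid>
        grad = [
            2.0 * sum(archetypes[j][i] * resid[i] for i in range(dim))
            for j in range(k)
        ]
        # Linear minimization over the simplex selects the vertex
        # with the smallest gradient entry
        j_star = min(range(k), key=lambda j: grad[j])
        gamma = 2.0 / (t + 2.0)  # diminishing step size (assumed choice)
        # Move towards the selected vertex; the result stays on the simplex
        a = [(1.0 - gamma) * w for w in a]
        a[j_star] += gamma
    return a


# Hypothetical usage: a point inside the triangle spanned by three archetypes
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(frank_wolfe_weights((0.25, 0.25), corners))
```

Every update is a convex combination of the current weights and a simplex vertex, so the weights remain non-negative and sum to one without any projection step.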

Archetypal plot

One of the main features of archetypal analysis is that its archetypes are interpretable. Taking advantage of this, we have implemented a function that visualizes how the weights of a sample are distributed over a set of archetypes. This functionality is available in archetypal_plot.py and produces figures like the following one:

[Figure: Archetypal plot]
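
One common way to draw such a plot is to place the archetypes on the vertices of a regular polygon and position each sample at the weighted average of those vertices. The sketch below illustrates that idea only; it is an assumption about the general technique, not necessarily what archetypal_plot.py does, and the function name `simplex_coordinates` is hypothetical.

```python
import math


def simplex_coordinates(weights):
    """Map convex weights over k archetypes to a 2D point inside a regular k-gon."""
    k = len(weights)
    # Vertex j of a regular k-gon inscribed in the unit circle
    vertices = [
        (math.cos(2.0 * math.pi * j / k), math.sin(2.0 * math.pi * j / k))
        for j in range(k)
    ]
    # The sample sits at the weighted average of the vertices
    x = sum(w * vx for w, (vx, _) in zip(weights, vertices))
    y = sum(w * vy for w, (_, vy) in zip(weights, vertices))
    return x, y


# A sample explained entirely by the first archetype lands on that vertex,
# while a sample with equal weights lands at the centre of the polygon
print(simplex_coordinates([1.0, 0.0, 0.0]))
print(simplex_coordinates([1.0 / 3, 1.0 / 3, 1.0 / 3]))
```

The resulting coordinates can then be passed to any scatter-plot routine.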

Performance comparison

Although further details are provided in my Undergraduate Thesis, the following Figure summarizes the performance comparison of the three algorithms (in Python).

[Figure: Performance comparison]

Comparing archetypal analysis with other unsupervised learning algorithms

To demonstrate the advantages of archetypal analysis over other unsupervised methods (PCA, k-means), I compared the methods on two examples. The code is available on Kaggle at the following links:

License: Apache License 2.0

Languages: Python 89.1%, R 10.9%