RAA

Implementation of Relational Archetypal Analysis (RAA). RAA builds on the Archetypal Analysis (AA) method proposed by Cutler and Breiman. AA is an unsupervised clustering algorithm that learns the extremes of the data, called archetypes. Every data point can then be expressed as a convex combination of these archetypes. Our contribution is the extension of AA to graph data.
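To make the idea of convex combinations concrete, here is a minimal pure-Python sketch of the generic AA reconstruction on toy data. It is not taken from this repository: the helper `convex_combination`, the matrices `C` and `s`, and the toy points are hypothetical names and values chosen for illustration only.

```python
def convex_combination(points, weights):
    """Return sum_k weights[k] * points[k].

    The weights must be non-negative and sum to 1, which keeps the
    result inside the convex hull of the given points.
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) for d in range(dim)]


# Toy 2-D data set (illustrative values, not from the repository).
data = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [1.0, 1.0]]

# In AA, each archetype is itself a convex combination of data points.
# Here every row of C simply selects one extreme point of the data.
C = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
archetypes = [convex_combination(data, row) for row in C]

# Each data point is then described by convex weights over the archetypes.
s = [0.5, 0.25, 0.25]
reconstruction = convex_combination(archetypes, s)
```

With these weights the reconstruction coincides with the interior data point `[1.0, 1.0]`. In an actual model the weights would be learned rather than fixed by hand.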

This implementation samples from the sparse representation of the adjacency matrix to reduce training time, which allows RAA to scale to large networks. It has been developed for unipartite undirected graphs as well as bipartite graphs. However, it assumes that the graph consists of a single giant connected component.
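If a graph does not satisfy the single-component assumption, one option is to restrict it to its largest connected component before training. The following is a minimal sketch using only the standard library. The function `largest_connected_component` is a hypothetical helper, not part of this repository, and it assumes the graph is given as an undirected edge list in which every node appears in at least one edge.

```python
from collections import defaultdict, deque


def largest_connected_component(edges):
    """Return (kept_edges, node_map) for the largest connected component.

    `edges` is a list of (u, v) pairs of an undirected graph. Nodes of the
    kept component are relabelled to 0..n-1; `node_map` maps old ids to new.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component

    # Relabel the surviving nodes so the ids are contiguous.
    node_map = {node: i for i, node in enumerate(sorted(best))}
    kept_edges = [(node_map[u], node_map[v])
                  for u, v in edges if u in best and v in best]
    return kept_edges, node_map


# Toy graph: a triangle plus one detached edge (illustrative only).
edges = [(0, 1), (1, 2), (2, 0), (5, 6)]
kept_edges, node_map = largest_connected_component(edges)
```

In this toy graph the detached edge `(5, 6)` is dropped and only the triangle remains. For a bipartite graph the same idea applies, provided both node sets share a single id space in the edge list.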

Prerequisites

pip install -r requirements.txt

The implementation uses the PyTorch Sparse package. Installation details can be found on the PyTorch Geometric webpage.

The implementation supports both CPU and CUDA.

[Figure: original synthetic latent space and reconstruction of latent space]

[Figure: latent embedding space and archetypal maximum membership ordering of adjacency matrix]

Project Organization

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io

Project based on the cookiecutter data science project template. #cookiecutterdatascience

ChristianDjurhuus / RAA
