ClementCornet / Benchmark-Mixed-Clustering

Benchmarking clustering algorithms suited for mixed data

Benchmark-Mixed-Clustering

Benchmarking clustering algorithms suited for mixed data.

Algorithms implemented:
- K-Prototypes
- KAMILA
- Modha-Spangler
- FAMD-KMeans
- DenseClus
- ClustMD (TO DO)
- Hierarchical clustering with Gower's Distance
- MixtComp
- KCMM (TO ADD)
- Pretopological Clustering (with FAMD, Laplacian Eigenmaps, UMAP and PaCMAP)
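
As an illustration of one entry in the list above, here is a minimal sketch of hierarchical clustering with Gower's distance. It is not the repository's implementation: the `gower_matrix` helper, the toy data, and the choice of average linkage are all assumptions made for this example. Numeric features contribute a range-normalised absolute difference, categorical features contribute 0 on a match and 1 on a mismatch, and the result is the mean over all features.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def gower_matrix(numeric, categorical):
    """Pairwise Gower distances for n rows of mixed data (hypothetical helper).

    numeric: float array of shape (n, p); categorical: array of shape (n, q).
    """
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)
    # Range-normalise each numeric column so it contributes a value in [0, 1].
    ranges = numeric.max(axis=0) - numeric.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant columns then contribute zero distance
    num_part = np.abs(numeric[:, None, :] - numeric[None, :, :]) / ranges
    # Categorical columns contribute 0 on a match and 1 on a mismatch.
    cat_part = (categorical[:, None, :] != categorical[None, :, :]).astype(float)
    n_features = numeric.shape[1] + categorical.shape[1]
    return (num_part.sum(axis=2) + cat_part.sum(axis=2)) / n_features


# Toy data, made up for illustration: two numeric features, one categorical.
numeric = np.array([[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
                    [5.0, 30.0], [5.3, 31.0], [4.8, 29.0]])
categorical = np.array([["a"], ["a"], ["a"], ["b"], ["b"], ["b"]])

distances = gower_matrix(numeric, categorical)
# linkage expects the condensed (upper-triangle) form of the distance matrix.
tree = linkage(squareform(distances, checks=False), method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

Average linkage is used here because Ward linkage assumes Euclidean distances, which Gower's distance is not in general.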

The benchmark covers computational cost (memory usage, execution time) and internal validity indices (Calinski-Harabasz, Davies-Bouldin, Silhouette).
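
A rough sketch of how such measurements could be collected for a single run, assuming scikit-learn is available. The `benchmark` function is a hypothetical helper, not code from this repository, and `KMeans` on purely numeric blobs stands in for a mixed-data algorithm. With mixed data, the indices would typically be computed on a numeric embedding (for example FAMD coordinates) rather than on the raw table.

```python
import time
import tracemalloc

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def benchmark(fit_predict, X):
    """Run one clustering call; report its cost and internal validity indices."""
    tracemalloc.start()
    start = time.perf_counter()
    labels = fit_predict(X)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return {
        "time_s": elapsed,
        "peak_memory_bytes": peak,
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),  # lower is better
        "silhouette": silhouette_score(X, labels),  # in [-1, 1], higher is better
    }


# Stand-in data and model, chosen only to make the sketch runnable.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
model = KMeans(n_clusters=3, n_init=10, random_state=0)
report = benchmark(model.fit_predict, X)
for name, value in report.items():
    print(f"{name}: {value:.4g}")
```

Note that `tracemalloc` only sees allocations made through Python's memory allocator, so memory used by native libraries (or by R code) may need a process-level measurement instead.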

The benchmark uses real-world data (see the /data/ folder) and generated data.
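
One possible way to generate mixed data with known clusters is sketched below. This is an assumption for illustration, not necessarily the generator used in this repository: `make_mixed_blobs` is a hypothetical helper that draws Gaussian blobs with scikit-learn and then bins some of the columns at their quantiles to obtain categorical features.

```python
import numpy as np
from sklearn.datasets import make_blobs


def make_mixed_blobs(n_samples=200, n_numeric=2, n_categorical=2,
                     n_levels=3, centers=3, seed=0):
    """Return (numeric, categorical, true_labels) for a synthetic mixed dataset."""
    X, y = make_blobs(
        n_samples=n_samples,
        n_features=n_numeric + n_categorical,
        centers=centers,
        random_state=seed,
    )
    numeric = X[:, :n_numeric]
    to_bin = X[:, n_numeric:]
    # Interior quantile cut points: n_levels - 1 per column.
    probs = np.linspace(0, 1, n_levels + 1)[1:-1]
    categorical = np.empty(to_bin.shape, dtype=int)
    for j in range(to_bin.shape[1]):
        edges = np.quantile(to_bin[:, j], probs)
        # digitize maps each value to a level code in 0 .. n_levels - 1.
        categorical[:, j] = np.digitize(to_bin[:, j], edges)
    return numeric, categorical, y


numeric, categorical, y = make_mixed_blobs()
print(numeric.shape, categorical.shape, np.unique(categorical))
```

Because the categorical columns are derived from the same blobs as the numeric ones, they carry cluster information, and the true labels `y` remain available for external validation if needed.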


Languages

Jupyter Notebook 60.7%, Python 38.4%, R 0.9%