tbonald / hierarchy_metrics

Metric for hierarchical graph clustering

hierarchy_metrics

This repository contains the reference Python implementation of the relative entropy, a metric for assessing the quality of a hierarchical clustering of a graph, as described in:

Learning Graph Representations by Dendrograms, 2018

Dependency

The implementation depends on the networkx package, which can be installed using pip.

pip install networkx

Getting started

import networkx as nx
from hierarchy_metrics import *

Hierarchy of the Karate Club graph using Newman's algorithm:

graph = nx.karate_club_graph()
dendrogram = hierarchical_clustering(graph, algorithm="newman")

Metrics:

print("Quality:", relative_entropy(graph, dendrogram))
print("Cost:", dasgupta_cost(graph, dendrogram))

Quality: 1.3526175451991203
Cost: 0.36143984220907294
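
For readers unfamiliar with the cost metric, the following is a minimal sketch of Dasgupta's cost, not the repository's implementation. It assumes the dendrogram is given as a SciPy-style linkage list (row t merges clusters row[0] and row[1] into a new cluster n + t) and that nodes are labelled 0 to n - 1; the function name dasgupta_cost_sketch and both of these conventions are assumptions made for illustration. It returns the unnormalized cost, so its values may not match the output above.

```python
def dasgupta_cost_sketch(edges, linkage):
    """Unnormalized Dasgupta cost of a dendrogram.

    edges:   iterable of (u, v, weight) tuples of an undirected graph,
             with nodes labelled 0 .. n-1 (assumption).
    linkage: list of rows; row t merges clusters row[0] and row[1]
             into cluster n + t (SciPy-style convention, assumption).
    """
    edges = list(edges)
    n = len(linkage) + 1
    # Map each cluster id to the set of leaves it contains.
    leaves = {i: {i} for i in range(n)}
    cost = 0.0
    for t, row in enumerate(linkage):
        a, b = int(row[0]), int(row[1])
        left, right = leaves.pop(a), leaves.pop(b)
        # Edges joined by this merge have the new cluster as their
        # lowest common ancestor, so each one pays the merged size.
        cut = sum(
            w
            for u, v, w in edges
            if (u in left and v in right) or (u in right and v in left)
        )
        cost += (len(left) + len(right)) * cut
        leaves[n + t] = left | right
    return cost


# Path graph 0 - 1 - 2 - 3 with unit weights.
path_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]

# Balanced dendrogram: {0,1} and {2,3} first, then the root.
print(dasgupta_cost_sketch(path_edges, [[0, 1], [2, 3], [4, 5]]))  # 8.0

# Unbalanced dendrogram: {1,2}, then add 0, then add 3.
print(dasgupta_cost_sketch(path_edges, [[1, 2], [0, 4], [3, 5]]))  # 9.0
```

With networkx, a suitable edge list can be obtained with graph.edges(data="weight", default=1.0). A lower cost means that strongly connected nodes are merged lower in the tree, which is why the balanced dendrogram scores better on the path graph.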

Experiments

Experiments on real and synthetic data are available as a Jupyter notebook:

experiments.ipynb

License

Released under the 3-clause BSD license.
