professorwug / diffusion-curvature

Fast, noise-resistant curvature for graphs and pointclouds

Diffusion Curvature

[!INFO] This code is currently in early beta. Some features, particularly those relating to dimension estimation and the construction of comparison spaces, are experimental and will likely change. Please report any issues you encounter on the GitHub Issues page.

Diffusion curvature is a pointwise extension of Ollivier-Ricci curvature, designed specifically for the often messy world of pointcloud data. Its advantages include:

  1. Unaffected by density fluctuations in data: it inherits the diffusion operator’s denoising properties.
  2. Fast, and scalable to millions of points: it depends only on matrix powering - no optimal transport required.

Install

To install with pip (or better yet, poetry),

pip install diffusion-curvature

or

poetry add diffusion-curvature

Conda releases are pending.

Usage

To compute diffusion curvature, first create a graphtools graph from your data. Graphtools offers extensive support for different kernel types (when building from a pointcloud), and can also work with graphs in the PyGSP format. We recommend using anisotropy=1, and verifying that the supplied knn value encompasses a reasonable portion of the graph.
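To make the role of the anisotropy setting and of matrix powering concrete, here is a minimal NumPy sketch of a density-normalized diffusion matrix. It is not the graphtools implementation: the function name `diffusion_matrix`, the dense Gaussian kernel, and the bandwidth `sigma` are assumptions chosen for illustration, and `alpha` plays the role that anisotropy plays in the text above (the alpha normalization from diffusion maps).

```python
import numpy as np

def diffusion_matrix(X, sigma=0.5, alpha=1.0):
    """Row-stochastic diffusion matrix from a Gaussian kernel (illustrative only)."""
    # Pairwise squared Euclidean distances between all points.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / (2 * sigma**2))
    # Anisotropic normalization: divide out the kernel density estimate
    # q_i = sum_j K_ij. With alpha=1 the influence of sampling density is removed.
    q = K.sum(axis=1)
    K_alpha = K / np.outer(q, q) ** alpha
    # Row-normalize so that each row is a probability distribution.
    return K_alpha / K_alpha.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))          # toy pointcloud
P = diffusion_matrix(X)                # one-step diffusion
P_t = np.linalg.matrix_power(P, 8)     # t-step diffusion via matrix powering
```

Each row of `P_t` is the distribution reached by an 8-step random walk started at the corresponding point. This is the only heavy computation the method needs.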

Graphtools offers many additional options. For large graphs, you can speed up the powering of the diffusion matrix with landmarking: pass, for example, n_landmarks=1000 when creating the graphtools graph. If you enable landmarking, diffusion-curvature will automatically use it.
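The idea behind landmarking can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the graphtools code: landmarks are picked at random here and every point is assigned to its nearest landmark, whereas a real implementation would typically cluster the data more carefully. The names `membership`, `P_nm`, `P_mn` and `P_mm` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))   # toy pointcloud
n, m = len(X), 20               # n points, m landmarks

sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq / (2 * 0.5**2))  # dense Gaussian kernel

# Assumption: random landmarks, each point assigned to its nearest one.
landmark_idx = rng.choice(n, size=m, replace=False)
assign = np.argmin(sq[:, landmark_idx], axis=1)
membership = np.eye(m)[assign]  # (n, m) one-hot cluster membership

# Kernel mass between each point and each landmark cluster.
K_nm = K @ membership
P_nm = K_nm / K_nm.sum(axis=1, keepdims=True)      # point -> landmark
P_mn = K_nm.T / K_nm.T.sum(axis=1, keepdims=True)  # landmark -> point

# Only the small m x m operator is powered.
P_mm = P_mn @ P_nm
t = 8
P_t_approx = P_nm @ np.linalg.matrix_power(P_mm, t - 1) @ P_mn
```

The saving comes from powering a 20 x 20 matrix instead of a 300 x 300 one. The result is still a row-stochastic n x n matrix that approximates the t-step diffusion.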

Next, instantiate a DiffusionCurvature operator.

And, finally, pass your graph through it. The DiffusionCurvature operator stores everything it computes (the powered diffusion matrix, the estimated manifold distances, and the curvatures) as attributes of your graph. The curvatures are then available as G.ks.

# dimension is the intrinsic dimension of the data
G_torus = DC.curvature(G_torus, dimension=2)
plot_3d(X_torus, G_torus.ks, colorbar=True, title="Diffusion Curvature on the torus")

Using on a predefined graph

If you have an adjacency matrix but no pointcloud, diffusion curvature may still be useful. The caveat, currently, is that our intrinsic dimension estimation doesn’t yet support graphs, so if you want a signed curvature value you’ll have to compute and provide the dimension yourself.

If you’re only comparing relative magnitudes of curvature, you can skip this step.

For predefined graphs, we use our own ManifoldGraph class. You can create one straight from an adjacency matrix:

Alternatively, to compute just the relative magnitudes of the pointwise curvatures (without signs), we can directly use either the wasserstein_spread_of_diffusion function (which computes the $W_1$ distance from a Dirac to its t-step diffusion), or the entropy_of_diffusion function (which computes the entropy of each t-step diffusion). The latter is useful when the manifold’s geodesic distances are hard to estimate; it corresponds to replacing the Wasserstein distance with the KL divergence.

Both of these estimate an “inverse laziness” value that is inversely proportional to curvature. To use magnitude estimations in which the higher the curvature, the higher the value, we can simply take the reciprocal of the output.
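The two quantities described above can be sketched directly from an adjacency matrix. This is a minimal NumPy illustration, not the library's `wasserstein_spread_of_diffusion` or `entropy_of_diffusion`: the helper names, the lazy random walk, and the use of hop counts as geodesic distances are all assumptions. It relies on the fact that the $W_1$ distance from a Dirac at node i to any distribution p is simply the expected distance, $\sum_j p_j \, d(i, j)$, and it assumes a connected graph.

```python
import numpy as np

def lazy_random_walk(A):
    """Lazy random-walk matrix (I + D^-1 A) / 2 for adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    P = A / A.sum(axis=1, keepdims=True)
    return 0.5 * (np.eye(len(A)) + P)

def hop_distances(A):
    """All-pairs hop counts via Floyd-Warshall (fine for small, connected graphs)."""
    D = np.where(np.asarray(A) > 0, 1.0, np.inf)
    np.fill_diagonal(D, 0.0)
    for k in range(len(D)):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

def spread_and_entropy(A, t=4):
    """Per-node W1 spread and Shannon entropy of the t-step diffusion."""
    P_t = np.linalg.matrix_power(lazy_random_walk(A), t)
    # W1 from a Dirac at i to row i of P^t is the expected distance from i.
    w1 = np.sum(P_t * hop_distances(A), axis=1)
    # Entropy of each row, using the convention 0 * log(0) = 0.
    safe = np.where(P_t > 0, P_t, 1.0)
    entropy = -np.sum(P_t * np.log(safe), axis=1)
    return w1, entropy

# Toy graph: a cycle on 12 nodes, where every node looks the same.
n = 12
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

w1, entropy = spread_and_entropy(A, t=4)
magnitude = 1.0 / w1   # reciprocal, so larger values mean higher curvature
```

On the cycle, every node receives the same spread and entropy, as expected for a graph in which all nodes are equivalent. Differences between nodes only appear on graphs whose local geometry varies.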
