bmschmidt / pubmed-explorer

Scrollership through 20m pubmed abstracts.

Regional Clustering

bmschmidt opened this issue · comments

Probably needs to be postponed until after launch, but I spent a little time poking around at building a Delaunay triangulation and minimum spanning tree of 1,000,000 points from the dataset and then walking it from some random seeds to build a set of clusters. It works reasonably well, although it could use a step where nearby clusters (i.e., those sharing many short edges on the Delaunay triangulation) are merged with each other.
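
For anyone who wants to try the idea, here is a rough sketch of that pipeline in Python. It is not the code used for the experiment above. It assumes 2D points, NumPy, and SciPy 1.6 or later (for `min_only` in `dijkstra`). The function name `seeded_mst_clusters` and its parameters are made up for illustration. The "walk" is interpreted here as assigning each point to the seed that is closest along the tree, which is one plausible reading, and the cluster-merging step is not included.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import minimum_spanning_tree, dijkstra


def seeded_mst_clusters(points, n_seeds, rng=None):
    """Cluster 2D points by walking a Delaunay-based MST from random seeds.

    Hypothetical helper: returns (labels, seeds), where labels[i] is the index
    (0..n_seeds-1) of the seed nearest to point i along the tree.
    Assumes no duplicate points (zero-length edges would be dropped).
    """
    rng = np.random.default_rng(rng)
    n = len(points)

    # 1. Delaunay triangulation; each triangle contributes three edges.
    tri = Delaunay(points)
    s = tri.simplices
    edges = np.vstack([s[:, [0, 1]], s[:, [1, 2]], s[:, [0, 2]]])
    # Sort endpoints so (a, b) and (b, a) collapse, then deduplicate.
    edges = np.unique(np.sort(edges, axis=1), axis=0)
    lengths = np.linalg.norm(points[edges[:, 0]] - points[edges[:, 1]], axis=1)

    # 2. Minimum spanning tree of the Delaunay graph (edge weight = length).
    graph = coo_matrix((lengths, (edges[:, 0], edges[:, 1])), shape=(n, n)).tocsr()
    mst = minimum_spanning_tree(graph)

    # 3. Pick random seeds and grow outward along the tree from all of them
    #    at once; `sources` records which seed reached each point first.
    seeds = np.sort(rng.choice(n, size=n_seeds, replace=False))
    _, _, sources = dijkstra(
        mst, directed=False, indices=seeds,
        return_predecessors=True, min_only=True,
    )

    # Map seed point indices to compact cluster ids 0..n_seeds-1.
    labels = np.searchsorted(seeds, sources)
    return labels, seeds


# Example run on synthetic data (random points, not the PubMed layout).
points = np.random.default_rng(0).random((2000, 2))
labels, seeds = seeded_mst_clusters(points, n_seeds=20, rng=1)
```

A merging pass could then count Delaunay edges whose endpoints fall in different clusters and join the pairs that share many short ones.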

One use here would be to generate labels/other characteristics for lobes.
