jasondavies / science.js

Scientific and statistical computing in JavaScript.

Home Page: http://www.jasondavies.com/

Add an id to clusters created by hcluster

jdfekete opened this issue

I think there is currently no way to relate the nodes/clusters created by the hcluster function back to the rows of the vectors array passed to it once the clustering is computed.
Adding an "id" or "index" attribute to the created clusters would allow linking them back to their vector indices.
It adds 3 lines to the code: initializing a variable id=0 and, e.g., at line 59:
id: id++,
and at line 86:
id: id++,

Leaf clusters would have an id < vectors.length; the others would be interior nodes.
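
To illustrate how such an id could be used, here is a minimal sketch. It assumes the proposed `id` field exists on every cluster (it does not in the current code) and that leaves are numbered 0..n-1 in the order of the input vectors. The `rowsUnder` helper and the hand-built `example` tree are hypothetical, not part of science.js.

```javascript
// Sketch only: relies on the proposed `id` field, which hcluster does not set today.
// `n` is the number of input vectors, so ids below n denote leaves.
function rowsUnder(node, n) {
  if (node.id < n) return [node.id]; // leaf: the id is the row index
  var rows = [];
  if (node.left) rows = rows.concat(rowsUnder(node.left, n));
  if (node.right) rows = rows.concat(rowsUnder(node.right, n));
  return rows;
}

// Hand-built tree for three input vectors (ids 0..2 are leaves, 3..4 interior).
var example = {
  id: 4,
  left: { id: 3, left: { id: 0 }, right: { id: 1 } },
  right: { id: 2 }
};

console.log(rowsUnder(example, 3));      // [ 0, 1, 2 ]
console.log(rowsUnder(example.left, 3)); // [ 0, 1 ]
```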

I agree this could be made more intuitive, but you can currently retrieve the original vector objects via the centroid property of leaf nodes. So for a given node, you can traverse its children until you reach the leaves to obtain the vector objects. Example code:

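// Collect the original vector objects under `node` into `vectors`.
function traverse(node, vectors) {
  if (node.left || node.right) {
    // Interior node: recurse into whichever children exist.
    if (node.left) traverse(node.left, vectors);
    if (node.right) traverse(node.right, vectors);
  } else vectors.push(node.centroid); // Leaf: its centroid is the original vector.
}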

Almost, but not completely. You get the value stored at index "i" of the vectors parameter, but not the index itself. You still have to search the vectors argument for that value to find its index.
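
Until an id is available, that search can be done once up front rather than once per leaf. The following is a minimal sketch, assuming each leaf's `centroid` is the very same object (`===`) as the vector passed to hcluster, as the reply above suggests. If the library copies the vectors instead, the lookup returns `undefined`. The `leafIndices` helper and the hand-built `tree` are hypothetical stand-ins, not science.js output.

```javascript
// Sketch: recover the row indices of all leaves under `node`.
// Assumption: leaf.centroid is the same object reference as vectors[i].
function leafIndices(node, vectors) {
  // Identity-based lookup from vector object to its position in `vectors`.
  var indexOf = new Map();
  vectors.forEach(function (v, i) { indexOf.set(v, i); });

  var indices = [];
  (function visit(n) {
    if (n.left || n.right) {
      if (n.left) visit(n.left);
      if (n.right) visit(n.right);
    } else {
      indices.push(indexOf.get(n.centroid)); // undefined if centroid was copied
    }
  })(node);
  return indices;
}

// Hand-built tree with the left/right/centroid shape used in the thread.
var vectors = [[0, 0], [0, 1], [5, 5]];
var tree = {
  left: {
    left: { centroid: vectors[0] },
    right: { centroid: vectors[1] },
    centroid: [0, 0.5]
  },
  right: { centroid: vectors[2] },
  centroid: [2.5, 2.75]
};

console.log(leafIndices(tree, vectors)); // [ 0, 1, 2 ]
```

The lookup is keyed on object identity, so two rows with equal values still map to their own indices. This remains a workaround: it costs an extra pass over the vectors and depends on the identity assumption, which a built-in id would avoid.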