seung-lab / connected-components-3d

Connected components on discrete and continuous multilabel 3D & 2D images. Handles 26, 18, and 6 connected variants; periodic boundaries (4, 8, & 6)

What's the meaning of index list returned?

wenshuaizhao opened this issue

Hi, first of all, thank you for your great contribution!
I have a question about the index list that the function 'connected_components(data)' returns: why are the indices not contiguous, such as [0, 1, 2, 3]? In practice it returns a list like [0, 210, 213, 220].

Also, in the multilabel case, where the input labels are, say, [0, 1, 2], the function returns connected component labels like [0, 210, 213, 220]. How can we tell which original label each connected component belongs to?

Lastly, does this repository have any feature for finding out whether two connected components are neighbors?

Hi Wenshuai!

It's possible to recover the labels using numpy.unique. The ids are not contiguous because they are assembled via merges in the union-find algorithm. To get contiguous ids I'd have to renumber the array.
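As a rough sketch of both points, `np.unique` with `return_inverse=True` gives the ids that are present and, as a by-product, a contiguous relabeling. The `cc_labels` array below is a hand-made stand-in that mimics the ids from the question; it is not real library output, and this is not necessarily how the library would renumber internally.

```python
import numpy as np

# Hand-made stand-in for the function's output (assumption, not real output).
cc_labels = np.array(
    [[0, 210, 210],
     [213, 0, 220],
     [213, 220, 220]],
    dtype=np.uint32,
)

# `ids` holds the sorted distinct labels; `inverse` holds, for every voxel,
# the position of its label within `ids`.
ids, inverse = np.unique(cc_labels, return_inverse=True)

# Reshape explicitly, since the shape of `inverse` differs across NumPy versions.
renumbered = inverse.reshape(cc_labels.shape).astype(cc_labels.dtype)

print(ids)
print(renumbered)
```

Because 0 is the smallest id, the background stays 0 and the components become 1, 2, 3.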

I didn't do this because I was stripping everything out to be as fast as possible, but it's probably a false economy, since performing the renumber step is likely faster than calling np.unique. The fast way of doing it might use excessive memory, but I think I see a way of doing it with less on average.

I like your idea of assembling a region graph from the image; my lab might find that useful too. I'll try googling around first, but if there's nothing convenient out there I'm happy to accept pull requests or write it myself. I'm pretty busy, though, so it's hard to say when it will be available.
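In the meantime, a region graph can be approximated in plain NumPy. The sketch below is not a feature of this library: `region_graph` is a hypothetical helper, it only considers face-to-face contact (6-connectivity in 3D, 4 in 2D), and it treats 0 as background.

```python
import numpy as np

def region_graph(cc_labels):
    """Return the set of (a, b) pairs, a < b, of nonzero ids that share a face."""
    edges = set()
    for axis in range(cc_labels.ndim):
        # Pair every voxel with its successor along this axis.
        moved = np.moveaxis(cc_labels, axis, 0)
        lo, hi = moved[:-1], moved[1:]
        # Keep pairs where two different foreground ids meet.
        touching = (lo != hi) & (lo != 0) & (hi != 0)
        pairs = np.stack([lo[touching], hi[touching]], axis=1)
        if pairs.size:
            pairs = np.sort(pairs, axis=1)  # order each pair as (small, large)
            for a, b in np.unique(pairs, axis=0):
                edges.add((int(a), int(b)))
    return edges

# Hand-made example labels (assumption, not real library output).
example = np.array(
    [[1, 1, 2],
     [0, 3, 2],
     [0, 3, 0]]
)
print(sorted(region_graph(example)))
```

Two components are neighbors exactly when their pair appears in the returned set.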

@wenshuaizhao I just released version 1.2.0, which renumbers the array. There should be no memory penalty and no significant runtime penalty from this change. You can now use np.max(cc_labels) to get the number of labels in the array, which is significantly faster and uses less memory than np.unique.
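With contiguous ids, finding the original label of each component becomes a small lookup table. This is a sketch under two assumptions: each component covers voxels of exactly one input label, and the arrays below are hand-made stand-ins for the input `data` and the renumbered output `cc_labels`.

```python
import numpy as np

# Hand-made multilabel input and a matching renumbered output (assumptions).
data = np.array(
    [[1, 1, 0, 2],
     [0, 0, 0, 2],
     [1, 0, 2, 2]]
)
cc_labels = np.array(
    [[1, 1, 0, 2],
     [0, 0, 0, 2],
     [3, 0, 2, 2]]
)

# Number of components, excluding background 0.
n_components = int(np.max(cc_labels))

# original_of[i] is the input label that component i belongs to.
# Every voxel of a component writes the same value, so duplicate writes are harmless.
original_of = np.zeros(n_components + 1, dtype=data.dtype)
original_of[cc_labels] = data

print(n_components)
print(original_of)
```

Here components 1 and 3 both map back to input label 1, and component 2 maps to input label 2.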

It would be possible to return the max along with the labels and, e.g., downsize the output datatype, but that would require me to change my C++ function to return multiple values and to make a backwards-incompatible change to cc3d's return value. I might do it in the future with a major version increment.
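Until then, the downsizing can be done on the caller's side. A minimal sketch, assuming `cc_labels` is the renumbered output (here a hand-made array); `shrink_labels` is a hypothetical helper, not part of cc3d.

```python
import numpy as np

def shrink_labels(cc_labels):
    """Cast labels to the smallest unsigned integer type that holds the max id."""
    n = int(np.max(cc_labels))
    # np.min_scalar_type picks the smallest dtype able to represent n.
    return cc_labels.astype(np.min_scalar_type(n))

# Hand-made stand-in for a renumbered output (assumption).
cc_labels = np.array([[0, 1, 1], [2, 0, 3]], dtype=np.uint32)
small = shrink_labels(cc_labels)
print(small.dtype, small.nbytes, "bytes instead of", cc_labels.nbytes)
```

Note that `astype` makes a copy, so peak memory briefly includes both arrays.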

Thank you for your fast reply! I think I understand these indices now. It is really great.