jboynyc / textnets

Text analysis with networks.

Home Page: https://textnets.readthedocs.io/

Alternative Community Algorithm Usage

BradKML opened this issue

I noticed that this repo only uses Leiden, so I would like to propose supporting other community detection algorithms. One library that comes to mind is cdlib: https://cdlib.readthedocs.io/
For bipartite communities, there are other libraries that are being integrated; see GiulioRossetti/cdlib#178 (comment)

Thanks, I wasn't aware of cdlib yet. I will document how to use other community detection algorithms with textnets in a new section of advanced topics I'm adding to the documentation.

I'm not sure what it would mean substantively to use something like Infomap, which makes some different assumptions about the underlying graph. Do you have any thoughts about that?

@jboynyc Many of the other algorithms in cdlib are mainly there for speed, but they all build on a few core ideas (e.g. random walks vs. modularity vs. node similarity). With Infomap and random walks, two people with similar information flow should end up in the same community, even if a modularity-based partition would not necessarily cluster them together. Label Propagation is another class of algorithm that, instead of using walk length, uses network depth to spread labels around, similar to a zombie movie.
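To make the label-spreading idea concrete, here is a minimal sketch of label propagation in plain Python. It is not how cdlib, igraph, or textnets implement it: the function name `label_propagation`, the toy graph (two 4-cliques joined by one bridge edge), the sorted update order, and the "largest label wins on ties" rule are all assumptions chosen so the result is deterministic. Real implementations usually break ties and order updates at random.

```python
from collections import Counter


def label_propagation(adjacency, max_rounds=100):
    """Toy asynchronous label propagation (illustrative sketch only).

    adjacency: dict mapping each node to the set of its neighbors.
    Returns a list of communities (sets of nodes), sorted by smallest member.
    """
    # Every node starts out "infected" with its own unique label.
    labels = {node: node for node in adjacency}
    for _ in range(max_rounds):
        changed = False
        # Assumption: fixed (sorted) update order instead of a random one.
        for node in sorted(adjacency):
            counts = Counter(labels[nb] for nb in adjacency[node])
            if not counts:
                continue  # isolated node keeps its own label
            top = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == top]
            if labels[node] in candidates:
                continue  # current label is already a most common one
            # Assumption: deterministic tie-break, take the largest label.
            labels[node] = max(candidates)
            changed = True
        if not changed:
            break  # no label moved during a full round, so we have converged
    communities = {}
    for node, lab in labels.items():
        communities.setdefault(lab, set()).add(node)
    return sorted(communities.values(), key=min)


def clique(nodes):
    """All undirected edges among the given nodes."""
    return [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]


# Hypothetical toy graph: two 4-cliques joined by the bridge edge (3, 4).
edges = clique([0, 1, 2, 3]) + clique([4, 5, 6, 7]) + [(3, 4)]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

communities = label_propagation(adjacency)
print(communities)
```

On this toy graph the labels settle after two rounds, with one community per clique. The outcome depends on the tie-breaking rule: with a "smallest label wins" rule and the same update order, a single label floods the whole graph, which illustrates how sensitive label propagation can be to such choices.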

There are other algorithms that are specific to bipartite graphs (similar to topic model usage based on term and author), and directed graphs (similar to citation/reference/inspiration). Those are also worth testing and supporting for experimental purposes.

[Image: A classification of community detection and graph clustering methods according to the ...]

The advanced topics section of the documentation now has example code for cdlib and karateclub. Maybe I will add something for scikit-network after I have had a chance to try it out.

Thanks for bringing these projects to my attention!