Community Detection Algorithms
This repository collects datasets and baseline algorithms for community detection.
Requirements
Before compiling the code, install the following software on your system:
- Matlab
- gcc (for Linux and Mac) or Microsoft Visual Studio (for Windows)
Dataset Information
Synthetic datasets
- SBM benchmark, generated by the stochastic block model
- GN benchmark, proposed by Girvan and Newman
- LFR benchmark (available at http://sites.google.com/site/santofortunato/inthepress2/)
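As a rough illustration of how the SBM and GN benchmarks are constructed, here is a minimal Python sketch. It is not the generator used in this repository. The function names `generate_sbm` and `gn_probabilities` are hypothetical, and the GN defaults (4 groups of 32 nodes, mean degree 16) are the commonly cited parameters of the Girvan and Newman benchmark, assumed here rather than taken from this project.

```python
import random
from itertools import combinations


def generate_sbm(block_sizes, p_in, p_out, seed=None):
    """Sample an undirected graph from a simple stochastic block model.

    Nodes in the same block are linked with probability p_in,
    nodes in different blocks with probability p_out.
    Returns (edges, labels), where labels[v] is the block index of node v.
    """
    rng = random.Random(seed)
    # Assign consecutive node ids to blocks: block 0 first, then block 1, ...
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    edges = []
    for u, v in combinations(range(len(labels)), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return edges, labels


def gn_probabilities(z_out, degree=16, group_size=32, n_groups=4):
    """Edge probabilities for a GN-style graph (assumed standard parameters).

    Each node has `degree` expected neighbours, of which z_out are expected
    to lie outside its own group.
    """
    z_in = degree - z_out
    p_in = z_in / (group_size - 1)
    p_out = z_out / (group_size * (n_groups - 1))
    return p_in, p_out


if __name__ == "__main__":
    p_in, p_out = gn_probabilities(z_out=4)
    edges, labels = generate_sbm([32] * 4, p_in, p_out, seed=0)
    print(len(labels), "nodes,", len(edges), "edges")
```

Larger values of `z_out` blur the planted groups and make detection harder; the `labels` list serves as the ground truth for evaluation.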
Real datasets
- From http://snap.stanford.edu/data/index.html#communities you can download the following datasets: Amazon, dblp, youtube and Email-Eu
- From https://linqs.soe.ucsc.edu/data you can download the following datasets: citeseer, cora, WebKB, Pubmed-Diabetes, TerrorAttack and TerroristRel
- From http://math.bu.edu/people/kolaczyk/datasets.html you can download the following datasets: AIDSBlog, Ecoli_microarray, epilepsy, packet_delay, ppi, PPI_function, router_INET, TM-ESTIMATION and zachary
- From http://cb.csail.mit.edu/cb/mna/isobase/ you can download the following dataset: Isobase
- From http://socialcomputing.asu.edu/pages/datasets you can download the following datasets: BlogCatalog and Flickr
- From http://www-personal.umich.edu/~mejn/netdata/ you can download the following datasets: karate, lesmis, adjnoun, football, dolphins, polblogs, polbooks, celegansneural, power, cond-mat, cond-mat-2003, cond-mat-2005, astro-ph, hep-th, netscience and as-22july06
- From https://figshare.com/articles/American_College_Football_Network_Files/93179 you can download the following dataset: footballTSE
Note that zachary and karate are the same network. The only difference is that the zachary dataset provides ground-truth communities, while the karate dataset does not.
Test dataset
- zachary dataset
  - nodes: 34, edges: 78
  - two ground-truth communities of size >= 3
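The size threshold above can be read as a filter on the ground-truth communities. The following Python sketch shows one way such a filter might look. The function name `communities_from_labels` and the node-to-label dictionary are hypothetical; the actual ground-truth file format in this repository may differ.

```python
from collections import defaultdict


def communities_from_labels(labels, min_size=3):
    """Group nodes by ground-truth label and drop groups below min_size.

    `labels` maps node id -> community label (an assumed in-memory format).
    Returns a list of sorted node lists, ordered by label.
    """
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return [sorted(groups[label]) for label in sorted(groups)
            if len(groups[label]) >= min_size]


if __name__ == "__main__":
    # Toy labels, not the zachary data: community "b" has only two nodes.
    toy = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "c", 7: "c", 8: "c", 9: "c"}
    print(communities_from_labels(toy))
```

In the toy example, the two-node community is discarded and the other two are kept.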
How to run baseline algorithms
Global community detection algorithms
$ cd Baseline_Algorithms/Global_Algorithms/Algorithms/
$ sh complile-all.sh
$ sh run.sh
$ cd processCode
$ matlab
>> getResults % run at the MATLAB prompt
Local community detection algorithms
$ cd Baseline_Algorithms/Local_Algorithms/Algorithms/
Run each group of commands below starting from this directory. Lines beginning with >> are entered at the MATLAB prompt.
$ cd LEMON
$ matlab
>> LEMON % run LEMON algorithm
$ cd LOSP
$ matlab
>> LOSP % run LOSP algorithm
$ cd HK
$ matlab
>> mex -largeArrayDims hkgrow_mex.cpp % compile the mex file
>> HK % run HK algorithm
$ cd PR
$ matlab
>> mex -largeArrayDims pprgrow_mex.cc % compile the mex file
>> PR % run PR algorithm
$ cd PGDc_EMc
$ matlab
>> PGDc_EMc % run PGDc_EMc algorithm
$ cd YL
$ matlab
>> YL % run YL algorithm
$ cd MOV
$ matlab
>> MOV % run MOV algorithm
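To run all local algorithms without opening MATLAB interactively, a small driver script can help. The Python sketch below is only a convenience under stated assumptions: it relies on the `matlab -batch` option (available in MATLAB R2019a and later), it assumes `matlab` is on the PATH, and the names `ALGORITHMS`, `build_commands` and `run_all` are hypothetical. The directory and script names are taken from the commands above.

```python
import shlex
import subprocess
from pathlib import Path

# MATLAB statement to execute in each algorithm directory.
# HK and PR compile their mex source before running.
ALGORITHMS = {
    "LEMON": "LEMON",
    "LOSP": "LOSP",
    "HK": "mex('-largeArrayDims', 'hkgrow_mex.cpp'); HK",
    "PR": "mex('-largeArrayDims', 'pprgrow_mex.cc'); PR",
    "PGDc_EMc": "PGDc_EMc",
    "YL": "YL",
    "MOV": "MOV",
}


def build_commands(root="Baseline_Algorithms/Local_Algorithms/Algorithms"):
    """Return (working_directory, argv) pairs, one per algorithm."""
    return [(Path(root) / name, ["matlab", "-batch", statement])
            for name, statement in ALGORITHMS.items()]


def run_all(root="Baseline_Algorithms/Local_Algorithms/Algorithms", dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    for cwd, argv in build_commands(root):
        if dry_run:
            print(f"(cd {cwd} && {shlex.join(argv)})")
        else:
            # check=True stops at the first algorithm that fails.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

By default the script only prints what it would execute; call `run_all(dry_run=False)` to launch MATLAB for real.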
Announcements
Notification
Please email panshi2016@gmail.com or open an issue if you have any problems or find any bugs.
Acknowledgement
This program incorporates open-source code for the baseline algorithms from the following websites: