luleg / MARGOT

MARGOT: Motif-based pARtitioning of Graphs with OrienTed edges

Partitioning Directed Networks Based on their Motif Adjacency Matrices

This is C++ software for partitioning directed networks based on their Motif Adjacency Matrices (MAM). The main steps of the software are illustrated below.

Namely,

  1. The MAM of the directed network is built using the buildMAM software. The doc GraphletIdentifiers.pdf lists all the motifs upon which a MAM can be built.
  2. This MAM is then partitioned via the Louvain algorithm.
  3. The disconnected nodes from the MAM are postprocessed by a homemade adaptation of Louvain.
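To make the first step concrete, here is a minimal Python sketch of what a motif adjacency matrix contains. It assumes the motif is a directed 3-cycle and that entry (i, j) counts the motif instances containing both i and j, which is the usual definition. The function name build_cycle_mam is hypothetical, and this is not the buildMAM implementation; the motifs actually supported are the ones listed in the PDF.

```python
from collections import defaultdict
from itertools import permutations


def build_cycle_mam(edges):
    """Return {(i, j): count} where count is the number of directed
    3-cycles (a -> b -> c -> a) that contain both i and j."""
    edge_set = set(edges)
    nodes = sorted({n for edge in edges for n in edge})
    mam = defaultdict(int)
    for a, b, c in permutations(nodes, 3):
        # Keep one rotation per cycle: the one starting at its smallest node.
        if a != min(a, b, c):
            continue
        if (a, b) in edge_set and (b, c) in edge_set and (c, a) in edge_set:
            for u, v in ((a, b), (b, c), (a, c)):
                mam[(u, v)] += 1  # the MAM is symmetric
                mam[(v, u)] += 1
    return dict(mam)


# Toy directed network: two 3-cycles sharing the edge 1 -> 2, plus a dangling node 4.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 1), (3, 4)]
mam = build_cycle_mam(edges)
nodes_in_mam = {n for pair in mam for n in pair}
```

Here nodes 1 and 2 share two cycles, so their MAM weight is 2, while node 4 belongs to no cycle and does not appear in the MAM at all. Nodes like this are the "disconnected nodes" handled by the postprocessing step.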

🎬 No time to read? A recorded tutorial of this software is available on YouTube.

A Word about the Postprocessing Step

The postprocessing applies the Louvain algorithm to the directed network after it has been symmetrised by forgetting edge directions, and after the nodes that belong to the same cluster have been merged into a single meta-node. An additional constraint prevents the algorithm from placing, in the same cluster of the final partition, two meta-nodes that each contain more than k nodes from the initial network.
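The construction described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names build_meta_graph and may_share_cluster are hypothetical, nodes missing from the partial partition are treated as singleton meta-nodes, and the size rule reflects one reading of the constraint on k. The sketch stops before the Louvain run itself.

```python
from collections import defaultdict


def build_meta_graph(directed_edges, partial_partition):
    """Merge clustered nodes into meta-nodes and forget edge directions.

    partial_partition maps node -> cluster id for the nodes already
    clustered; every other node becomes its own meta-node (assumption).
    Returns (sizes, weights), where weights is keyed by unordered pairs.
    """
    meta_of = {n: ("c", c) for n, c in partial_partition.items()}
    all_nodes = {n for e in directed_edges for n in e} | set(partial_partition)
    sizes = defaultdict(int)
    for n in all_nodes:
        meta = meta_of.setdefault(n, ("n", n))  # singleton meta-node
        sizes[meta] += 1
    weights = defaultdict(int)
    for u, v in directed_edges:
        a, b = meta_of[u], meta_of[v]
        if a == b:
            continue  # edges inside a meta-node are skipped in this sketch
        weights[frozenset((a, b))] += 1  # unordered key: direction is dropped
    return dict(sizes), dict(weights)


def may_share_cluster(size_a, size_b, k):
    """Assumed reading of the constraint: two meta-nodes may not end up
    together when both contain more than k original nodes."""
    return not (size_a > k and size_b > k)


directed_edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]
partial_partition = {0: 0, 1: 0, 2: 0}  # nodes 3 and 4 were left out of the MAM
sizes, weights = build_meta_graph(directed_edges, partial_partition)
```

With k = 2, the 3-node meta-node may still absorb the singletons 3 and 4, but it could not be merged with another meta-node of 3 or more nodes.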

Requirements

A bash shell and a g++ compiler are enough to compile and use the software.

Tested on Ubuntu 18.04 running under Windows Subsystem for Linux 1 (WSL 1), with gcc version 7.5.0 as the compiler.

The present release contains the required files from third-party software, and can be used as a standalone.

Third-Party Software

Installation

In a bash shell, at the root of the folder:

cd src/UtilsSNAP
make
cd ../Pipeline
make
cd ../..

Usage

In short, the software can be used for six different tasks. Run from the root folder, each of the six commands below explains how to run one of them.

./src/Pipeline/pipeline -ma -h     # Task: Build a MAM.
./src/Pipeline/pipeline -pa -h     # Task: Partition a network/MAM.
./src/Pipeline/pipeline -po -h     # Task: Postprocess disconnected nodes.
./src/Pipeline/pipeline -mapa -h   # Task: Build a MAM and partition it.
./src/Pipeline/pipeline -papo -h   # Task: Partition a MAM and postprocess disconnected nodes.
./src/Pipeline/pipeline -h         # Task: Build a MAM, partition it, postprocess disconnected nodes.

Detailed Usage

A number of arguments must/can be used for each task. The table below provides a summary of these arguments, with a brief description.

More precisely:

  • The flags -ma, -pa and -po indicate which atomic task to perform. They can be combined to perform non-atomic tasks (e.g. -mapa or -papo).

  • Input argument -igraph PathToDirectedGraph provides the path to the directed graph, which must be an edge list with integer node identifiers and exactly two columns, as shown on the right.

  • Input argument -isym PathToSymmetrisedGraph can be used instead of -igraph, e.g. when the symmetrised graph has been computed, or when working solely on a MAM.

  • Input argument -imam PathToMAM provides the path to an already computed MAM.

  • Input argument -ipart PathToPartition provides the path to an already computed partition.

  • Output argument -omam PathToMAM is the path to the file in which the computed MAM will be stored.

  • Output argument -opart PathToPartition is the path to the file in which the computed partition will be stored.

  • Output argument -oppart PathToPartition is the path to the file in which the partial partition (i.e. without postprocessing of disconnected nodes) will be stored, when both partial and full partitions are computed.

⚠️ For the output arguments, if the file already exists, its content is overwritten.
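Since -igraph expects a two-column edge list of integer nodes, it can help to check a file before running the pipeline. The Python sketch below is a hypothetical helper (parse_edgelist is not part of the software). It assumes whitespace-separated columns, as in the graphlet example of this README, and it silently skips blank lines, which the software itself may or may not tolerate.

```python
def parse_edgelist(text):
    """Parse a two-column integer edge list; raise ValueError on bad lines."""
    edges = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # blank line: skipped here (assumption)
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 columns, got {len(fields)}"
            )
        try:
            u, v = int(fields[0]), int(fields[1])
        except ValueError:
            raise ValueError(
                f"line {line_no}: node identifiers must be integers"
            ) from None
        edges.append((u, v))
    return edges


# The 4-node loop used in the graphlet example of this README.
loop = parse_edgelist("0 1\n1 2\n2 3\n3 0\n")
```

A line with a third column (a weight, for instance) or with non-integer labels raises a ValueError that names the offending line.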

  • Parameter argument -m MotifIdentifier indicates the motif to use to build the MAM. See the doc GraphletIdentifiersWithOrbits.pdf for the list of all admissible motifs, along with their identifier.

💡 Not sure of your graphlet identifier? Just write the graphlet in edge list format and call graphlet_id.py:

echo -e "0 1\n1 2\n2 3\n3 0" > graphlet.txt   # 4-node loop graphlet
python3 graphlet_id.py graphlet.txt           # Should return "Graphlet Identifier is Q4740"

⚠️ This requires python3 and the library NetworkX to be installed.

  • Parameter argument -orb anchors indicates the orbits to use as anchors to build the MAM. The doc GraphletIdentifiersWithOrbits.pdf lists the orbits for each admissible motif.
  • Parameter argument -nth NumberOfThreads is the number of threads to use for building the MAM.

💡 Use this when the network is large and the motif is a quadrangle. The number of threads should be between 4 and 8.

  • Parameter argument -l levelInLouvain is the Louvain hierarchical level used as the partition.

💡 Use levelInLouvain=1 for the finest hierarchical level, which has the largest number of communities, and levelInLouvain=-2 for the coarsest hierarchical level (i.e. with the smallest number of communities).

  • Parameter argument -c ResParamLouvain is the modularity resolution parameter to be used in Louvain.

💡 The higher the resolution parameter, the smaller the communities are expected to be. Typical values lie between 1 and 2.

  • Parameter argument -cc ResParamPostproc is the modularity resolution parameter to be used in the postprocessing step.

💡 The default value is low (1e-3) to avoid the creation of new clusters in the final partition.

  • Parameter argument -k SizeMergeableMetaNode is the maximum number of nodes that a meta-node can contain to be mergeable with another meta-node.
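The effect of the two resolution parameters (-c and -cc) can be checked on a toy example. The Python sketch below uses the standard modularity with a resolution parameter for an undirected, unweighted graph, i.e. the sum over communities of L_c/m minus resolution times (d_c/2m)^2, where L_c is the number of edges inside community c and d_c its total degree. Whether the software uses exactly this objective, for instance with MAM weights, is an assumption. The function name modularity is hypothetical.

```python
def modularity(edges, communities, resolution=1.0):
    """Modularity of an undirected, unweighted graph for a given partition.

    edges: list of (u, v) pairs, each undirected edge listed once.
    communities: dict mapping node -> community label.
    """
    m = len(edges)
    intra = 0          # edges with both endpoints in the same community
    total_degree = {}  # community label -> sum of the degrees of its nodes
    for u, v in edges:
        cu, cv = communities[u], communities[v]
        if cu == cv:
            intra += 1
        total_degree[cu] = total_degree.get(cu, 0) + 1
        total_degree[cv] = total_degree.get(cv, 0) + 1
    penalty = sum((d / (2 * m)) ** 2 for d in total_degree.values())
    return intra / m - resolution * penalty


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {n: "a" for n in range(6)}

q_split_1 = modularity(edges, split, resolution=1.0)     # 6/7 - 0.5
q_merged_1 = modularity(edges, merged, resolution=1.0)   # 1 - 1 = 0
q_split_low = modularity(edges, split, resolution=1e-3)
q_merged_low = modularity(edges, merged, resolution=1e-3)
```

At resolution 1, keeping the two triangles apart scores higher than merging them. At the low value 1e-3, the merged partition scores higher, which is consistent with using a low -cc to avoid creating new clusters during postprocessing.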

Use Case

Want to see how to apply this concretely? Have a look at the folders AnalysisOf*.
