critocrito / graphctl

Analyze networks for investigations.



graphctl

Investigate complex networks.
Explore the docs »

Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Contributing
  5. License
  6. Contact

About The Project

This code was developed to help with network analyses for various investigations that took place at Der SPIEGEL and Paper Trail Media.

(back to top)

Getting Started

Prerequisites

The project uses Poetry to manage Python dependencies. See the Poetry documentation for all installation options. To use the quick installer provided by Poetry, run:

curl -sSL https://install.python-poetry.org | python3 -

Installation

  1. Clone the repo
    git clone https://github.com/critocrito/graphctl.git
  2. Install the python dependencies
    poetry install
  3. Make sure the command runs
    poetry run graphctl --help

(back to top)

Usage

The graphctl command takes a CSV file as input and writes the results of its computations to CSV files. The input CSV needs a source field and a target field; each row describes one connection between two nodes. An example network.csv could look like this:

source,target
nodeA,nodeB
nodeB,nodeC
nodeA,nodeC
nodeC,nodeD
...

Every command takes a -g/--graph option that selects either a directed or an undirected graph. It defaults to an undirected graph.

poetry run graphctl -g directed <computation>
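
To illustrate what the source and target fields and the directed/undirected choice mean, here is a minimal sketch that reads the example network into an adjacency mapping using only the Python standard library. The function name load_adjacency and the inlined NETWORK_CSV are hypothetical and exist only for this illustration; this is not how graphctl itself loads its input.

```python
import csv
import io

# The example network.csv from above, inlined so the sketch is self-contained.
NETWORK_CSV = """source,target
nodeA,nodeB
nodeB,nodeC
nodeA,nodeC
nodeC,nodeD
"""


def load_adjacency(lines, directed=False):
    """Build {node: set(neighbors)} from rows with source and target fields.

    Hypothetical helper for illustration; not part of graphctl.
    """
    adjacency = {}
    for row in csv.DictReader(lines):
        source, target = row["source"], row["target"]
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())  # keep nodes that only appear as targets
        if not directed:
            adjacency[target].add(source)  # undirected: store the edge both ways
    return adjacency


undirected = load_adjacency(io.StringIO(NETWORK_CSV))
directed = load_adjacency(io.StringIO(NETWORK_CSV), directed=True)

print(sorted(undirected["nodeC"]))  # ['nodeA', 'nodeB', 'nodeD']
print(sorted(directed["nodeC"]))    # ['nodeD']
```

In the undirected reading nodeC has three neighbors, while in the directed reading only the edge nodeC to nodeD leaves it, which is why the same CSV can give different results depending on the -g option.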

Here is a list of all possible outputs that can be generated from the above network.

All

Compute all insights about a network in one go. This includes most of the computations below.

poetry run graphctl all network.csv out_dir

Topology

  • Basic

    Compute a set of basic topological attributes to give a quick overview of the network.

    poetry run graphctl topology basic network.csv topology.csv

Centrality

  • Degree Centrality

    Degree centrality assigns an importance score based simply on the number of links held by each node. The higher a node's degree centrality, the more edges are connected to it and thus the more neighbor nodes (communication partners) it has. More precisely, the degree centrality of a node is the fraction of the other nodes it is connected to, that is, the share of the network the node has communicated with.

    poetry run graphctl centrality degree network.csv degree-centrality.csv
  • Betweenness Centrality

    Betweenness centrality measures how often a node lies on the shortest path between other nodes, meaning it acts as a bridge. In detail, the betweenness centrality of a node v is the fraction of all shortest paths between any two nodes (apart from v) that pass through v. This measure is associated with the user’s ability to influence others. A user with a high betweenness centrality acts as a bridge between many users who are not friends and can thus influence them by conveying information (e.g. by posting something or sharing a post) or even connect them via the user’s circle (which would subsequently reduce the user’s betweenness centrality).

    poetry run graphctl centrality betweenness network.csv betweenness-centrality.csv
  • Closeness Centrality

    Closeness centrality scores each node based on its ‘closeness’ to all other nodes in the network. For a node v, closeness centrality is the reciprocal of its average farness (shortest-path distance) to all other nodes. In other words, the higher the closeness centrality of v, the closer it is located to the center of the network.

    poetry run graphctl centrality closeness network.csv closeness-centrality.csv
  • Eigenvector Centrality

    Eigenvector centrality shows how well connected a node is to other important nodes in the network. It measures a node’s influence based on how well it is connected inside the network, how many links its connections have, and so on. This measure can identify the nodes with the most influence over the whole network. A high eigenvector centrality means that the node is connected to other nodes that themselves have high eigenvector centralities. The measure is associated with the user’s ability to influence the whole graph, and thus the users with the highest eigenvector centralities are the most important nodes in this network.

    poetry run graphctl centrality eigenvector network.csv eigenvector-centrality.csv
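
The degree, betweenness and closeness definitions above can be made concrete with a small standard-library sketch on the example network, read as an undirected graph. It assumes a connected, unweighted graph and uses the common normalizations (degree divided by n - 1, betweenness divided by the number of node pairs that exclude the node). All function names are hypothetical, and the sketch is not graphctl's implementation, whose exact normalization may differ.

```python
from collections import deque
from itertools import combinations

# The example network as an undirected adjacency mapping.
ADJ = {
    "nodeA": {"nodeB", "nodeC"},
    "nodeB": {"nodeA", "nodeC"},
    "nodeC": {"nodeA", "nodeB", "nodeD"},
    "nodeD": {"nodeC"},
}


def bfs(adj, source):
    """Return (distances, shortest-path counts) from source to each reachable node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:  # first time we reach w
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {v: len(adj[v]) / (n - 1) for v in adj}


def closeness_centrality(adj):
    """Reciprocal of the average shortest-path distance (assumes a connected graph)."""
    result = {}
    for v in adj:
        dist, _ = bfs(adj, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Fraction of shortest paths between other node pairs that pass through v."""
    nodes = list(adj)
    info = {v: bfs(adj, v) for v in nodes}
    score = dict.fromkeys(nodes, 0.0)
    for s, t in combinations(nodes, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are not connected
        for v in nodes:
            if v in (s, t) or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    pairs = (len(nodes) - 1) * (len(nodes) - 2) / 2
    return {v: score[v] / pairs if pairs else 0.0 for v in nodes}


degree = degree_centrality(ADJ)
closeness = closeness_centrality(ADJ)
betweenness = betweenness_centrality(ADJ)
for node in sorted(ADJ):
    print(node, round(degree[node], 3), round(closeness[node], 3), round(betweenness[node], 3))
```

On this network nodeC scores highest on all three measures: it is connected to every other node, and it is the only node that lies on shortest paths between other nodes (nodeA to nodeD and nodeB to nodeD). The pairwise betweenness loop is chosen for readability; Brandes' algorithm would be the usual choice for larger graphs.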

Communities

  • K-clique

    poetry run graphctl community k-clique network.csv k-clique-communities.csv
  • Louvain

    poetry run graphctl community louvain network.csv louvain-communities.csv
  • Label Propagation

    poetry run graphctl community label-propagation network.csv label-propagation-communities.csv

Plot

  • Graph

    Render the whole network graph.

    poetry run graphctl plot graph --iterations 15 network.csv graph.png
  • Bridges

    Render the graph and mark the bridges in the graph.

    poetry run graphctl plot bridges network.csv bridges.png
  • Degree Centrality Distribution

    Plot the distribution of degree centrality as a bar chart.

    poetry run graphctl plot degree-centrality-distribution network.csv degree-centrality.png
  • Betweenness Centrality Distribution

    Plot the distribution of betweenness centrality as a bar chart.

    poetry run graphctl plot betweenness-centrality-distribution network.csv betweenness-centrality.png
  • Eigenvector Centrality Distribution

    Plot the distribution of eigenvector centrality as a bar chart.

    poetry run graphctl plot eigenvector-centrality-distribution network.csv eigenvector-centrality.png
  • Closeness Centrality Distribution

    Plot the distribution of the closeness centrality as a bar chart.

    poetry run graphctl plot closeness-centrality-distribution network.csv closeness-centrality.png
  • Label Propagation Community

    Plot the graph with the communities computed by label propagation.

    poetry run graphctl plot label-propagation-community network.csv label-propagation.png

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the GPL-3.0 License. See LICENSE.txt for more information.

(back to top)

Contact

Christo Buschek - @christo_buschek - christo.buschek@proton.me

Project Link: https://github.com/critocrito/graphctl

(back to top)
