jdc08161063 / dgi

TensorFlow implementation of Deep Graph Infomax

Deep Graph Infomax

Deep Graph Infomax (DGI) is an unsupervised algorithm for learning node representations of a graph that can be used in downstream tasks such as node classification.

This is a TensorFlow implementation of DGI, based on the Graph Convolutional Network implementation by Thomas Kipf.

Installation

python setup.py install

Requirements

  • tensorflow (>0.12)
  • networkx

Run

First train a DGI model:

python train.py --model dgi

Once the model is trained, the graph embeddings are saved as a pickle file in the runs folder. Take note of its path (e.g. runs/2018-11-04-164053/embeddings.p) and use it to train a logistic regression model on the node classification task:

python train.py --model logreg --embeddings_path runs/2018-11-04-164053/embeddings.p
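If you want to inspect the saved embeddings before training the classifier, a minimal sketch along these lines may help. It assumes only that the file is a standard Python pickle, as stated above; the layout of the stored object is not documented here, so the demo data and the load_embeddings helper are hypothetical stand-ins rather than part of this repository.

```python
import os
import pickle
import tempfile


def load_embeddings(path):
    """Load a pickled embeddings object from `path` (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Stand-in data: 3 nodes with 2-dimensional embeddings.
    # The real file written by train.py may use a different layout.
    demo = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    with tempfile.TemporaryDirectory() as run_dir:
        path = os.path.join(run_dir, "embeddings.p")
        with open(path, "wb") as f:
            pickle.dump(demo, f)
        loaded = load_embeddings(path)
        # Report the container type and the number of entries.
        print(type(loaded).__name__, len(loaded))
```

To look at a real run, call load_embeddings with the path you noted. Only unpickle files you trust, since loading a pickle can execute arbitrary code.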

Data

In order to use your own data, you have to provide

  • an N by N adjacency matrix (N is the number of nodes),
  • an N by D feature matrix (D is the number of features per node), and
  • an N by E binary label matrix (E is the number of classes).

Have a look at the load_data() function in utils.py for an example.
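As a rough illustration of the three inputs listed above, the following sketch builds them for a toy graph with NumPy. The build_inputs helper, the edge list, and the choice of an undirected graph with dense arrays are assumptions made for this example; load_data() in utils.py may return a different representation (for instance sparse matrices), so check it before wiring in your own data.

```python
import numpy as np


def build_inputs(num_nodes, edges, features, class_ids, num_classes):
    """Build dense adjacency, feature, and one-hot label matrices (hypothetical helper)."""
    # N by N adjacency matrix.
    adj = np.zeros((num_nodes, num_nodes), dtype=np.float32)
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0  # assumption: undirected graph, so keep the matrix symmetric
    # N by D feature matrix.
    feats = np.asarray(features, dtype=np.float32)
    # N by E binary label matrix with a single 1 per row (one-hot).
    labels = np.zeros((num_nodes, num_classes), dtype=np.int32)
    labels[np.arange(num_nodes), class_ids] = 1
    return adj, feats, labels


# Toy graph: N = 4 nodes, D = 3 features per node, E = 2 classes.
adj, feats, labels = build_inputs(
    num_nodes=4,
    edges=[(0, 1), (1, 2), (2, 3)],
    features=[[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]],
    class_ids=[0, 1, 1, 0],
    num_classes=2,
)
print(adj.shape, feats.shape, labels.shape)  # (4, 4) (4, 3) (4, 2)
```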

In this example, we load citation network data (Cora, Citeseer or Pubmed). The original datasets can be found here: http://linqs.cs.umd.edu/projects/projects/lbc/. In our version (see data folder) we use dataset splits provided by https://github.com/kimiyoung/planetoid (Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov, Revisiting Semi-Supervised Learning with Graph Embeddings, ICML 2016).

You can specify a dataset as follows:

python train.py --dataset citeseer

(or by editing train.py)

Models

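The --model flag selects what to train. The commands in the Run section use the following values:

  • dgi: the Deep Graph Infomax model, which produces the embeddings
  • logreg: a logistic regression classifier trained on saved embeddings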

License

MIT License