A modular framework for node classification in graphs with node attributes.
Vijayan, P., Chandak, Y., Khapra, M. M., Parthasarathy, S., and Ravindran, B. (2022). "Scaling Graph Propagation Kernels for Predictive Learning". Frontiers in Big Data, section Data Mining and Management.
Vijayan, Priyesh, Yash Chandak, Mitesh M. Khapra, and Balaraman Ravindran. "Fusion Graph Convolutional Networks." arXiv preprint arXiv:1805.12528 (2018). This work was presented at the 14th workshop on Mining and Learning with Graphs (MLG), KDD'18.
This is a TensorFlow 1.3 implementation written in Python 3.5.
All packages required to run HOPF are provided in installations.sh
The framework provides different kernels for semi-supervised node classification.
1> Graph Convolutional Networks (GCN) Kernel
2> Node Information Preserving Kernel
3> Fusion GCN Kernel
1> Binomial basis:
- Supports Node only and Neighbor only baselines
- Skip connection supported for Graph Convolutions
- Supported kernels: simple | kipf
- Fusion model available: binomial_fusion
2> Chebyshev basis:
- Skip connection turned off internally (as it would change the basis)
- Supported kernels: chebyshev (Default internally)
3> Krylov basis:
- Skip connection turned off internally (as it would change the basis)
- Supported kernels: simple | kipf (Kipf to be preferred)
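The recursion behind these bases with a 'simple'-style kernel can be sketched roughly as follows. This is an illustrative NumPy sketch, not the repo's TensorFlow implementation; the exact update (normalization, bias terms, weight sharing) may differ.

```python
import numpy as np

def simple_kernel_hop(adj, h, x, w_node, w_neigh):
    """One recursive hop of a 'simple'-style graph-convolution kernel.

    adj : row-normalized adjacency matrix (N x N)
    h   : neighbor representations from the previous hop (N x D)
    x   : original node features (N x D)

    w_node / w_neigh are kept separate here; with shared weights a
    single matrix would be used for both terms.
    """
    node_term = x @ w_node            # node-information-preserving term
    neigh_term = (adj @ h) @ w_neigh  # aggregated neighborhood term
    return np.maximum(node_term + neigh_term, 0.0)  # ReLU nonlinearity
```

Stacking `max_depth` such hops (with `h` initialized from `x`) yields the multi-hop receptive field; a skip connection would add the previous hop's output to the result, which is why it is disabled for bases whose polynomial structure it would break.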
You can specify the following model, dataset and training parameters. For the choices of arguments and additional parameters, refer to 'parser.py'.
Model Parameters
1> propagation model with 'propModel' | 'propagation' is used in the HOPF paper; other bases/models can be chosen
2> graph kernel with 'aggKernel' | 'simple' is used in NIP Kernel
3> number of hops with 'max_depth'
4> node features with 'node_features' | 'x' is used in NIP Kernel
5> neighbor features with 'neighbor_features' | 'h' is used in NIP Kernel
6> layer dimensions with 'dims'
7> skip connections with 'skip_connectons'
8> shared node and neighbor weights with 'shared_weights' | 0: no shared weights in NIP Kernel
9> number of HOPF iterations with 'max_outer' | 5: for I-NIP model
10> number of neighbors at each layer with 'neighbors'
...
Dataset details
1> dataset directory with 'dataset'
2> labeled percentage with 'percents'
3> folds to run with 'folds'
...
Datasets used in the paper, with their train/test/val splits, are available at https://bit.ly/2ZxTipS
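Reading one split can be sketched as below, assuming the directory layout shown in the tree later in this README (adjmat.mat, features.npy, labels.npy, labels_random/&lt;percent&gt;/&lt;fold&gt;/{train,val,test}_ids.npy). The repo's dataset.py may load these differently; this is only an illustration of the layout.

```python
import os
import numpy as np
from scipy.io import loadmat

def load_split(root, dataset, percent, fold):
    """Illustrative loader for one labeled-percentage/fold split."""
    d = os.path.join(root, dataset)
    adj = loadmat(os.path.join(d, "adjmat.mat"))        # dict of matrices; the key name may vary
    features = np.load(os.path.join(d, "features.npy"))  # node attribute matrix
    labels = np.load(os.path.join(d, "labels.npy"))      # node labels
    split = os.path.join(d, "labels_random", str(percent), str(fold))
    ids = {name: np.load(os.path.join(split, name + "_ids.npy"))
           for name in ("train", "val", "test")}
    return adj, features, labels, ids
```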
Training details
1> drop learning rate with patience based stopping criteria with 'drop_lr'
2> weighted cross entropy loss with 'wce'
...
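The idea behind the 'wce' flag is to weight the cross-entropy loss by inverse class frequency so that rare classes are not drowned out by frequent ones. A minimal NumPy sketch of that idea (the exact weighting HOPF uses may differ):

```python
import numpy as np

def weighted_cross_entropy(logits, labels_onehot):
    """Cross-entropy with per-class weights inversely proportional to
    class frequency in the batch (illustrative sketch)."""
    freq = labels_onehot.mean(axis=0)
    weights = np.where(freq > 0, 1.0 / np.maximum(freq, 1e-8), 0.0)
    # numerically stable softmax
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    per_example = -(labels_onehot * np.log(probs + 1e-12)).sum(axis=1)
    sample_w = labels_onehot @ weights  # each example weighted by its class
    return (per_example * sample_w).mean()
```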
Usage:
cd HOPF/src/
export PYTHONPATH='../'
python __main__.py --dataset amazon --propModel binomial --aggKernel kipf
View script_cora.sh and run_cora.py to run multiple kernels in parallel across GPUs.
HOPF/
├── Datasets
│ ├── amazon
│ │ ├── adjmat.mat
│ │ ├── features.npy
│ │ ├── labels.npy
│ │ ├── labels_random
│ │ │ ├── 10
│ │ │ │ ├── 1
│ │ │ │ │ ├── test_ids.npy
│ │ │ │ │ ├── train_ids.npy
│ │ │ │ │ └── val_ids.npy
│ │ │ │ ├── ..
│ │ ├── 20
│ │ │ ├── ...
│ │ ...
│ ├── cora
│ └── ...
├── Experiments # Log and Outputs
│ └── 5|10|13:56:18 # Timestamp
│ └── cora # Dataset
│ └── simple # Kernel
│ └── Default
│ └── 10 # Labeled %
│ └── 2
│ ├── Checkpoints
│ ├── Embeddings
│ ├── Logs
│ └── Results
├── src # src code
│ ├── __main__.py # Main Train file
│ ├── cells
│ │ └── lstm.py
│ ├── config.py
│ ├── dataset.py
│ ├── layers
│ │ ├── batch_norm.py
│ │ ├── dense.py
│ │ ├── fusion_attention.py
│ │ ├── fusion_weighted_sum.py
│ │ ├── graph_convolutions
│ │ │ ├── chebyshev_kernel.py
│ │ │ ├── kernel.py
│ │ │ ├── kipf_kernel.py
│ │ │ ├── maxpool_kernel.py
│ │ │ └── simple_kernel.py
│ │ └── layer.py
│ ├── losses
│ │ └── laplacian_regularizer.py
│ ├── models
│ │ ├── binomial.py
│ │ ├── binomial_fusion.py
│ │ ├── chebyshev.py
│ │ ├── krylov.py
│ │ ├── krylov2.py
│ │ ├── model.py
│ │ ├── model_old.py
│ │ ├── propagation.py
│ │ └── propagation_fusion.py
│ ├── parser.py
│ ├── run.py
│ ├── run_cora.py
│ ├── script_cora.sh
│ ├── tabulate_results.py
│ └── utils
│ ├── inits.py # initializers
│ ├── metrics.py # metrics
│ ├── utils.py # miscellaneous utilities
Code Traversal
parser.py --- gets arguments
config.py --- loads arguments and sets up working environment
dataset.py --- takes in config and loads dataset
__main__.py --- takes in config and dataset objects
--- connects TF Queues with dataset objects
--- builds a model from 'models'
--- adds layers from 'layers'
--- adds ..
--- starts (mini) batch training the model
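The traversal above can be summarized as a pipeline. The sketch below is a self-contained stand-in, not the repo's real APIs; every name in it is illustrative.

```python
from types import SimpleNamespace

def parse_args():                        # parser.py: gets arguments
    return SimpleNamespace(dataset="cora", propModel="binomial",
                           aggKernel="kipf", max_depth=2)

def load_config(args):                   # config.py: sets up working environment
    return {"dataset": args.dataset, "model": args.propModel,
            "kernel": args.aggKernel, "hops": args.max_depth}

def load_dataset(config):                # dataset.py: loads graph, features, splits
    return {"name": config["dataset"]}

def build_and_train(config, dataset):    # __main__.py: model + mini-batch training
    return "trained {}/{} on {}".format(config["model"], config["kernel"],
                                        dataset["name"])

args = parse_args()
config = load_config(args)
dataset = load_dataset(config)
status = build_and_train(config, dataset)
```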
- A large portion of the initial version of this code, and parts of its more recent versions, was written in collaboration with Yash Chandak (https://yashchandak.github.io/).
- Certain segments of the codebase were forked from, or inspired by, Thomas Kipf's GCN implementation (https://github.com/tkipf/gcn/).
- This work was partly supported by a grant from Intel Technology India Pvt. Ltd. to Balaraman Ravindran and Mitesh M. Khapra.
- I also thank my friend Sandeep Mederametla, who supported us with AWS credits.