`Sparse denoising diffusion for large graph generation`

Official code for the paper, "Sparse Training of Discrete Diffusion Models for Graph Generation," available here.

Checkpoints to reproduce the results can be found at this link. Please refer to the updated version of our paper on arXiv.

Environment installation

This code was tested with PyTorch 2.0.1, CUDA 11.8, and torch_geometric 2.3.1.

Download Anaconda/Miniconda if needed.
Create a conda environment that directly contains rdkit:

conda create -c conda-forge -n sparse rdkit=2023.03.2 python=3.9
conda activate sparse
Check that this line does not return an error:

python3 -c 'from rdkit import Chem'
Install graph-tool (https://graph-tool.skewed.de/):

conda install -c conda-forge graph-tool=2.45
Check that this line does not return an error:

python3 -c 'import graph_tool as gt'
Install the CUDA toolkit (which provides nvcc) for your CUDA version. For example:

conda install -c "nvidia/label/cuda-11.8.0" cuda
Install a matching version of PyTorch, for example:

pip3 install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118
Install other packages using the requirement file:

pip install -r requirements.txt
Install mini-moses:

pip install git+https://github.com/igor-krawczuk/mini-moses
Install this package in editable mode:

pip install -e .
Navigate to the ./sparse_diffusion/analysis/orca directory and compile orca.cpp:

g++ -O2 -std=c++11 -o orca orca.cpp
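
Once the steps above are done, a quick sanity check can confirm that the main dependencies are visible to the active environment. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the module list assumes that `torch_geometric` is installed through requirements.txt.

```python
import importlib.util

# Top-level module names from the install steps above.
# Assumption: torch_geometric is pulled in by requirements.txt.
REQUIRED_MODULES = ["rdkit", "graph_tool", "torch", "torch_geometric"]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

This only checks that each module can be located; it does not import them, so it will not catch a broken CUDA build or a version mismatch.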

Run the code

All code is currently launched through `python3 main.py`. Check the Hydra documentation (https://hydra.cc/) for how to override default parameters.
To run the debugging code, use `python3 main.py +experiment=debug.yaml`. We advise running debug mode first, before launching full experiments.
To run the code on only a few batches, use `python3 main.py general.name=test`.
You can specify the dataset with `python3 main.py dataset=guacamol`. Look at configs/dataset for the list of datasets that are currently available.
You can specify the edge fraction (denoted as $\lambda$ in the paper) with `python3 main.py model.edge_fraction=0.2` to control GPU usage.
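
Several overrides can be combined on one command line as space-separated `key=value` pairs. The sketch below shows one way to assemble such a command from a dictionary, for example when scripting a sweep. The helper `build_command` is hypothetical and not part of the repository; it only builds the argument list and does not launch anything.

```python
import shlex


def build_command(overrides):
    """Build the argv list for main.py from Hydra-style key=value overrides."""
    args = ["python3", "main.py"]
    for key, value in overrides.items():
        args.append(f"{key}={value}")
    return args


# Keys and values taken from the examples above.
cmd = build_command({"dataset": "guacamol", "model.edge_fraction": 0.2})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` from the repository root if you want to launch the run from Python.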

Cite the paper

@misc{qin2023sparse,
      title={Sparse Training of Discrete Diffusion Models for Graph Generation}, 
      author={Yiming Qin and Clement Vignac and Pascal Frossard},
      year={2023},
      eprint={2311.02142},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Troubleshooting

PermissionError: [Errno 13] Permission denied: 'SparseDiff/sparse_diffusion/analysis/orca/orca': You probably did not compile orca. Compile orca.cpp as described in the last step of Environment installation.
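
To check the state of the orca binary before launching a run, something like the following may help. It is a minimal sketch: the helper `orca_status` is hypothetical, and the relative path assumes the script is run from the repository root.

```python
import os
from pathlib import Path


def orca_status(binary):
    """Report whether the given path is a file that the current user can execute."""
    path = Path(binary)
    if not path.is_file():
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


# Assumed location relative to the repository root.
print(orca_status("sparse_diffusion/analysis/orca/orca"))
```

Any result other than "ok" suggests that the compile step needs to be run (or rerun) in the orca directory.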
