Kaziaa / docking

GNN enabled surrogate modeling for chemical docking

Deep Surrogate Docking: Accelerating Automated Drug Discovery with Graph Neural Networks

This repository is the official implementation of Hosseini, Simini, Clyde, and Ramanathan's Deep Surrogate Docking: Accelerating Automated Drug Discovery with Graph Neural Networks.

Requirements

To install requirements:

pip install -r requirements.txt

📋 Note: PyTorch Geometric may require a separate installation process, depending on the system being used. If PyTorch Geometric cannot be installed using the requirements file, refer to the PyTorch Geometric Installation Guide to install it, and then use the requirements file to install all other dependencies.
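After installing, a quick check that both packages are visible to the interpreter can save a failed training run. The following is a minimal sketch, not part of the repository; it assumes the import names `torch` and `torch_geometric`, and the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names of the two packages discussed in the note above (assumed).
REQUIRED_MODULES = ("torch", "torch_geometric")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("PyTorch and PyTorch Geometric are importable.")
```

If `torch_geometric` is reported missing, follow the PyTorch Geometric Installation Guide before re-running `pip install -r requirements.txt`.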

Data

The ZINC dataset subset used in this project, along with the docking scores obtained by Lyu et al. (2019), can be downloaded directly from the authors here. The default settings in graph-dock/config.json expect this data at the path ./data/d4_table_name_smi_energy_hac_lte_25_title.csv. To store the data elsewhere, change this path in the configuration file.
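Editing the path by hand works, but it can also be scripted. The sketch below assumes the configuration is a flat JSON object; the key name `data_path` and the helper `set_config_value` are hypothetical, so check graph-dock/config.json for the actual key before using it.

```python
import json
from pathlib import Path


def set_config_value(config_path, key, value):
    """Load a JSON config, set one top-level key, and write the file back."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config[key] = value
    path.write_text(json.dumps(config, indent=4) + "\n")
    return config


# Example (hypothetical key name; check config.json for the real one):
# set_config_value("graph-dock/config.json", "data_path",
#                  "/scratch/zinc/d4_table_name_smi_energy_hac_lte_25_title.csv")
```

Note that rewriting the file this way normalizes its indentation, which may produce a noisy diff under version control.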

Before running any training or inference, this data needs to be preprocessed using the preprocess_data function in util.py. Invoking util.py as a script from the root directory of the repository creates a preprocessed version of the data consistent with the current settings in graph-dock/config.json. All other in-memory preprocessing is handled automatically by the training script.
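Before preprocessing, it can help to confirm that the downloaded file is where the configuration expects it and looks sane. This is a minimal sketch using only the standard library; the helper `peek_csv` is hypothetical and not part of util.py, and it assumes the first row of the file is a header.

```python
import csv
from pathlib import Path

# Default location expected by graph-dock/config.json, per the text above.
DEFAULT_CSV = Path("./data/d4_table_name_smi_energy_hac_lte_25_title.csv")


def peek_csv(path=DEFAULT_CSV, n_rows=3):
    """Return the header row and the first `n_rows` data rows of a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


if __name__ == "__main__":
    if DEFAULT_CSV.exists():
        header, rows = peek_csv()
        print("columns:", header)
        for row in rows:
            print(row)
    else:
        print(f"{DEFAULT_CSV} not found; download the data or update config.json.")
```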

Training

To train the model(s) in the paper, modify the configuration file ./graph-dock/config.json as needed and run this command:

python train.py

The provided ./graph-dock/config.json file contains the default hyperparameters used to obtain the FiLMv2 results in our work.

Evaluation

To evaluate a model on a subset of the ZINC dataset, run:

python inference.py 

Note that this script also depends on the ./graph-dock/config.json file, which currently contains the default parameters used to obtain the results in our work.

Results

Our models achieve the following performance on a withheld test partition of the ZINC dataset:

| Model name | W-MSE | RES Score |
|------------|-------|-----------|
| GIN        | 0.402 | 0.742     |
| GAT        | 0.396 | 0.763     |
| FiLM       | 0.389 | 0.768     |
| FiLMv2     | 0.383 | 0.773     |

Please refer to our paper for more details.
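To put the differences between models in perspective, the reported numbers can be compared directly. The sketch below only re-uses the values from the results above; it assumes that a lower W-MSE is better, as for any squared-error metric, and the function name `relative_wmse_reduction` is illustrative.

```python
# Results copied from the reported values: model -> (W-MSE, RES score).
RESULTS = {
    "GIN": (0.402, 0.742),
    "GAT": (0.396, 0.763),
    "FiLM": (0.389, 0.768),
    "FiLMv2": (0.383, 0.773),
}


def relative_wmse_reduction(model, baseline="GIN"):
    """Fractional reduction in W-MSE of `model` relative to `baseline`."""
    base = RESULTS[baseline][0]
    return (base - RESULTS[model][0]) / base


# Model with the lowest W-MSE.
best = min(RESULTS, key=lambda name: RESULTS[name][0])
print(best)  # FiLMv2
print(round(100 * relative_wmse_reduction("FiLMv2"), 1))  # 4.7 (percent)
```

By this measure, FiLMv2 reduces W-MSE by roughly 4.7% relative to GIN.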

Contributing

We greatly welcome suggestions and contributions to our code! Please feel free to fork this repository, hack away, and submit a pull request.

Citing this work

If you find our work useful, please cite it using the following BibTeX:

@inproceedings{deep_surrogate_dock,
  author    = {Hosseini, Ryien and Simini, Filippo and Clyde, Austin and Ramanathan, Arvind},
  title     = {Deep Surrogate Docking: Accelerating Automated Drug Discovery with Graph Neural Networks},
  year      = {2022},
  maintitle = {Advances in Neural Information Processing Systems 35 (2022)},
  booktitle = {Workshop on AI for Science: Progress and Promises},
}

License: BSD 3-Clause "New" or "Revised" License

