itakigawa / pyg_chemprop

A concise and easy-to-customize reimplementation of "ChemProp" (Yang et al, 2019) in PyTorch Geometric.

ChemProp in PyTorch Geometric

A concise and easy-to-customize reimplementation of "ChemProp" (Yang et al, 2019) in PyTorch Geometric.

Features

  • "pyg_chemprop_utils" includes a converter from SMILES strings to PyG objects, each defining a molecular graph with the atom and bond features of the original ChemProp (this requires RDKit).
  • "pyg_chemprop.py" uses pytorch_scatter (the torch_scatter package) and requires the "index of reverse edges" for ChemProp. You need to preprocess PyG datasets (or lists of PyG objects) like this:

      from pyg_chemprop import RevIndexedDataset
      from ogb.graphproppred import PygGraphPropPredDataset

      # Load the OGB dataset into dataset/ (downloaded if not already there).
      pyg_dataset = PygGraphPropPredDataset(name="ogbg-molhiv", root="dataset/")
      # Wrap it so that each graph also carries the index of reverse edges.
      dataset = RevIndexedDataset(pyg_dataset)

  • "pyg_chemprop_naive.py" does not use pytorch_scatter. It is very slow, but it makes it easy to see what is going on inside ChemProp.
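
To make the "index of reverse edges" concrete, here is a minimal, dependency-free sketch. ChemProp passes messages along directed bonds, and the message for an edge (v, w) sums the hidden states of the edges entering v, except the reverse edge (w, v). One common way to compute this is to sum everything entering v and then subtract the reverse edge's state, which is why each edge needs to know the position of its reverse. The function names `reverse_edge_index` and `aggregate_messages` are hypothetical and are not the repository's API. Plain Python lists and scalar hidden states stand in for the tensors and scatter operations used in "pyg_chemprop.py".

```python
def reverse_edge_index(edge_index):
    """Return, for each directed edge (u, v), the position of edge (v, u)."""
    sources, targets = edge_index
    position = {(u, v): i for i, (u, v) in enumerate(zip(sources, targets))}
    # Raises KeyError if some directed edge has no reverse partner.
    return [position[(v, u)] for u, v in zip(sources, targets)]


def aggregate_messages(edge_index, rev_index, hidden):
    """For each edge (v, w), sum the states of edges entering v except (w, v)."""
    sources, targets = edge_index
    num_nodes = max(max(sources), max(targets)) + 1
    # Sum of hidden states over all edges entering each node (a "scatter add").
    incoming = [0.0] * num_nodes
    for tgt, h in zip(targets, hidden):
        incoming[tgt] += h
    # Everything entering the source node, minus the reverse edge's own state.
    return [incoming[src] - hidden[rev] for src, rev in zip(sources, rev_index)]


# Three-atom chain 0-1-2; each bond is stored as two directed edges.
edge_index = [[0, 1, 1, 2], [1, 0, 2, 1]]
rev_index = reverse_edge_index(edge_index)
hidden = [1.0, 2.0, 4.0, 8.0]  # one scalar hidden state per directed edge
messages = aggregate_messages(edge_index, rev_index, hidden)

print(rev_index)  # [1, 0, 3, 2]
print(messages)   # [0.0, 8.0, 1.0, 0.0]
```

In this toy graph, the edge 1→0 receives only the state of the edge 2→1 (8.0), because the state of its own reverse edge 0→1 is subtracted out.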

Usage

See test_ogbg-molhiv.ipynb.

Speed Test

Environment

  • torch 1.8.1
  • torch_geometric 2.0.1
  • torch_scatter 2.0.8
  • ogb 1.3.2

Results

  • data: "ogbg-molhiv" training dataset (32,901 molecules)
  • batch_size: 50
  • gpu: A100-PCIE-40GB
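
| implementation | device | features | time per epoch |
|---|---|---|---|
| original chemprop | CPU | chemprop | 70 sec |
| original chemprop | GPU | chemprop | 15 sec |
| ours (w/ pytorch_scatter) | CPU | ogb default | 28 sec |
| ours (w/ pytorch_scatter) | GPU | ogb default | 5 sec |
| ours (w/ pytorch_scatter) | CPU | chemprop | 59 sec |
| ours (w/ pytorch_scatter) | GPU | chemprop | 7 sec |
| ours (w/o pytorch_scatter) | CPU | ogb default | 1277 sec |
| ours (w/o pytorch_scatter) | GPU | ogb default | 1743 sec |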
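
The relative speedups implied by these timings can be worked out with a few lines of arithmetic. The sketch below only restates the numbers reported above, and it assumes each reported time is representative of a typical epoch. The dictionary layout and the `speedup` helper are illustrative and are not part of the repository.

```python
# Seconds per epoch, copied from the results above.
seconds_per_epoch = {
    ("original chemprop", "CPU", "chemprop"): 70,
    ("original chemprop", "GPU", "chemprop"): 15,
    ("ours (w/ pytorch_scatter)", "CPU", "ogb default"): 28,
    ("ours (w/ pytorch_scatter)", "GPU", "ogb default"): 5,
    ("ours (w/ pytorch_scatter)", "CPU", "chemprop"): 59,
    ("ours (w/ pytorch_scatter)", "GPU", "chemprop"): 7,
    ("ours (w/o pytorch_scatter)", "CPU", "ogb default"): 1277,
    ("ours (w/o pytorch_scatter)", "GPU", "ogb default"): 1743,
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return seconds_per_epoch[baseline] / seconds_per_epoch[candidate]


# Same features (chemprop), original implementation vs. this one.
cpu_vs_original = speedup(
    ("original chemprop", "CPU", "chemprop"),
    ("ours (w/ pytorch_scatter)", "CPU", "chemprop"),
)
gpu_vs_original = speedup(
    ("original chemprop", "GPU", "chemprop"),
    ("ours (w/ pytorch_scatter)", "GPU", "chemprop"),
)

# Same features (ogb default), naive version vs. pytorch_scatter version.
cpu_vs_naive = speedup(
    ("ours (w/o pytorch_scatter)", "CPU", "ogb default"),
    ("ours (w/ pytorch_scatter)", "CPU", "ogb default"),
)
gpu_vs_naive = speedup(
    ("ours (w/o pytorch_scatter)", "GPU", "ogb default"),
    ("ours (w/ pytorch_scatter)", "GPU", "ogb default"),
)

print(f"vs. original chemprop: CPU {cpu_vs_original:.2f}x, GPU {gpu_vs_original:.2f}x")
print(f"vs. naive version:     CPU {cpu_vs_naive:.1f}x, GPU {gpu_vs_naive:.1f}x")
```

With the chemprop features, this implementation is roughly 1.2x faster than the original on CPU and 2.1x faster on GPU. With the ogb default features, the pytorch_scatter version is roughly 46x faster than the naive version on CPU and 349x faster on GPU. The naive version is also slower on GPU than on CPU.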

ChemProp (Yang et al, 2019)

"ChemProp" is a simple but effective Graph Neural Network (GNN) for molecular property prediction. It was successfully used in antibiotic discovery by the Machine Learning for Pharmaceutical Discovery and Synthesis Consortium (MLPDS) at MIT.

Author

  • Ichigaku Takigawa

License: MIT License

