Ryan-Rhys / GProTorch

Project Status: Active. The project has reached a stable, usable state and is being actively developed.
License: MIT | DOI: 10.48550/arXiv.2212.04450 | Code style: black

Documentation | Paper

A Gaussian Process Library for Molecules, Proteins and Reactions.

What's New?

BNN Regression on Molecules (Colab notebook)
Bayesian Optimisation Over Molecules (Colab notebook)

Install

We recommend using a conda virtual environment:

conda env create -f conda_env.yml

pip install --no-deps rxnfp
pip install --no-deps drfp
pip install transformers

Optional, for running the tests:

pip install gpflow grakel

Example usage

BNN Regression on Molecules

Tutorial (BNN Regression on Molecules) Docs
from gauche.dataloader import DataLoaderMP
from gauche.dataloader.data_utils import transform_data
from sklearn.model_selection import train_test_split

# `dataset`, `dataset_paths`, `feature`, `test_set_size` and the random seed `i`
# are not defined in this excerpt; set them before running.

# Load the benchmark dataset and featurise the molecules
loader = DataLoaderMP()
loader.load_benchmark(dataset, dataset_paths[dataset])
loader.featurize(feature)
X = loader.features
y = loader.labels

# Hold out a test set; `i` seeds the split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_set_size, random_state=i)

# We standardise the outputs but leave the inputs unchanged
_, y_train, _, y_test, y_scaler = transform_data(X_train, y_train, X_test, y_test)
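
To illustrate what standardising the outputs means, here is a minimal sketch in plain Python. It is not the library's `transform_data` implementation. It assumes the scaling statistics are computed on the training targets only and then reused for the test targets, and that a returned scaler such as `y_scaler` serves to map values back to the original units. The helper names `fit_scaler`, `standardise` and `unstandardise` are hypothetical.

```python
from statistics import fmean, pstdev


def fit_scaler(values):
    """Return (mean, std) estimated from the training targets only."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    # Guard against constant targets (a choice made for this sketch)
    return mean, (std if std > 0 else 1.0)


def standardise(values, mean, std):
    """Shift and scale values using previously fitted statistics."""
    return [(v - mean) / std for v in values]


def unstandardise(values, mean, std):
    """Map standardised values back to the original units."""
    return [v * std + mean for v in values]


# Toy targets, purely illustrative
y_train = [1.0, 2.0, 3.0, 4.0]
y_test = [2.5, 5.0]

mean, std = fit_scaler(y_train)
y_train_scaled = standardise(y_train, mean, std)
y_test_scaled = standardise(y_test, mean, std)  # reuses the training statistics
y_test_restored = unstandardise(y_test_scaled, mean, std)
```

Fitting the statistics on the training split alone keeps information about the test targets out of the model, which is why the test set is transformed with the training mean and standard deviation rather than its own.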

Bayesian Optimisation Over Molecules

Tutorial (Bayesian Optimisation Over Molecules) Docs
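from botorch.models.gp_regression import SingleTaskGP
from gpytorch.distributions import MultivariateNormal
from gpytorch.kernels import ScaleKernel
from gpytorch.likelihoods import GaussianLikelihood
from gpytorch.means import ConstantMean
from gprotorch.kernels.fingerprint_kernels.tanimoto_kernel import TanimotoKernel

# We define our custom GP surrogate model using the Tanimoto kernel
class TanimotoGP(SingleTaskGP):

    def __init__(self, train_X, train_Y):
        # Pass the likelihood by keyword so it is not mistaken for another positional argument
        super().__init__(train_X, train_Y, likelihood=GaussianLikelihood())
        self.mean_module = ConstantMean()
        self.covar_module = ScaleKernel(base_kernel=TanimotoKernel())
        self.to(train_X)  # make sure we're on the right device/dtype

    def forward(self, x):
        # Prior mean and covariance evaluated at the inputs
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return MultivariateNormal(mean_x, covar_x)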
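
As a rough illustration of what a Tanimoto kernel measures, the sketch below computes the Tanimoto similarity between fingerprint vectors in plain Python, using the standard definition k(x, y) = <x, y> / (||x||^2 + ||y||^2 - <x, y>). It is not the library's `TanimotoKernel`, which operates on tensors. The function names, the toy fingerprints, and the value returned for two all-zero vectors are assumptions made for this example.

```python
def tanimoto_similarity(x, y):
    """Tanimoto similarity between two equal-length, non-negative vectors."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # Two all-zero vectors are treated as identical (a convention chosen for this sketch)
    return dot / denom if denom > 0 else 1.0


def tanimoto_gram(fingerprints):
    """Pairwise similarity matrix, returned as a list of rows."""
    return [[tanimoto_similarity(a, b) for b in fingerprints] for a in fingerprints]


# Toy binary fingerprints, purely illustrative
fp_a = [1, 1, 0, 1]
fp_b = [1, 0, 0, 1]
fp_c = [0, 0, 1, 0]

gram = tanimoto_gram([fp_a, fp_b, fp_c])
```

For binary fingerprints this reduces to the number of shared on-bits divided by the number of bits set in either vector, so `fp_a` and `fp_b` (two shared bits, three set overall) score 2/3, while `fp_a` and `fp_c` share nothing and score 0. In the model above, `ScaleKernel` multiplies such a base kernel by a learned output scale.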

Citing

If GAUCHE is useful for your work, please consider citing the following paper:

@misc{griffiths2022gauche,
      title={GAUCHE: A Library for Gaussian Processes in Chemistry}, 
      author={Ryan-Rhys Griffiths and Leo Klarner and Henry B. Moss and Aditya Ravuri and Sang Truong and Bojana Rankovic and Yuanqi Du and Arian Jamasb and Julius Schwartz and Austin Tripp and Gregory Kell and Anthony Bourached and Alex Chan and Jacob Moss and Chengzhi Guo and Alpha A. Lee and Philippe Schwaller and Jian Tang},
      year={2022},
      eprint={2212.04450},
      archivePrefix={arXiv},
      primaryClass={physics.chem-ph}
}
