A Gaussian Process Library for Molecules, Proteins and Reactions.
BNN Regression on Molecules | Bayesian Optimisation Over Molecules
We recommend using a conda virtual environment:
conda env create -f conda_env.yml
pip install --no-deps rxnfp
pip install --no-deps drfp
pip install transformers
Optional, for running the tests:
pip install gpflow grakel
Tutorial (BNN Regression on Molecules) | Docs
from gauche.dataloader import DataLoaderMP
from gauche.dataloader.data_utils import transform_data
from sklearn.model_selection import train_test_split

# `dataset`, `dataset_paths`, `feature`, `test_set_size` and the seed `i`
# are placeholders that the surrounding script is expected to define.
loader = DataLoaderMP()
loader.load_benchmark(dataset, dataset_paths[dataset])
loader.featurize(feature)
X = loader.features
y = loader.labels

# Random train/test split, seeded with `i` for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=test_set_size, random_state=i
)

# We standardise the outputs but leave the inputs unchanged
_, y_train, _, y_test, y_scaler = transform_data(X_train, y_train, X_test, y_test)
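As a rough illustration of what standardising the outputs means, the sketch below fits a mean and standard deviation on the training labels only, applies them to both splits, and maps values back to the original scale. It is a dependency-free approximation and not GAUCHE's implementation: the helper names `fit_scaler`, `standardise` and `unstandardise` are hypothetical, the toy labels are made up, and the use of the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev


def fit_scaler(values):
    """Return (mean, std) computed from the training labels only."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (assumed convention)
    # Guard against constant labels, which would give a zero divisor.
    return mean, (std if std > 0 else 1.0)


def standardise(values, mean, std):
    """Shift and scale labels to zero mean and unit variance."""
    return [(v - mean) / std for v in values]


def unstandardise(values, mean, std):
    """Map standardised values back to the original label scale."""
    return [v * std + mean for v in values]


# Toy labels, purely illustrative
y_train_raw = [1.0, 2.0, 3.0, 4.0, 5.0]
y_test_raw = [2.5, 6.0]

mean, std = fit_scaler(y_train_raw)
y_train_std = standardise(y_train_raw, mean, std)
y_test_std = standardise(y_test_raw, mean, std)  # reuses training statistics

# Predictions made on the standardised scale can be mapped back like this
y_test_back = unstandardise(y_test_std, mean, std)
```

Fitting the statistics on the training split alone avoids leaking information from the test labels. The `y_scaler` returned by `transform_data` presumably serves the same purpose as the `(mean, std)` pair here, for converting model predictions back to the original units.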
Tutorial (Bayesian Optimisation Over Molecules) | Docs
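from botorch.models.gp_regression import SingleTaskGP
from gauche.kernels.fingerprint_kernels.tanimoto_kernel import TanimotoKernel
from gpytorch.distributions import MultivariateNormal
from gpytorch.kernels import ScaleKernel
from gpytorch.likelihoods import GaussianLikelihood
from gpytorch.means import ConstantMean


# We define our custom GP surrogate model using the Tanimoto kernel
class TanimotoGP(SingleTaskGP):
    def __init__(self, train_X, train_Y):
        # Pass the likelihood by keyword so the call does not depend on
        # the positional argument order of SingleTaskGP
        super().__init__(train_X, train_Y, likelihood=GaussianLikelihood())
        self.mean_module = ConstantMean()
        self.covar_module = ScaleKernel(base_kernel=TanimotoKernel())
        self.to(train_X)  # make sure we're on the right device/dtype

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return MultivariateNormal(mean_x, covar_x)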
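For intuition about what a Tanimoto kernel measures, here is a minimal pure-Python sketch of the standard Tanimoto similarity between two fingerprint vectors, k(x, y) = <x, y> / (||x||^2 + ||y||^2 - <x, y>). It is not the `TanimotoKernel` class used above, which operates on batched tensors. The function names, the toy fingerprints, and the convention that two all-zero vectors have similarity 1 are assumptions made for illustration.

```python
def tanimoto_similarity(x, y):
    """Tanimoto similarity between two equal-length non-negative vectors."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # Assumed convention: two all-zero fingerprints count as identical.
    return 1.0 if denom == 0 else dot / denom


def tanimoto_gram(fingerprints):
    """Pairwise similarity matrix for a list of fingerprints."""
    return [
        [tanimoto_similarity(a, b) for b in fingerprints]
        for a in fingerprints
    ]


# Toy binary fingerprints, purely illustrative
fps = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 1, 0, 0],
]
gram = tanimoto_gram(fps)
```

For binary fingerprints this reduces to the number of shared "on" bits divided by the number of bits set in either vector, so the first two fingerprints above (two shared bits out of four set in total) have similarity 0.5, while fingerprints with no bits in common have similarity 0.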
If GAUCHE is useful for your work, please consider citing the following paper:
@misc{griffiths2022gauche,
title={GAUCHE: A Library for Gaussian Processes in Chemistry},
author={Ryan-Rhys Griffiths and Leo Klarner and Henry B. Moss and Aditya Ravuri and Sang Truong and Bojana Rankovic and Yuanqi Du and Arian Jamasb and Julius Schwartz and Austin Tripp and Gregory Kell and Anthony Bourached and Alex Chan and Jacob Moss and Chengzhi Guo and Alpha A. Lee and Philippe Schwaller and Jian Tang},
year={2022},
eprint={2212.04450},
archivePrefix={arXiv},
primaryClass={physics.chem-ph}
}