pozzo-research-group / HEAD

Supporting code for the "Autonomous retrosynthesis of gold nanoparticles via spectral shape matching" paper. DOI : 10.1039/D2DD00025C

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Bayesian Optimization branch for HEAD

This branch contains code for the Bayesian Optimization part.

We recommend installing the package by following the instructions below.

In a Linux command shell with a anaconda installation of python,

conda create --name shapematching python=3.7
conda activate shapematching
conda install pip
conda install -c conda-forge fdasrsf
git clone -b BO https://github.com/pozzo-research-group/HEAD.git
cd HEAD
pip install -r require.txt
pip install -e .

The above installation looks slightly uncommon but this is a working solutions. Installation of the package fdasrsf_python using pip is broken.

Additionally, to run casestudies for the paper, install geomstats package using the following:

git clone -b shapematching_paper https://github.com/kiranvad/geomstats.git
cd geomstats
pip install .

Example using the Gaussian function simulator can be performed as follows:

Import the required function using the below code snippet:

from head import opentrons
import pandas as pd
import numpy as np
from scipy.spatial import distance
import warnings
warnings.filterwarnings("ignore")

We need a simulator to mimic a robotic experiment. We achieve this using the following:

class Simulator:
    def __init__(self):
        self.domain = np.linspace(-5,5,num=100)
        
    def generate(self, mu, sig):
        scale = 1/(np.sqrt(2*np.pi)*sig)
        return scale*np.exp(-np.power(self.domain - mu, 2.) / (2 * np.power(sig, 2.)))
    
    def process_batch(self, Cb, fname):
        out = []
        for c in Cb:
            out.append(self.generate(*c))
        out = np.asarray(out)
        df = pd.DataFrame(out.T, index=self.domain)
        df.to_excel(fname, engine='openpyxl')
        
        return 
    
    def make_target(self, ct):
        return self.domain, self.generate(*ct)

The simulator simply returns a Gaussian distribution function given a mu and sigma values. We use this to specify a target distribution:

sim = Simulator()
target = np.array([-2,0.5])
xt, yt = sim.make_target(target)

Set up your design space using the lower and upper limits

Cmu = [-5,5]
Csig = [0.1,3.5]
bounds = [Cmu, Csig]

Define a distance metric function

from head import AmplitudePhaseDistance

def APdist(f1,f2):
    da, dp = AmplitudePhaseDistance(f1,f2,xt)
    
    return -(da+dp)

Initiate the optimizer using the following:

optim = opentrons.Optimizer(xt, yt, 
                            bounds, 
                            savedir = '../data',
                            batch_size=4,
                            metric = metric_function
                           )

Perform a random iteration

# random iteration
optim.save()
C0 = np.load('../data/0/new_x.npy')
sim.process_batch(C0, '../data/opentrons/0.xlsx')
optim.update('../data/0.xlsx')
optim.save()
optim.get_current_best()

Perform the BO iterations with a specified budget

for i in range(1,21):
    # iteration i selection
    optim.suggest_next()
    optim.save()
    # simulate iteration i new_x 
    Ci = np.load('../data/%d/new_x.npy'%i)
    sim.process_batch(Ci, '../data/%d.xlsx'%i)
    optim.update('../data/%d.xlsx'%i)
    optim.save()
    optim.get_current_best()

Note that when a robotic experiment is involved, each iteration has to be performed with the robot in the loop thus we would perform the for loop one at a time. In a Jupyter notebook format, we would do this one iteration at a time, keeping the Kernel active and adding one new cell for each iteration below the previous iteration but performing same set of operations. A more neater approach for this is under the works. At any given iteration, the function get_current_best reports what the algorithm thinks is a the best match so far.

About

Supporting code for the "Autonomous retrosynthesis of gold nanoparticles via spectral shape matching" paper. DOI : 10.1039/D2DD00025C

License:MIT License


Languages

Language:Jupyter Notebook 98.6%Language:Python 1.4%