TIF345-Advanced-simulation-and-machine-learning

Repository for projects in the Chalmers course "TIF345 / FYM345 Advanced simulation and machine learning" 2020.

Authors: Sebastian Holmin and Erik Andersson

The code in this repository was used to produce the following four reports:

Cosmological models

In this project, we use the Supernova Cosmology Project (SCP) data to analyze and compare cosmological models. The SCP 2.1 dataset contains detailed measurements of the redshift, z, and the distance modulus, μ, of several supernovae, which we use to perform Bayesian parameter estimation and model comparison.
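
As a rough illustration of the kind of model prediction that can be compared with the measured (z, μ) pairs, the sketch below computes the distance modulus for a flat ΛCDM universe. The choice of model, the default values `h0=70` and `omega_m=0.3`, and the function name `distance_modulus` are assumptions made for illustration; they are not taken from the report.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def distance_modulus(z, h0=70.0, omega_m=0.3, steps=1000):
    """Distance modulus mu(z) for a flat LambdaCDM model (illustrative only)."""
    omega_l = 1.0 - omega_m  # flatness assumption

    def inv_e(zp):
        # 1 / E(z) for matter plus a cosmological constant
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    # Trapezoidal rule for the integral of 1/E from 0 to z
    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    integral += sum(inv_e(i * dz) for i in range(1, steps))
    integral *= dz

    # Luminosity distance in Mpc (h0 is in km/s/Mpc)
    d_l_mpc = (1.0 + z) * (C_KM_S / h0) * integral
    return 5.0 * math.log10(d_l_mpc) + 25.0
```

In a Bayesian fit, a function like this would supply the model value of μ at each observed redshift inside the likelihood.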

Alloy cluster expansions

In this report we investigate parameter selection and estimation in cluster expansions of alloys. To do this we use the icet package, which implements symmetry transformations to expand the mixing energy of alloy structures into

$$E = \sum_\alpha N_\alpha J_\alpha,$$

where $N_\alpha$ is the number of $\alpha$-clusters per atom and $J_\alpha$ is the effective cluster interaction (ECI). The ECIs are the parameters that we seek to estimate from energy data.

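Assuming i.i.d. errors, this can be written in matrix notation as $\boldsymbol{E}=\boldsymbol{X}\boldsymbol{J}+\epsilon$, with $\epsilon\sim\mathcal{N}(0,\sigma^2)$, and thus the likelihood function for $n$ structures is given by

$$P(\boldsymbol{E}\mid\boldsymbol{J},\sigma^2)=\left(2\pi\sigma^2\right)^{-n/2}\exp\left(-\frac{\lVert\boldsymbol{E}-\boldsymbol{X}\boldsymbol{J}\rVert^2}{2\sigma^2}\right).$$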

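The Bayesian and Akaike information criteria are defined at the maximum likelihood, which can be shown to be equivalent, up to an additive constant, to

$$\mathrm{BIC}=n\ln(\mathrm{MSE})+k\ln(n),\qquad \mathrm{AIC}=n\ln(\mathrm{MSE})+2k,$$

where MSE is the mean squared error, $n$ is the number of structures and $k$ is the number of fitted parameters.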
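
The sketch below shows how the two criteria could be evaluated from the residuals of a fit. It assumes the standard Gaussian-error forms AIC = n ln(MSE) + 2k and BIC = n ln(MSE) + k ln(n), with n data points and k parameters. The function name and the numbers are made up for illustration and are not data from the report.

```python
import math


def information_criteria(energies, predictions, n_params):
    """Return (aic, bic) for a least-squares fit with Gaussian errors."""
    n = len(energies)
    mse = sum((e - p) ** 2 for e, p in zip(energies, predictions)) / n
    aic = n * math.log(mse) + 2 * n_params
    bic = n * math.log(mse) + n_params * math.log(n)
    return aic, bic


# Made-up target energies and model predictions, purely for illustration
energies = [0.10, -0.05, 0.02, -0.12, 0.07, 0.00, -0.03, 0.09]
predictions = [0.08, -0.03, 0.03, -0.10, 0.05, 0.01, -0.05, 0.07]
aic, bic = information_criteria(energies, predictions, n_params=3)
```

Adding parameters lowers the MSE but raises the penalty term, so the preferred model is the one with the lowest criterion value. BIC penalizes each parameter more heavily than AIC whenever ln(n) > 2.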

Bayesian Optimization: Searching for the global minima

In this project we investigate the use of Gaussian processes (GPs) to model the potential energy surface (PES) for adding a Au atom to a Au slab, i.e. the difference in average energy per atom between the slab with and without the extra atom. To sample the energy we use the effective medium theory (EMT) calculator provided in the asap3 package. Sampling the energy this way is resource intensive, so GPs are likely a well-suited method for reducing the computational time needed to model the PES.
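
To illustrate the idea of replacing expensive energy evaluations with a GP surrogate, here is a minimal sketch of the GP posterior mean in plain Python. It assumes a zero-mean GP with a squared-exponential kernel and fixed hyperparameters, and it uses a one-dimensional toy function (`toy_energy`, hypothetical) in place of the EMT calculator. It is not the implementation used in the report.

```python
import math


def rbf(x1, x2, length=0.5, variance=1.0):
    """Squared-exponential kernel with fixed, hand-picked hyperparameters."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gp_mean(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean of a zero-mean GP at the point x_new."""
    k = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    weights = solve(k, y_train)  # (K + noise * I)^-1 y
    return sum(w * rbf(x_new, x) for w, x in zip(weights, x_train))


def toy_energy(x):
    # Stand-in for an expensive energy evaluation (not the EMT calculator)
    return math.sin(3 * x) + 0.5 * x


x_train = [0.0, 0.5, 1.0, 1.5, 2.0]
y_train = [toy_energy(x) for x in x_train]
prediction = gp_mean(x_train, y_train, 0.75)
```

Once the five training energies are computed, each further prediction costs only a small linear solve instead of a new energy evaluation. That saving is the motivation for using a surrogate.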

A Galton board on a rocking ship

In this report we investigate the use of the approximate Bayesian computation (ABC) algorithm, supported by a neural network (NN), to reverse engineer the parameters for a toy model of a Galton board on a rocking ship.

A Galton board (bean machine) is a device that produces an approximately normal distribution, as described by the central limit theorem. It consists of rows of pegs where balls can roll a step to the left or the right at each row. Our toy model has 31 rows, giving 32 possible end positions for each ball, and two parameters $\alpha$ and $s$. The parameter $\alpha$ describes a simplified moment of inertia, i.e. the tendency for a ball to continue rolling in the same direction at the next peg, while $s$ describes the incline of the rocking ship that the Galton board is situated on. The probability of a ball rolling to the right is then given by

$$P(\text{right}) = 0.5 + \alpha M + s,$$

where $\alpha\in[0,0.5]$, $s\in[-0.25,0.25]$, and $M=-0.5$ if the ball previously rolled to the left and $M=0.5$ if it rolled to the right.
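
A minimal sketch of such a toy simulator is shown below. It assumes the right-step probability has the form 0.5 + αM + s, which stays within [0, 1] for the stated parameter ranges. It also assumes M = 0 at the first row, which the text does not specify. The function name `simulate_board` is hypothetical.

```python
import random


def simulate_board(alpha, s, n_balls=1000, n_rows=31, rng=None):
    """Return the number of balls ending in each of the n_rows + 1 positions."""
    rng = rng or random.Random()
    counts = [0] * (n_rows + 1)
    for _ball in range(n_balls):
        position = 0  # number of steps taken to the right so far
        m = 0.0       # assumption: no memory term at the first row
        for _row in range(n_rows):
            p_right = 0.5 + alpha * m + s
            if rng.random() < p_right:
                position += 1
                m = 0.5
            else:
                m = -0.5
        counts[position] += 1
    return counts


counts = simulate_board(alpha=0.2, s=0.1, rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when comparing simulated distributions for different parameter values.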

We are faced with the task of determining an unknown (but constant) $\alpha$ from a 'black box' function that simulates the final positions of 1000 balls. This is done for an unknown, randomly chosen, latent variable $s$ . To help us with this task we will implement our own simulator where we can control the parameters and analyse the behaviour.

Phrased in Bayesian language: given a (set of) simulated distributions $y_m$, find the posterior distribution

$$P(\alpha\mid y_m)\propto\pi(\alpha)\,P(y_m\mid\alpha)=\pi(\alpha)\int P(y_m\mid\alpha,s)\,\pi(s)\,\mathrm{d}s,$$

where we have first used the law of total probability to write the likelihood as marginalized over the latent variable $s$, and then Bayes' theorem to write the posterior in terms of the likelihood. As this is a toy model, we will assume that $s$ is drawn uniformly from its allowed interval for every run of the 'black box' simulation, and that the 'true' $\alpha$ was chosen randomly from a uniform distribution. That is, we will assume that the priors $\pi(\alpha)$ and $\pi(s)$ are uniform.
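
The marginalization over $s$ can be sketched with plain rejection ABC: draw $\alpha$ and $s$ from their uniform priors, simulate, and keep $\alpha$ whenever the simulated outcome is close to the observed one. Discarding the drawn $s$ performs the marginalization. The sketch uses a hand-picked summary statistic (mean and standard deviation of the end positions) in place of the neural network. The simulator form 0.5 + αM + s, the choice M = 0 at the first row, and the names `simulate_board`, `summary` and `abc_rejection` are likewise assumptions for illustration.

```python
import math
import random


def simulate_board(alpha, s, rng, n_balls=1000, n_rows=31):
    """Toy simulator: counts of balls per end position (assumed model form)."""
    counts = [0] * (n_rows + 1)
    for _ball in range(n_balls):
        position, m = 0, 0.0  # assumption: M = 0 at the first row
        for _row in range(n_rows):
            if rng.random() < 0.5 + alpha * m + s:
                position += 1
                m = 0.5
            else:
                m = -0.5
        counts[position] += 1
    return counts


def summary(counts):
    """Mean and standard deviation of the end positions."""
    n = sum(counts)
    mean = sum(i * c for i, c in enumerate(counts)) / n
    var = sum(c * (i - mean) ** 2 for i, c in enumerate(counts)) / n
    return mean, math.sqrt(var)


def abc_rejection(observed, n_draws, tolerance, rng):
    """Return the alpha draws whose simulations resemble the observation."""
    obs_mean, obs_std = summary(observed)
    accepted = []
    for _ in range(n_draws):
        alpha = rng.uniform(0.0, 0.5)    # uniform prior pi(alpha)
        s = rng.uniform(-0.25, 0.25)     # uniform prior pi(s)
        sim_mean, sim_std = summary(simulate_board(alpha, s, rng))
        if math.hypot(sim_mean - obs_mean, sim_std - obs_std) < tolerance:
            accepted.append(alpha)
    return accepted


rng = random.Random(1)
# Stand-in for one 'black box' output; the true parameters here are made up
observed = simulate_board(alpha=0.3, s=0.05, rng=rng)
posterior_samples = abc_rejection(observed, n_draws=100, tolerance=2.0, rng=rng)
```

A histogram of `posterior_samples` approximates the posterior of $\alpha$. A smaller tolerance gives a sharper approximation at the cost of fewer accepted samples.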

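The accuracy of the posterior can be increased by combining the outcomes of several experiments. Given a set $y_m^{i}$, for $i=1,\dots,N$, the total posterior is given by

$$P(\alpha\mid y_m^{1},\dots,y_m^{N})\propto\pi(\alpha)\prod_{i=1}^{N}P(y_m^{i}\mid\alpha)\propto\prod_{i=1}^{N}P(\alpha\mid y_m^{i}),$$

where we have used the fact that $\pi(\alpha)$ is uniform to insert the posterior.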
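
With a uniform prior, combining experiments amounts to multiplying the individual posteriors and normalizing. The sketch below does this for posteriors discretized on a shared grid of $\alpha$ bins. The function name and the numbers are made up for illustration, and the code assumes that at least one bin has nonzero probability in every experiment.

```python
import math


def combine_posteriors(posteriors):
    """Multiply per-experiment posteriors on a shared alpha grid and normalize."""
    n_bins = len(posteriors[0])
    # Sum logarithms instead of multiplying to avoid underflow for many experiments
    log_total = [
        sum(math.log(p[j]) if p[j] > 0 else -math.inf for p in posteriors)
        for j in range(n_bins)
    ]
    peak = max(log_total)  # assumed finite, see the note above
    unnormalized = [math.exp(v - peak) for v in log_total]
    norm = sum(unnormalized)
    return [u / norm for u in unnormalized]


# Two made-up discrete posteriors over three alpha bins
combined = combine_posteriors([[0.2, 0.5, 0.3], [0.1, 0.6, 0.3]])
```

The bin that is most probable in both inputs gains weight in the combined result, which is how additional experiments sharpen the estimate of $\alpha$.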
