Serock3 / TIF345-Advanced-simulation-and-machine-learning

Repo for projects in the Chalmers course "TIF345 / FYM345 Advanced simulation and machine learning" 2020. Authors: Sebastian Holmin and Erik Andersson

The code in this repository was used to produce the following four reports:

In this project, we use the Supernova Cosmology Project (SCP) data to analyze and compare cosmological models. The SCP 2.1 dataset contains detailed measurements of the redshift, z, and distance modulus, μ, of several supernovae, which we use to perform Bayesian parameter estimation and model comparison.
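
To make the link between redshift and distance modulus concrete, here is a minimal sketch of a Gaussian log-likelihood for one candidate model. It assumes a flat ΛCDM cosmology with parameters `h0` and `omega_m`; the function names, the default values and the synthetic `toy_data` are illustrative assumptions, not the models or the SCP data used in the report.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, steps=200):
    """Luminosity distance in Mpc for a flat LambdaCDM universe (assumed model)."""
    def inv_e(zp):
        # 1 / E(z) for a flat universe with matter and a cosmological constant
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    # Composite Simpson's rule on [0, z]; `steps` must be even
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    comoving = (C_KM_S / h0) * total * h / 3.0
    return (1.0 + z) * comoving


def distance_modulus(z, h0=70.0, omega_m=0.3):
    """Distance modulus mu = 5 log10(d_L / 10 pc), valid for z > 0."""
    d_l_pc = luminosity_distance_mpc(z, h0, omega_m) * 1.0e6
    return 5.0 * math.log10(d_l_pc / 10.0)


def log_likelihood(data, h0, omega_m):
    """Gaussian log-likelihood for a list of (z, mu, sigma_mu) tuples."""
    total = 0.0
    for z, mu, sigma in data:
        r = (mu - distance_modulus(z, h0, omega_m)) / sigma
        total += -0.5 * r * r - math.log(sigma * math.sqrt(2.0 * math.pi))
    return total


# Synthetic, noise-free data generated from the assumed model (not SCP data)
toy_data = [(z, distance_modulus(z, 70.0, 0.3), 0.15) for z in (0.1, 0.3, 0.5, 0.8, 1.0)]
```

Evaluating `log_likelihood` on a grid of parameter values, or passing it to a sampler, then gives the posterior once a prior is chosen.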

In this report we investigate parameter selection and estimation in cluster expansions of alloys. To do this we use the icet package, which uses symmetry transformations to expand the mixing energy of an alloy structure into

$$E = J_0 + \sum_{\alpha} m_{\alpha} J_{\alpha} \langle \Pi_{\alpha} \rangle$$

where $m_\alpha$ is the number of $\alpha$-clusters per atom, $\langle \Pi_{\alpha} \rangle$ is the cluster function averaged over the symmetry-equivalent clusters of the structure, and $J_\alpha$ is the effective cluster interaction (ECI). The ECIs are the parameters that we seek to estimate from energy data.

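Assuming i.i.d. normally distributed errors, this can be written in matrix notation as $\mathbf{E} = \mathbf{X}\mathbf{J} + \boldsymbol{\epsilon}$, with $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I})$, where $\mathbf{E}$ collects the energies of the $N$ structures, each row of $\mathbf{X}$ is the cluster vector of one structure and $\mathbf{J}$ is the vector of ECIs. The likelihood function is thus given by

$$P(\mathbf{E} \mid \mathbf{J}, \sigma) = (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{\lVert \mathbf{E} - \mathbf{X}\mathbf{J} \rVert^2}{2\sigma^2} \right)$$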

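The Bayesian and Akaike information criteria (BIC and AIC) are defined at the maximum likelihood. For Gaussian errors they can be shown to be equivalent, up to an additive constant that is the same for all models, to

$$\mathrm{AIC} = N \ln(\mathrm{MSE}) + 2k, \qquad \mathrm{BIC} = N \ln(\mathrm{MSE}) + k \ln N$$

where MSE is the mean squared error, $N$ is the number of data points and $k$ is the number of parameters.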
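
The relation between the maximum likelihood and the MSE can be checked numerically. The sketch below assumes i.i.d. Gaussian errors whose variance is set to its maximum-likelihood value (the MSE); the function names and the residual values are hypothetical and only serve as an illustration.

```python
import math


def gaussian_max_log_likelihood(residuals):
    """Log-likelihood of i.i.d. Gaussian errors, evaluated at sigma^2 = MSE."""
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * mse) + 1.0)


def aic(residuals, n_params):
    """Akaike information criterion: 2k - 2 ln(L_max)."""
    return 2.0 * n_params - 2.0 * gaussian_max_log_likelihood(residuals)


def bic(residuals, n_params):
    """Bayesian information criterion: k ln(N) - 2 ln(L_max)."""
    n = len(residuals)
    return n_params * math.log(n) - 2.0 * gaussian_max_log_likelihood(residuals)


def aic_from_mse(residuals, n_params):
    """MSE form of the AIC, without the model-independent constant."""
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    return n * math.log(mse) + 2.0 * n_params


# Hypothetical residuals (model prediction minus reference energy)
example_residuals = [0.012, -0.008, 0.003, -0.015, 0.009, 0.001]
# The two forms differ only by N * (ln(2 pi) + 1), which does not depend on the model
constant = aic(example_residuals, 3) - aic_from_mse(example_residuals, 3)
```

Because the constant is identical for every model fitted to the same data, ranking models by either form gives the same result.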

In this project we investigate the use of Gaussian processes (GPs) to model the potential energy surface (PES) for adding a Au atom to a Au slab, i.e. the difference in average energy per atom between the slab with and without the extra atom. To sample the energy we use an effective medium theory (EMT) calculator provided in the asap3 package. Sampling the energy this way is resource intensive, so GPs are likely a well-suited method for reducing the computational time needed to model the PES.
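
To illustrate how a GP replaces expensive energy evaluations with a cheap surrogate, here is a minimal one-dimensional sketch of GP regression with a squared-exponential kernel. It is not the model used in the report: the kernel choice, the hyperparameters and the cosine "energy" standing in for EMT samples are all assumptions, and a real PES would depend on more than one coordinate.

```python
import math


def rbf_kernel(x1, x2, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_fit(xs, ys, noise=1e-8):
    """Return the weights K^{-1} y, with a small noise term on the diagonal."""
    k = [[rbf_kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    return solve_cholesky(cholesky(k), ys)


def gp_predict(x_new, xs, weights):
    """Posterior mean of the GP at a new input."""
    return sum(rbf_kernel(x_new, x) * w for x, w in zip(xs, weights))


# Toy stand-in for sampled energies: a periodic corrugation along one coordinate
train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
train_y = [0.1 * math.cos(2.0 * math.pi * x) for x in train_x]
weights = gp_fit(train_x, train_y)
predicted = gp_predict(0.6, train_x, weights)
```

Once the weights are computed, each prediction costs only one kernel evaluation per training point, which is where the saving over repeated calculator calls comes from.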

In this report we investigate the use of the approximate Bayesian computation (ABC) algorithm, supported by a neural network (NN), to reverse engineer the parameters for a toy model of a Galton board on a rocking ship.

A Galton board (bean machine) is a device that produces an approximately normal distribution, as described by the central limit theorem. It consists of rows of pegs where balls can roll a step to the left or the right at each row. Our toy model has 31 rows, giving 32 possible end positions for each ball, and two parameters formula and formula. The parameter formula describes a simplified moment of inertia, i.e. the tendency for a ball to keep rolling in the same direction at the next peg, while formula describes the incline of the rocking ship that the Galton board is situated on. The probability of a ball rolling to the right is then given by

image

where formula and formula and formula if the ball previously rolled to the left and formula if it rolled to the right.

We are faced with the task of determining an unknown (but constant) formula from a 'black box' function that simulates the final positions of 1000 balls. This is done for an unknown, randomly chosen, latent variable formula. To help us with this task we will implement our own simulator where we can control the parameters and analyse the behaviour.
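
A simulator of this kind can be sketched in a few lines. The probability rule in `p_right`, the parameter names `alpha` (inertia) and `s` (incline), and the handling of the first row are assumptions made for this example; they may differ from the expression used in the report.

```python
import random

N_ROWS = 31  # 31 rows of pegs give 32 possible end positions


def p_right(alpha, s, previous_step):
    """Hypothetical rule for the probability of rolling right.

    previous_step is -1 (left), +1 (right) or 0 (first row, no history).
    """
    p = 0.5 + 0.5 * alpha * previous_step + s
    return min(1.0, max(0.0, p))  # keep the value a valid probability


def drop_ball(alpha, s, rng):
    """Return the end position (number of right steps) of a single ball."""
    position = 0
    previous = 0
    for _ in range(N_ROWS):
        if rng.random() < p_right(alpha, s, previous):
            position += 1
            previous = 1
        else:
            previous = -1
    return position


def simulate_board(alpha, s, n_balls=1000, seed=None):
    """Histogram of end positions for n_balls balls."""
    rng = random.Random(seed)
    counts = [0] * (N_ROWS + 1)
    for _ in range(n_balls):
        counts[drop_ball(alpha, s, rng)] += 1
    return counts


def mean_position(counts):
    """Average end position of a histogram."""
    return sum(i * c for i, c in enumerate(counts)) / sum(counts)
```

Calling `simulate_board` with fixed parameters and different seeds shows how much the histogram varies between runs, which is useful when choosing summary statistics.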

Phrased in Bayesian language: given a (set of) simulated distributions formula, find the posterior distribution

image

where we have first used the law of total probability to write the likelihood as marginalized over the latent variable formula and then Bayes' theorem to write the posterior in terms of the likelihood. As this is a toy model, we will assume that formula is drawn uniformly in its allowed interval for every run of the 'black box' simulation, and that the 'true' formula was chosen randomly from a uniform distribution. That is, we will assume that the priors formula and formula are uniform.
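
A plain ABC rejection sampler shows how this marginalization happens in practice: the latent variable is drawn from its prior in every simulated run and then discarded, so only the parameter of interest is kept. The sketch below is a simplification of the approach in the report. It uses hand-picked summary statistics (mean and standard deviation) instead of a neural network, and the probability rule, the names `alpha` and `s`, and the prior ranges are assumptions.

```python
import math
import random


def simulate(alpha, s, rng, n_balls=100, n_rows=31):
    """End positions of n_balls balls under a hypothetical rolling rule."""
    positions = []
    for _ in range(n_balls):
        position, previous = 0, 0
        for _ in range(n_rows):
            p = min(1.0, max(0.0, 0.5 + 0.5 * alpha * previous + s))
            if rng.random() < p:
                position, previous = position + 1, 1
            else:
                previous = -1
        positions.append(position)
    return positions


def summary(positions):
    """Summary statistics: mean and standard deviation of the end positions."""
    n = len(positions)
    mean = sum(positions) / n
    var = sum((p - mean) ** 2 for p in positions) / n
    return mean, math.sqrt(var)


def abc_rejection(observed, n_draws, tolerance, rng):
    """Return the alpha values whose simulated summaries lie close to the data."""
    obs_mean, obs_std = summary(observed)
    accepted = []
    for _ in range(n_draws):
        alpha = rng.uniform(0.0, 0.5)   # assumed uniform prior range
        s = rng.uniform(-0.25, 0.25)    # assumed uniform prior range, latent
        sim_mean, sim_std = summary(simulate(alpha, s, rng))
        distance = math.hypot(sim_mean - obs_mean, sim_std - obs_std)
        if distance < tolerance:
            accepted.append(alpha)      # s is dropped, i.e. marginalized out
    return accepted


demo_rng = random.Random(0)
demo_observed = simulate(0.3, 0.1, demo_rng)
demo_samples = abc_rejection(demo_observed, n_draws=200, tolerance=1.0, rng=demo_rng)
```

The accepted values approximate samples from the posterior; a smaller tolerance gives a better approximation at the cost of fewer accepted draws.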

The accuracy of the posterior can be increased by combining the outcomes of several experiments. Given a set formula, for formula, the total posterior is given by

image

where we have used the fact that formula is uniform to insert the posterior.
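
With a uniform prior, combining experiments amounts to multiplying the individual posteriors and renormalizing. A minimal sketch on a parameter grid is given below; the grid, the function name and the Gaussian-shaped example posteriors are hypothetical and only illustrate the bookkeeping.

```python
import math


def combine_posteriors(grid, posteriors):
    """Multiply posteriors evaluated on a uniform grid and normalize the result.

    Assumes a uniform prior, so the product of the individual posteriors is
    proportional to the total posterior. The product is formed in log space
    to avoid numerical underflow when many experiments are combined.
    """
    log_total = [0.0] * len(grid)
    for post in posteriors:
        for i, p in enumerate(post):
            log_total[i] += math.log(p) if p > 0.0 else -math.inf
    peak = max(log_total)
    if peak == -math.inf:
        raise ValueError("the posteriors have no common support on the grid")
    unnormalized = [math.exp(v - peak) for v in log_total]
    step = grid[1] - grid[0]
    norm = sum(unnormalized) * step
    return [u / norm for u in unnormalized]


# Hypothetical example: two experiments favouring slightly different values
example_grid = [0.01 * i for i in range(51)]
post_a = [math.exp(-0.5 * ((x - 0.20) / 0.05) ** 2) for x in example_grid]
post_b = [math.exp(-0.5 * ((x - 0.30) / 0.05) ** 2) for x in example_grid]
combined = combine_posteriors(example_grid, [post_a, post_b])
```

The individual posteriors do not need to be normalized beforehand, since any constant factor cancels in the final normalization.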

Languages: Jupyter Notebook 99.1%, Python 0.9%