MultiRun v 0.0.14

A Matlab tool for facilitating parameter calibration of DETIM/DEBaM. Given a valid input.txt, MultiRun takes arbitrary ranges for an arbitrary number of model parameters and executes DEBaM or DETIM for every combination of these parameter values. Model output is directed into folders with unique alphanumeric names, so that output data is not overwritten and lengthy model runs do not need to be repeated. The performance of each run is tabulated in a single output file, multi_performance.txt, which includes the content of the modelperformance.txt generated by DETIM/DEBaM plus additional statistics (such as the correlation between measured and modeled point balances). The user can then analyze multi_performance.txt to extract the best parameter configuration(s) for the studied case.

As of version 0.0.8, MultiRun is compatible with versions 2.x.x of DEBaM and DETIM, and incompatible with earlier versions. Version 0.0.7, which is compatible with earlier versions of the models, is still available.

The tool was developed by Lyman Gillispie and Regine Hock, and coded by Lyman Gillispie. The tool was first released in fall 2013.

What you need:

- Matlab and compiled versions of DEBaM or DETIM. If you are on Windows and use Cygwin to compile and run the models, see the caveat for Windows users below.
- A valid input.txt file for the models. In this file, set all of the parameters you would like the model to use, except for those MultiRun will change for each model run.
- Input data for the models. See the model documentation for setting these up.

Installation

Download and compile the latest version of DEBaM and DETIM. MultiRun supports versions 2.x.x of the models. If you already have a copy on your computer, there is no need to re-download the models; just use the copy you currently have.
Download MultiRun, either with git, or download the zipball.
Important: Move the folder +MultiRun into your Matlab Path; it's not enough to navigate to the containing folder with Matlab, as the script changes the working directory. Mathworks has helpful documentation about the Matlab Path and how to change it.
Check that Matlab can find MultiRun by typing:
```
which MultiRun.modelMultiRun
```
in Matlab. This should return the filename of your installed copy of MultiRun. If Matlab instead tells you something like "'MultiRun.modelMultiRun' not found.", double check that +MultiRun was placed into the Matlab path.
You're done! MultiRun is now available to Matlab, and you can access its functions and classes via MultiRun.<function/classname>.
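As a rough sketch of the path setup described above: Matlab finds a package folder such as +MultiRun when the folder that contains it is on the path. The location used here is a hypothetical placeholder, not a path taken from MultiRun itself.

```matlab
% Hypothetical location: the folder that CONTAINS +MultiRun (not +MultiRun itself).
multirunParent = '/home/luser/matlab_MultiRun';

addpath(multirunParent);   % make the package visible for this session
% savepath;                % optionally persist the change across sessions

% With an output argument, which returns an empty char array if nothing is found.
location = which('MultiRun.modelMultiRun');
if isempty(location)
    warning('MultiRun not found; check that %s contains +MultiRun.', multirunParent);
else
    fprintf('MultiRun found at %s\n', location);
end
```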

A Caveat for Windows Users

Windows users who have compiled the model with Cygwin will need to copy the file cygwin1.dll into the same directory as the model executables (usually meltmodel\bin, or wherever you have placed detim.exe and debam.exe). Typically, Cygwin installs cygwin1.dll to C:\Cygwin\bin\cygwin1.dll, but depending on your Cygwin installation, it may be located elsewhere (C:\Program Files\Cygwin\bin, for instance). Copying this file to the same folder as debam/detim is sufficient for most applications.

Using MultiRun to Execute DEBaM/DETIM

Configure an input.txt parameter file for DEBaM/DETIM. Set every parameter to the value you want it to have when the model is run, except for the parameters you will change using MultiRun; these can be set to anything.
Open Matlab.
Run the MultiRun.modelMultiRun command:
```
[hashes, status, err, changes, runs] = MultiRun.modelMultiRun(<model filename>, <base input.txt>,  <varargin>)
```
Where you set:
- <model filename> to the full path and file name of the model executable (debam or detim) on your machine.
- <base input.txt> to the full path and file name of the base input.txt. The filename is arbitrary.
- <varargin> to the parameters you want changed in your base input.txt. The syntax here is similar to that of Matlab's plot function: you enter a list of key-value pairs for the parameters you wish to change. For example, if we want icekons to take on all of the values between 5 and 6, at intervals of 0.1, and rockkons to take on every value between 0 and 0.5, at intervals of 0.025, we would call
```
[hashes, status, err, changes, runs] = MultiRun.modelMultiRun(<model path+filename>, <base input.txt>, 'icekons', [5:0.1:6], 'rockkons', [0:0.025:0.5])
```
For better readability you may use variables to specify the model and configuration file, for example:
```
ModelName = '/github/source/meltmodel/bin/detim';
InputName = '/github/source/meltmodel/example/input.txt';

[hashes, status, err, changes, runs] = MultiRun.modelMultiRun(ModelName, InputName, 'icekons', [5:0.5:6], 'firnkons', [360:
```

Note that the variable names ```ModelName``` and ```InputName``` are arbitrary, and that you need to adjust the paths to your case.
The example includes two parameters to be calibrated (i.e. altered between model runs), but you can include as many as you like at the
same time. Just add more key-value pairs, including their ranges, to the command above using the same syntax.
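Because the model is executed once per combination of parameter values, the number of runs grows quickly as parameters are added. The sketch below is independent of MultiRun itself. It counts and enumerates the combinations for the icekons/rockkons ranges used in the earlier call, which can help estimate run time before starting.

```matlab
% Parameter ranges as they would be passed to MultiRun.modelMultiRun.
icekons  = 5:0.1:6;        % 11 values
rockkons = 0:0.025:0.5;    % 21 values

% One model run per combination of values.
nRuns = numel(icekons) * numel(rockkons);
fprintf('This call would start %d model runs.\n', nRuns);

% ndgrid enumerates the combinations explicitly, one row per run.
[I, R] = ndgrid(icekons, rockkons);
combos = [I(:), R(:)];     % column 1: icekons, column 2: rockkons
```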

### Output
Output of each model run is located in a subdirectory of the ```outpath```
  specified in the base ```input.txt```. You will find new subdirectories
  in ```outpath``` with long alpha-numeric names, e.g.

```
├── a729f3b8529edf74e2d57cf64ba8cc91fc64907e
│   └── model_output
├── cdad489ead2ec45681e9a5ec91516623c88433ce
│   └── model_output
├── d5c3a35650dcde462cc5eaea301cfbb62cd050d9
│   └── model_output
└── fbf399de8a6ac388aad1647ad918d37f9a3d81f1
    └── model_output
```

Each directory contains:

* The ```input.txt``` for *this* run of the model, with parameters changed from the base parameter file.
* A ```changes.txt``` file which describes the changes made to the original ```input.txt```.
* A directory ```model_output```, containing the output from this run of the model. This is ```outpath``` in ```input.txt``` for this model run.
* A file ```runstatus.lock```. This assists ```MultiRun``` to prevent re-running this parameter configuration more than once. If you need to re-run this configuration, delete ```runstatus.lock``` before doing so.
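To re-run a single configuration, its lockfile can be removed from within Matlab. This is a minimal sketch. The outpath and the hash are hypothetical placeholders that you would replace with your own values.

```matlab
% Hypothetical values: the outpath from the base input.txt and the hash of the run to repeat.
outpath = '/home/luser/mytest/output';
runHash = 'a729f3b8529edf74e2d57cf64ba8cc91fc64907e';

lockFile = fullfile(outpath, runHash, 'runstatus.lock');
if exist(lockFile, 'file') == 2
    delete(lockFile);   % this configuration is no longer marked as already run
end
```

After the lockfile is gone, calling modelMultiRun again with the same parameters should execute this configuration again.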

Model Performance

In addition, the outpath folder specified in your base input.txt will contain a tab-separated file multi_performance.txt. This file contains a summary of model performance statistics and the changes made to the base input.txt. Additionally, the SHA-1 hash, which determines the output folder for that specific run, is listed so that the output of any particular model run can be easily found.

multi_performance.txt can then be used to extract the best parameter configuration(s) based on the multi-criteria performance statistics. The user should try to maximize r2-values for discharge and/or point balances, and minimize the discharge difference and/or the point balance RMSE. In addition, modeled total balance is given and can be compared with measured total balance over the entire simulation period (e.g. from geodetic methods), if available. The final choice of parameter configurations may be subjective in case of ambiguous model performance for different criteria.

The performance variables are (order of columns in file):

massbal_r2: coefficient of determination between measured and modeled point mass balances
massbal_rmse: root mean square error between measured and modeled point mass balances
Q_R2: Nash-Sutcliffe coefficient indicating agreement between measured and modeled discharge (-infinity to 1)
Q_lnR2: same as Q_R2, but using logarithmic discharge (to evaluate agreement during low-flow conditions)
Qvolumesim: Total simulated discharge volume in 100,000 m3 over simulation period
Qvolumemeas: Total measured discharge volume in 100,000 m3 over simulation period (only time steps with valid data)
nsteps: number of time steps simulated
nstepsdis: number of time steps of nsteps that contain valid discharge data.
totalglacierwidemassbalance(m): total mass balance over the entire glacier over the entire simulation period.

Note that if any of the variables are not available (for example, if you don't have discharge data), MultiRun will fill those columns with -9999, i.e. the order of variables in the output file remains constant regardless of which observations are available.

The columns which follow give the values of the parameters changed by MultiRun.
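One possible way to rank runs in Matlab is sketched below. To stay self-contained it first writes a small demo file with made-up numbers, standing in for a real multi_performance.txt. The header names are assumed to match the variable names listed above, and a real file has more columns than shown here.

```matlab
% Made-up demo rows standing in for multi_performance.txt (illustration only).
% Assumption: the header names match the variable names listed above.
demoFile = fullfile(tempdir, 'multi_performance_demo.txt');
fid = fopen(demoFile, 'w');
fprintf(fid, 'massbal_r2\tmassbal_rmse\tQ_R2\ticekons\n');
fprintf(fid, '0.80\t0.45\t-9999\t5.0\n');
fprintf(fid, '0.91\t0.30\t-9999\t5.5\n');
fprintf(fid, '0.85\t0.38\t-9999\t6.0\n');
fclose(fid);

perf = readtable(demoFile, 'Delimiter', '\t');

% -9999 marks unavailable statistics; turn it into NaN so it cannot skew a ranking.
perf = standardizeMissing(perf, -9999);

% Rank by point-balance RMSE (smaller is better) and pick the first row.
ranked = sortrows(perf, 'massbal_rmse', 'ascend');
best = ranked(1, :);
fprintf('Lowest massbal_rmse: %.2f at icekons = %.1f\n', best.massbal_rmse, best.icekons);
```

For a real calibration, point readtable at the multi_performance.txt in your outpath and combine several criteria rather than ranking by a single column.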

In Matlab, modelMultiRun has returned quite a bit of additional information to you which you can use to further manipulate your finished runs. These are outlined in the API below.

Have Fun!

API (Available Functions/Objects and How to Use Them)

MultiRun can be used programmatically in your Matlab scripts. The API below describes the available functions and classes.

function modelMultiRun

To run the model multiple times the function modelMultiRun is available.

[hashes, status, err, changes, runs] = MultiRun.modelMultiRun(modelpath, basefile, varargin)

Arguments:

modelpath - fully qualified path to the executable of the model you want to run (debam/detim).
basefile - fully qualified path to a valid parameter file for the model. This will be modified based upon the list of key-value pairs passed to varargin.
varargin - a list of key-value pairs which are modified, e.g. modelMultiRun('debam', 'input.txt', 'icekons', [5:0.1:6]) will run the model with icekons set to each value in [5:0.1:6].

Returns:

hashes - Cell array of hashes of each run (yields the name of a model run's output folder)
status - Array of return statuses; status(i) is the return status of the run with hash hashes{i}
err - Error messages associated with incomplete runs
changes - array of changes made to input.txt
runs - a containers.Map, indexed by hash, of HashedRun objects, each corresponding to a single model run.

E.g.

Let base_input.txt be a valid parameter file for DEBaM/DETIM (aside from the file name), and let a copy of detim be located at /home/luser/local/bin/detim. In Matlab, we run:

basefile = '/home/luser/base_input.txt';
modelpath = '/home/luser/local/bin/detim';
[hashes, status, err, changes, runs] = MultiRun.modelMultiRun(modelpath, basefile, 'icekons', [5, 6.0], 'firnkons', [350, 351]);

at the command line. MultiRun will take the parameter file from /home/luser/base_input.txt and generate input.txt files which contain every combination of the parameters icekons and firnkons, given the values you've assigned them (in this case there are four combinations). Each generated file is placed in a directory whose name is the SHA-1 hash of the modified parameter file, and the model's output is directed to that folder as well. If outpath is set to /home/luser/mytest/output in base_input.txt, this model run will result in a directory structure which looks like:

mytest/
└── output
    ├── a729f3b8529edf74e2d57cf64ba8cc91fc64907e
    │   └── model_output
    ├── cdad489ead2ec45681e9a5ec91516623c88433ce
    │   └── model_output
    ├── d5c3a35650dcde462cc5eaea301cfbb62cd050d9
    │   └── model_output
    └── fbf399de8a6ac388aad1647ad918d37f9a3d81f1
        └── model_output

After generating the parameter file, Matlab changes its current working directory to the folder containing the parameter file, and executes the command in modelpath (in our example detim). Model output is put into the folder <hash>/model_output.

modelMultiRun also writes a file changes.txt which lists the changes made to the original parameter file for that run, e.g.:

$ cd mytest/output/a729f3b8529edf74e2d57cf64ba8cc91fc64907e/
$ ls
changes.txt     input.txt      outpath        runstatus.lock
$ cat changes.txt
Base parameter file: /home/luser/base_input.txt
New parameter file: /home/luser/mytest/output/a729f3b8529edf74e2d57cf64ba8cc91fc64907e/input.txt
Changes made:
  icekons = 6
  firnkons = 350

What happens:

For each run, MultiRun uses the helper class MultiRun.HashedRun to generate a nearly-unique alpha-numeric identifier for that particular model run. This is done by:

Generating an input.txt file from the passed parameters,
Taking the SHA-1 hash of the text in this file
Modifying the outpath parameter of the containers.Map containing the configuration to <original-outpath>/<SHA hash>/outpath/.
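The hashing step can be illustrated with Java's MessageDigest, which is reachable from Matlab through its bundled JVM. This is a sketch of how a SHA-1 of some text can be computed. It is not necessarily how HashedRun does it internally, and the input text here is a placeholder rather than a real parameter file.

```matlab
% Placeholder standing in for the text of a generated input.txt.
configText = 'abc';

% Compute the SHA-1 digest with Java's MessageDigest.
md = java.security.MessageDigest.getInstance('SHA-1');
md.update(uint8(configText));
digest = typecast(md.digest(), 'uint8');   % 20 bytes

% Convert the bytes to a 40-character lowercase hex string.
runHash = lower(reshape(dec2hex(double(digest), 2)', 1, []));
disp(runHash)   % a9993e364706816aba3e25717850c26c9cd0d89d for 'abc'
```

Any change to the text, however small, yields a different hash, which is why each parameter combination ends up in its own folder.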

modelMultiRun returns several cell arrays; entries with the same index correspond to the same model run:

hashes: the SHA-1 hash of the input.txt parameter files
status: Returned status of the model run.
err: String detailing any errors encountered; is empty if none occur
changes: Strings listing any changes made to base_input for this run
runs: a containers.Map object, whose keys are the hashes of each run, and whose values are the HashedRun objects of each run.
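Looping over the returned map could look roughly like the following. Since a real HashedRun object requires MultiRun and a model executable, this sketch uses plain structs as stand-ins. The field names follow the properties documented for HashedRun, and the paths are hypothetical.

```matlab
% Stand-in for the returned map: keys are run hashes, values mimic HashedRun
% objects with plain structs (an assumption made for illustration only).
runs = containers.Map();
h1 = 'a729f3b8529edf74e2d57cf64ba8cc91fc64907e';
h2 = 'cdad489ead2ec45681e9a5ec91516623c88433ce';
runs(h1) = struct('hash', h1, 'outPath', ['/home/luser/mytest/output/' h1 '/model_output']);
runs(h2) = struct('hash', h2, 'outPath', ['/home/luser/mytest/output/' h2 '/model_output']);

runHashes = keys(runs);                  % cell array of all keys
for i = 1:numel(runHashes)
    thisRun = runs(runHashes{i});        % look up one run by its hash
    fprintf('%s -> %s\n', thisRun.hash, thisRun.outPath);
end
```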

HashedRun checks whether there is already a parameter file in the new output path. This is done by checking a lockfile, runstatus.lock, the contents of which tell whether this parameter file has been run, is currently running, or is waiting to be run.
If no lockfile is found, the new configuration file is written to disk at <original-outpath>/<SHA hash>/input.txt.
If a lockfile is found, its status is returned and an error is posted.
The model is run as a system subprocess. If any errors occur, status(i) is set accordingly and an error message is returned.
In particular, if every entry in status is 1, we know that every model run exited successfully.
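A quick check of the returned statuses might be written as below. The values are stand-ins shaped like the documented return values. Because the text describes status both as an array and as a cell array, the sketch accepts either form.

```matlab
% Stand-in values shaped like modelMultiRun's return values (hypothetical).
hashes = {'a729f3b8529edf74e2d57cf64ba8cc91fc64907e', ...
          'cdad489ead2ec45681e9a5ec91516623c88433ce'};
status = {1, 0};
err    = {'', 'model exited with an error'};

% Accept status as either a numeric array or a cell array of numbers.
if iscell(status)
    status = cell2mat(status);
end

failed = find(status ~= 1);
if isempty(failed)
    disp('All runs exited successfully.');
else
    for i = failed(:)'
        fprintf('Run %s did not report status 1: %s\n', hashes{i}, err{i});
    end
end
```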

Each run is managed by a helper class called HashedRun. The HashedRun class manages the run; building a directory structure for that run alone, and ensuring that each configuration is only run once.

class HashedRun

Each run is managed by a HashedRun object, which creates the appropriate directories and checks whether the run has been completed previously.

Methods:

hr = MultiRun.HashedRun(config, model)
Object constructor. Arguments:
- config: The text of a valid model 'input.txt'
- model: the fully-qualified path for the model executable

[success, err] = genConfig(self)
Generate this run's input.txt, and write it to disk. Returns:
- success: success code, one of:
  - 0 : something has gone wrong
  - 1 : parameter file has been generated
  - 2 : parameter file already existed
- err: error message; if empty, everything is fine

[success, err] = runModel(self)
Execute the model, checking to make sure that the model hasn't run already. Returns:
- success: completion code, one of:
  - 0 : an error has occurred
  - 1 : the model has been run successfully
  - 2 : the lockfile indicates the model has already run
- err: error message.
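The completion codes of runModel lend themselves to a switch statement. The sketch below uses stand-in values in place of an actual call such as [success, err] = hr.runModel(), and the message wording is illustrative only.

```matlab
% Stand-in return values, in place of an actual [success, err] = hr.runModel() call.
success = 2;
err = '';

switch success
    case 0
        msg = sprintf('Run failed: %s', err);
    case 1
        msg = 'Model ran successfully.';
    case 2
        msg = 'Skipped: the lockfile indicates this configuration has already run.';
    otherwise
        msg = sprintf('Unexpected status code %d.', success);
end
disp(msg)
```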

Properties:

configMap: Map container containing info for the model run
hash: SHA-1 hash of input.txt corresponding to configMap
model: Fully qualified path to model executable
outPath: Path where the model will write its output

License

BSD
