MultiRun v 0.0.14
A Matlab tool for facilitating parameter calibration of DETIM/DEBaM.
Given a valid input.txt, MultiRun can take arbitrary ranges of
arbitrary numbers of model parameters, and then execute DEBaM or DETIM
for every combination of these parameters. Model output is directed into
folders with unique alphanumeric names to ensure that output data is not
overwritten, and that lengthy model runs do not need to be repeated. The
performance of each run is tabulated in a single output file,
multi_performance.txt, which includes the content of the
modelperformance.txt generated by DETIM/DEBaM plus additional parameters
(such as the correlation between measured and modeled point balances).
multi_performance.txt can then be analyzed by the user to extract the best
parameter configuration(s) for the studied case.
As of version 0.0.8, MultiRun is compatible with versions 2.x.x of DEBaM and DETIM, and incompatible with earlier versions. Version 0.0.7, which is compatible with earlier versions of the models, is still available here.
The tool was developed by Lyman Gillispie and Regine Hock, and coded by Lyman Gillispie. The tool was first released in fall 2013.
What you need:
- Matlab and compiled versions of DEBaM or DETIM. If you are on Windows and are using Cygwin to compile and run the models, see the caveat for Windows users below.
- A valid input.txt file for the models. In this file, you should set all of the parameters you would like the model to use, with the exception of those MultiRun will change for each model run.
- Input data for the models. See the model documentation for setting these up.
Installation
- Download and compile the latest version of DEBaM and DETIM. MultiRun supports versions 2.x.x of the models. If you already have a copy on your computer, there is no need to re-download the model; just use the copy you currently have.
- Download MultiRun, either with git, or download the zipball.
- Important: Move the folder +MultiRun into your Matlab Path; it's not enough to navigate to the containing folder with Matlab, as the script changes the working directory. Mathworks has helpful documentation about the Matlab Path and how to change it.
- Check that Matlab can find MultiRun by typing:
  which MultiRun.modelMultiRun
  in Matlab. This should return the filename of your installed copy of MultiRun. If Matlab tells you something like "'MultiRun.modelMultiRun' not found.", double check that +MultiRun was placed into the Matlab path.
- You're done! MultiRun is now available to Matlab, and you can access its functions and classes via MultiRun.<function/classname>.
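The path setup and check from the steps above can be sketched as follows. This is a minimal sketch, assuming MultiRun was unpacked so that +MultiRun sits inside a folder called installDir here; that location is hypothetical, so substitute your own. For Matlab package folders, it is the folder containing +MultiRun that goes on the path, not +MultiRun itself.

```matlab
% Hypothetical folder that CONTAINS the +MultiRun package folder.
installDir = fullfile(getenv('HOME'), 'matlab', 'multirun');

% Add the containing folder to the path (for this session only).
if exist(installDir, 'dir') == 7
    addpath(installDir);
end

% which returns an empty result when the function cannot be found.
location = which('MultiRun.modelMultiRun');
isInstalled = ~isempty(location);

if isInstalled
    fprintf('MultiRun found at %s\n', location);
else
    fprintf('MultiRun not found; check that %s contains +MultiRun.\n', installDir);
end
```

A path added with addpath lasts only for the current session; calling savepath afterwards keeps it for future sessions.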
A Caveat for Windows Users
Windows users who have compiled the model with Cygwin will need
to copy the file cygwin1.dll into the same directory as the
model executables (usually meltmodel\bin, or wherever you
have placed detim.exe and debam.exe). Typically, Cygwin
installs cygwin1.dll to C:\Cygwin\bin\cygwin1.dll, but depending
on your Cygwin installation, it may be located elsewhere
(C:\Program Files\Cygwin\bin, for instance). Copying this file to the
same folder as debam/detim is sufficient for most applications.
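If you prefer to do the copy from within Matlab, something like the following may work. It is only a sketch: the DLL location is the typical one mentioned above, and modelBinDir is a hypothetical folder standing in for wherever your detim.exe and debam.exe live.

```matlab
% Typical Cygwin location of the DLL, as mentioned in the text above.
dllSource = 'C:\Cygwin\bin\cygwin1.dll';
% Hypothetical folder holding detim.exe and debam.exe; adjust to your setup.
modelBinDir = 'C:\meltmodel\bin';

copied = false;
if ispc && exist(dllSource, 'file') == 2 && exist(modelBinDir, 'dir') == 7
    % copyfile returns a logical status and a message describing any failure.
    [copied, msg] = copyfile(dllSource, modelBinDir);
    if ~copied
        fprintf('Copy failed: %s\n', msg);
    end
else
    fprintf('Skipped: not on Windows, or one of the paths does not exist.\n');
end
```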
Using MultiRun to Execute DEBaM/DETIM
- Configure an input.txt parameter file for DEBaM/DETIM. Set all of the parameters to the values you want them to have when the model is run, except for the values you will change using MultiRun; these can be set to anything.
- Open Matlab.
- Run the MultiRun.modelMultiRun command:
  [hashes, status, err, changes, runs] = MultiRun.modelMultiRun(<model filename>, <base input.txt>, <varargin>)
  Where you set:
  - <model filename> to the full path and file name of the executable of the model (debam or detim) on your machine.
  - <base input.txt> to the full path and file name of the base input.txt. The filename is arbitrary.
  - <varargin> to the parameters you want changed in your base input.txt. The syntax here is similar to that of Matlab's plot function; you enter a list of key-value pairs for the parameters you wish to change. For example, if we want icekons to take on all of the values between 5 and 6, at intervals of 0.1, and rockkons to take on every value between 0 and 0.5, at intervals of 0.025, we would call
  [hashes, status, err, changes, runs] = MultiRun.modelMultiRun(<model path+filename>, <base input.txt>, 'icekons', [5:0.1:6], 'rockkons', [0:0.025:0.5])
For better readability you may use variables to specify the model and configuration file, for example:
ModelName = '/github/source/meltmodel/bin/detim';
InputName = '/github/source/meltmodel/example/input.txt';
[hashes, status, err, changes, runs] = MultiRun.modelMultiRun(ModelName, InputName, 'icekons', [5:0.5:6], 'firnkons', [360:
Note that the variable names ```ModelName``` and ```InputName``` are arbitrary, and you need to adjust the paths to your case.
The example includes 2 parameters to be calibrated (i.e. to be altered in the model runs), but you can include as many as you like at the
same time. Just add more key-value pairs to the command above, giving their ranges with the same syntax.
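Because the model is executed once per combination, the number of runs grows quickly with each added parameter. The sketch below counts and enumerates the combinations for the icekons/rockkons ranges used earlier. It uses ndgrid purely for illustration; how MultiRun builds its combinations internally is not shown here.

```matlab
% Ranges from the icekons/rockkons example above.
icekons  = 5:0.1:6;
rockkons = 0:0.025:0.5;

% One model run per combination, so the run count is the product
% of the range lengths.
nRuns = numel(icekons) * numel(rockkons);

% Enumerate the combinations: each row of combos is one
% (icekons, rockkons) pair.
[iceGrid, rockGrid] = ndgrid(icekons, rockkons);
combos = [iceGrid(:), rockGrid(:)];

fprintf('%d values x %d values = %d model runs\n', ...
        numel(icekons), numel(rockkons), nRuns);
```

Checking this count before starting is a cheap way to estimate total run time when each model run is lengthy.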
### Output
Output of each model run is located in a subdirectory of the ```outpath```
specified in the base ```input.txt```. You will find new subdirectories
in ```outpath``` with long alphanumeric names, one for each model run.
Model Performance
In addition, the outpath folder specified in your base input.txt
will contain a tab-separated file, multi_performance.txt.
This file contains a summary of model performance statistics and the changes made
to the base input.txt. Additionally, the SHA-1 hash, which determines the
output folder for that specific run, is listed so that the output of any particular
model run can be easily found.
multi_performance.txt can then be used to extract the best parameter configuration(s) based
on the multi-criteria performance statistics. The user should try to maximize r2-values for discharge and/or
point balances, and minimize the discharge difference and/or the point balance RMSE. In addition,
modeled total balance is given and can be compared with measured total balance over the entire simulation period
(e.g. from geodetic methods), if available. The final choice of parameter configurations may be subjective
in case of ambiguous model performance for different criteria.
The performance variables are (order of columns in file):
- massbal_r2: coefficient of determination between measured and modeled point mass balances
- massbal_rmse: root mean square error between measured and modeled point mass balances
- Q_R2: Nash-Sutcliffe coefficient indicating agreement between measured and modeled discharge (-infinity to 1)
- Q_lnR2: same, but using logarithmic discharge (to evaluate agreement during low-flow conditions)
- Qvolumesim: total simulated discharge volume in 100,000 m3 over the simulation period
- Qvolumemeas: total measured discharge volume in 100,000 m3 over the simulation period (only time steps with valid data)
- nsteps: number of time steps simulated
- nstepsdis: number of time steps of nsteps that contain valid discharge data
- totalglacierwidemassbalance(m): total mass balance over the entire glacier over the entire simulation period
Note that if any of the variables are not available (for example, if you don't have discharge data), MultiRun will fill those columns with -9999, i.e. the order of variables in the output file remains constant regardless of which observations are available.
The columns which follow give the values of the parameters changed by MultiRun.
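One possible way to analyze such a file in Matlab is sketched below. To stay self-contained it first writes a tiny stand-in file with made-up numbers and made-up run names; these are not real model results. It also assumes that the file has a header row and that the hash column is named hash, neither of which is confirmed here, so check your own multi_performance.txt and adjust.

```matlab
% Write a small stand-in file (made-up values, NOT real model results).
demoFile = fullfile(tempdir, 'multi_performance_demo.txt');
fid = fopen(demoFile, 'w');
fprintf(fid, 'hash\tmassbal_r2\tmassbal_rmse\tQ_R2\ticekons\n');
fprintf(fid, 'run_a\t0.81\t0.42\t0.77\t5.0\n');
fprintf(fid, 'run_b\t0.85\t0.38\t-9999\t5.5\n');
fprintf(fid, 'run_c\t0.79\t0.45\t0.85\t6.0\n');
fclose(fid);

% Read the tab-separated table (header row assumed).
perf = readtable(demoFile, 'Delimiter', '\t', 'ReadVariableNames', true);

% Treat the -9999 fill value as missing data.
scoreVars = {'massbal_r2', 'massbal_rmse', 'Q_R2'};
for k = 1:numel(scoreVars)
    col = perf.(scoreVars{k});
    col(col == -9999) = NaN;
    perf.(scoreVars{k}) = col;
end

% Run with the highest Nash-Sutcliffe coefficient (max skips NaN).
[~, bestRow] = max(perf.Q_R2);
bestHash = char(perf.hash(bestRow));

% All runs ranked by point balance RMSE, smallest first.
ranked = sortrows(perf, 'massbal_rmse');
lowestRmseHash = char(ranked.hash(1));

delete(demoFile);
```

In this made-up table the two criteria pick different runs, which illustrates why the final choice can be subjective when the criteria disagree.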
In Matlab, modelMultiRun has returned quite a bit of additional information to you,
which you can use to further manipulate your finished runs. These return values are
outlined in the API below.
Have Fun!
API (Available Functions/Objects and How to Use Them)
It is possible to use MultiRun programmatically in your Matlab scripts; the API is described below.
function modelMultiRun
To run the model multiple times, the function modelMultiRun is available.
[hashes, status, err, changes, runs] = MultiRun.modelMultiRun(modelpath, basefile, varargin)
Arguments:
- modelpath - fully qualified path to the executable of the model you want to run (debam/detim).
- basefile - fully qualified path to a valid parameter file for the model; this will be modified based upon the list of key-value pairs passed to varargin.
- varargin - a list of key-value pairs which are modified, e.g. modelMultiRun('debam', 'input.txt', 'icekons', [5:0.1:6]) will run the model with icekons set to each value in [5:0.1:6].
Returns:
- hashes - cell array of hashes of each run (yields the name of a model run's output folder)
- status - array of return statuses; status(i) is the return status of the run with hash hashes{i}
- err - error messages associated with incomplete runs
- changes - array of changes made to input.txt
- runs - a containers.Map, keyed by hash, of HashedRun objects, each corresponding to a single model run.
E.g.
Let base_input.txt be a valid parameter file for DEBaM/DETIM (aside from the file name),
and a copy of detim be located at /home/luser/local/bin/detim.
In Matlab, we run:
basefile = '/home/luser/base_input.txt';
modelpath = '/home/luser/local/bin/detim';
[hashes, status, err, changes, runs] = MultiRun.modelMultiRun(modelpath, basefile, 'icekons', [5, 6.0], 'firnkons', [350, 351]);
at the command line. MultiRun will take the parameter file
from /home/luser/base_input.txt and generate input.txt
files which contain every combination of the parameters
icekons and firnkons, given the values you've assigned them
(in this case there are four combinations).
Each generated file is placed in a directory whose name is the SHA-1 hash
of the modified parameter file, and the model's output is directed to that folder as well.
If outpath is set to /home/luser/mytest in base_input.txt,
this model run will result in a directory structure which looks like:
mytest/
└── output
    ├── a729f3b8529edf74e2d57cf64ba8cc91fc64907e
    │   └── model_output
    ├── cdad489ead2ec45681e9a5ec91516623c88433ce
    │   └── model_output
    ├── d5c3a35650dcde462cc5eaea301cfbb62cd050d9
    │   └── model_output
    └── fbf399de8a6ac388aad1647ad918d37f9a3d81f1
        └── model_output
After generating the parameter file, Matlab changes its
current working directory to the folder containing the parameter
file, and executes the command in modelpath (in our example detim).
Model output is put into the folder <hash>/model_output.
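Given a layout like the one shown above, the per-run output folders can be located from Matlab by looking for subfolders whose names look like SHA-1 hashes. The sketch below first builds a throwaway copy of part of that layout in the temp folder (demoRoot is a hypothetical stand-in for your real output folder), and then finds the run folders in it.

```matlab
% Build a throwaway copy of part of the example layout.
demoBase = tempname;                       % unique temporary folder name
demoRoot = fullfile(demoBase, 'output');   % stand-in for mytest/output
exampleHashes = {'a729f3b8529edf74e2d57cf64ba8cc91fc64907e', ...
                 'cdad489ead2ec45681e9a5ec91516623c88433ce'};
for k = 1:numel(exampleHashes)
    mkdir(fullfile(demoRoot, exampleHashes{k}, 'model_output'));
end

% Keep only subfolders whose names are 40 lowercase hex digits.
listing = dir(demoRoot);
names = {listing([listing.isdir]).name};
isHash = ~cellfun(@isempty, regexp(names, '^[0-9a-f]{40}$', 'once'));

% Full path of each run's model_output folder.
runDirs = cellfun(@(n) fullfile(demoRoot, n, 'model_output'), ...
                  names(isHash), 'UniformOutput', false);
allExist = all(cellfun(@(d) exist(d, 'dir') == 7, runDirs));

% Remove the throwaway folders again.
rmdir(demoBase, 's');
```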
modelMultiRun also writes a file changes.txt
which lists the changes made to the original parameter file for that run, i.e.:
$ cd mytest/output/a729f3b8529edf74e2d57cf64ba8cc91fc64907e/
$ ls
changes.txt input.txt outpath runstatus.lock
$ cat changes.txt
Base parameter file: /home/luser/mytest/input.txt
New parameter file: /home/luser/mytest/output/a729f3b8529edf74e2d57cf64ba8cc91fc64907e/index.txt
Changes made:
icekons = 6
firnkons = 350
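If you want to recover the changed parameters of a run programmatically, the "name = value" lines of changes.txt can be parsed with a regular expression. The sketch below works on the lines from the listing above, hard-coded so that it is self-contained, and assumes that every changes.txt follows this same layout.

```matlab
% Lines copied from the example changes.txt above.
changesLines = { ...
    'Base parameter file: /home/luser/mytest/input.txt', ...
    'Changes made:', ...
    'icekons = 6', ...
    'firnkons = 350'};

% Collect every "name = value" line into a map from name to number.
params = containers.Map();
for k = 1:numel(changesLines)
    tok = regexp(changesLines{k}, '^\s*(\w+)\s*=\s*(\S+)\s*$', ...
                 'tokens', 'once');
    if ~isempty(tok)
        params(tok{1}) = str2double(tok{2});
    end
end

fprintf('icekons = %g, firnkons = %g\n', ...
        params('icekons'), params('firnkons'));
```

For a real file, the lines could be obtained by splitting the output of fileread on line breaks instead of hard-coding them.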
What happens:
For each run, MultiRun uses the helper class MultiRun.HashedRun to
generate a nearly-unique alphanumeric identifier for that particular model run.
This is done by:
- Generating an input.txt file from the passed parameters,
- Taking the SHA-1 hash of the text in this file,
- Modifying the outpath parameter of the containers.Map containing the configuration to <original-outpath>/<SHA hash>/outpath/.
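To illustrate the hashing step, the sketch below computes the SHA-1 hash of a piece of text using the Java classes that ship with Matlab, and then builds a hash-based output path. The text in configText is a made-up stand-in for a full input.txt, and this is not necessarily how HashedRun computes its hash internally.

```matlab
% Made-up stand-in for the text of a generated input.txt.
configText = sprintf('icekons = 6\nfirnkons = 350\n');

% SHA-1 digest via the Java runtime bundled with Matlab.
md = java.security.MessageDigest.getInstance('SHA-1');
md.update(uint8(configText));
digestBytes = typecast(md.digest(), 'uint8');   % 20 bytes

% Convert the bytes to a 40-character lowercase hex string.
hashHex = lower(reshape(dec2hex(digestBytes, 2)', 1, []));

% Hash-based output path, following the pattern described above.
originalOutpath = '/home/luser/mytest';
runOutpath = [originalOutpath '/' hashHex '/outpath/'];

fprintf('%s\n', runOutpath);
```

Because the hash depends only on the file's text, the same parameter set always maps to the same folder, which is what lets a completed run be recognized and skipped.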
modelMultiRun returns several cell arrays; entries with the same index correspond to the same model run:
- hashes: the SHA-1 hashes of the input.txt parameter files
- status: returned status of the model run
- err: string detailing any errors encountered; empty if none occur
- changes: strings listing any changes made to base_input for this run
- runs: a containers.Map object, whose keys are the hashes of each run, and whose values are the HashedRun objects of each run.
- HashedRun checks to see whether or not there is already a parameter file in the new output path. This is done by checking a lockfile, runstatus.lockfile, the contents of which tell whether this parameter file has been run, is currently running, or is waiting to be run.
- If no lockfile is found, the new configuration file is written to disk at <original-outpath>/<SHA hash>/input.txt.
- If a lockfile is found, its status is returned and an error is posted.
- The model is run as a system subprocess; if any errors occur, status(i) is set and an error message is returned.
- In particular, if every entry in status is 1, we know that every model run exited successfully.
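A short sketch of how the returned values might be checked after a batch of runs. The hashes are those of the directory example above, while the status and err values are made up for illustration. Since the return type of status is described both as an array and as a cell array, the sketch accepts either.

```matlab
% Stand-ins for values returned by modelMultiRun (status/err are made up).
hashes = {'a729f3b8529edf74e2d57cf64ba8cc91fc64907e', ...
          'cdad489ead2ec45681e9a5ec91516623c88433ce', ...
          'd5c3a35650dcde462cc5eaea301cfbb62cd050d9', ...
          'fbf399de8a6ac388aad1647ad918d37f9a3d81f1'};
status = [1 1 0 1];
err    = {'', '', 'made-up error message', ''};

% Accept either a numeric array or a cell array of numbers.
if iscell(status)
    status = cell2mat(status);
end

% Every run succeeded only if every status entry is 1.
allSucceeded = all(status(:) == 1);

% Report the runs that did not finish successfully.
failedIdx = find(status(:) ~= 1)';
for k = failedIdx
    fprintf('Run %s did not succeed: %s\n', hashes{k}, err{k});
end
```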
Each run is managed by a helper class called HashedRun. The HashedRun class
manages the run: building a directory structure for that run alone, and ensuring that
each configuration is only run once.
class HashedRun
Each run is managed by a HashedRun object, which makes the
appropriate directories and checks to see if the run has been completed
previously.
Methods:
- hr = MultiRun.HashedRun(config, model)
  Object constructor function. Arguments:
  - config: the text of a valid model 'input.txt'
  - model: the fully-qualified path for the model executable
- [success, err] = genConfig(self)
  Generate this run's input.txt, and write it to disk. Returns:
  - success: success code is
    - 0: something has gone wrong
    - 1: parameter file has been generated
    - 2: parameter file already existed
  - err: error message; if empty, everything is fine
- [success, err] = runModel(self)
  Execute the model, checking to make sure that the model hasn't run already. Returns:
  - success: codes for completion are
    - 0: an error has occurred
    - 1: the model has been run successfully
    - 2: the lockfile indicates the model has already run
  - err: error message.
Properties:
- configMap: Map container containing info for the model run
- hash: SHA-1 hash of input.txt corresponding to configMap
- model: fully qualified path to model executable
- outPath: path where the model will write its output
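When calling genConfig or runModel from your own scripts, it can help to translate the numeric success codes listed above into messages. The sketch below does this with simple lookup tables; the value assigned to success is a placeholder for what runModel would return.

```matlab
% Meanings of the documented success codes (cell index = code + 1).
genConfigMeaning = {'something has gone wrong', ...
                    'parameter file has been generated', ...
                    'parameter file already existed'};
runModelMeaning  = {'an error has occurred', ...
                    'the model has been run successfully', ...
                    'the lockfile indicates the model has already run'};

% Placeholder for the first output of [success, err] = runModel(hr).
success = 2;

message = runModelMeaning{success + 1};
needsAttention = (success == 0);   % only code 0 signals a failure

fprintf('runModel returned %d: %s\n', success, message);
```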
License
BSD