arnaud-m / grigrid

Deployment of a benchmark on localhost or a cluster.

General information

Grigrid is a collection of scripts that eases deployment and reporting of problem solving benchmarks either on localhost or on a cluster.

Requirements

  • GNU bash, version 4.2.10(1)-release (i386-redhat-linux-gnu)
  • Python 2.7.3
  • GNU help2man 1.40.2 (documentation only)

Release

Run the following commands to install the programs, data files, and documentation (the latter only if help2man is installed) in your home directory.

git clone CLONE_URL
cd grigrid
make
make install

Documentation

Look for the most up-to-date documentation on the GRIGRID web site.
The documentation is also available as manpages.

Problems and Categories

Problems

There are three categories of problems:

  1. SAT: satisfaction problem;
  2. ENUM: enumeration problem;
  3. OPT: optimization problem.

Categories

The solver categories are:

  1. complete
  2. incomplete

Complete solvers can determine whether an instance is satisfiable or not (or find the optimum and prove its optimality), whereas incomplete solvers cannot prove unsatisfiability or optimality.

Execution Environment

Solvers run either on localhost or on a cluster of computers using the Linux operating system. On a cluster, they run under the control of another program (oar), which enforces limits on the memory and the total CPU time used by the program. Solvers are run inside a sandbox that prevents unauthorised use of the system (network connections, file creation outside the allowed directory, among others). Solvers can use several processes or threads, but this must be specified by giving oar instructions in the solver shell script. Two executions of a solver with the same parameters and system resources must output the same result in approximately the same time, so that the experiments can be repeated.

You need to provide grigrid with a suggested command line that should be used to run your solver.

./solver.sh ALGOPATH BENCHPATH RANDOMSEED

In this command line, you will be asked to use the following placeholders, which will be replaced by the actual information given by the evaluation environment.

  • ALGOPATH: will be replaced by the name of the file containing the configuration of the solver (including the path to the file). It is recommended that the solver uses this parameter.
  • BENCHPATH: will be replaced by the name of the file containing the instance to solve (including the path to the file). Obviously, the solver must use this parameter.
  • RANDOMSEED: will be replaced by a random seed, which is a number. This parameter MUST be used to initialise the random number generator when the solver uses random numbers. It is recorded by the evaluation environment and makes it possible to rerun the program on a given instance under the same conditions if necessary.
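
As an illustration of how a solver might consume these three placeholders, here is a minimal sketch in Python. The function names `parse_args` and `make_rng` are hypothetical and not part of grigrid; the sketch only assumes that the three values arrive as positional arguments in the order shown above.

```python
import random


def parse_args(argv):
    """Return (algopath, benchpath, seed) from the three positional arguments."""
    if len(argv) != 4:
        raise SystemExit("usage: solver ALGOPATH BENCHPATH RANDOMSEED")
    # RANDOMSEED is documented as a number; it is assumed here to be an integer.
    return argv[1], argv[2], int(argv[3])


def make_rng(seed):
    """Build a private generator seeded only with RANDOMSEED."""
    # Using a dedicated Random instance avoids hidden sources of randomness,
    # so two runs with the same seed draw the same sequence of numbers.
    return random.Random(seed)
```

A solver would typically call `parse_args(sys.argv)` at start-up and draw every random number from the generator returned by `make_rng`.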

Output Rules

The evaluation environment records everything that is output by your solver on stdout/stderr.

In the future, it will be able to timestamp each line. This can be very informative for checking how your solver behaved on some instances.

Solvers must therefore write their messages to the standard output, and those messages will be used to check the results. The output format is inspired by the DIMACS output specification of the SAT competition and can be used to manually check some results. Lines output by the solver should be prefixed by ‘i ’, ‘s ’, ‘o ’, ‘v ’, or ‘d ’. Lines which do not start with one of these prefixes are considered comment lines and are ignored. The meaning of these prefixes is detailed below.

Lines

There are 6 different types of lines. They are defined as follows:

instance (‘i ’ line)

These lines are mandatory and start with the two following characters: lower case i followed by a space (ASCII code 32).

solution (‘s ’ line)

These lines are mandatory and start with the two following characters: lower case s followed by a space (ASCII code 32). These two characters are followed by one of the following answers:

  • SAT: all categories.
  • OPTIMUM: optimization category.
  • ALL: enumeration category.
  • UNSAT: all categories.
  • UNKNOWN: all categories.
  • TIMEOUT: all categories.
  • UNSUPPORTED: all categories.
  • ERROR: all categories.

Any mistake in the writing of these lines will cause other scripts to disregard the answer. Solvers are not required to provide any specific exit code corresponding to their answer.

objective (‘o ’ line) (optimization only)

These lines start with the two following characters: lower case o followed by a space (ASCII code 32). These two characters are followed by one integer.

values (‘v ’ line)

These lines start with the two following characters: lower case v followed by a space (ASCII code 32) and followed by a solution of the problem.

diagnostic (‘d ’ line)

These lines are optional and start with the two following characters: lower case d followed by a space (ASCII code 32). Then, a keyword followed by a value must be given on this line.

comment (‘c ’ line)

These lines are optional and start with the two following characters: lower case c followed by a space (ASCII code 32). They may appear anywhere in the solver output and contain any information that authors want to output. They are recorded by the evaluation environment for later viewing but are otherwise ignored. Submitters are advised to avoid comment lines which may be useful in an interactive environment but are useless in a batch environment. For example, outputting comment lines with the number of constraints read so far only increases the size of the logs with no benefit.
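
The prefix rules above can be sketched as a small classifier. This is only an illustration of the rules as described in this section, not the code used by grigrid; the names `LINE_TYPES` and `classify` are hypothetical.

```python
# Map each one-letter prefix to the line type named in this section.
LINE_TYPES = {
    "i": "instance",
    "s": "solution",
    "o": "objective",
    "v": "values",
    "d": "diagnostic",
    "c": "comment",
}


def classify(line):
    """Return (line_type, payload) for one line of solver output."""
    # A typed line starts with a known letter followed by a space (ASCII 32).
    if len(line) >= 2 and line[1] == " " and line[0] in LINE_TYPES:
        return LINE_TYPES[line[0]], line[2:].strip()
    # Anything else is considered a comment, as stated above.
    return "comment", line.strip()
```

For instance, `classify("o 19")` yields the pair `("objective", "19")`, while a line such as `sSAT`, which lacks the space after the prefix, falls back to a comment.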

Important remarks

Don’t forget to flush the output as soon as you have printed an ‘i ’ line, an ‘s ’ line, or a ‘v ’ line.
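
One way to make flushing hard to forget is to route every typed line through a single helper. The sketch below is a suggestion only; the helper name `emit` is hypothetical, and returning the written text is a convenience added here for logging and testing.

```python
import sys


def emit(prefix, payload, stream=None):
    """Write one 'PREFIX payload' line and flush it immediately."""
    stream = sys.stdout if stream is None else stream
    text = "%s %s\n" % (prefix, payload)
    stream.write(text)
    # Flushing right away ensures the line survives if the solver is
    # killed later, for example when a time limit is reached.
    stream.flush()
    return text
```

With this helper, a solver would call `emit("s", "SAT")` or `emit("v", "1 4 7 8 3 4")` instead of printing directly.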

Diagnostics

A diagnostic is a (name, value) pair which describes the work carried out by the solver. Diagnostics have to be written to stdout as ‘d ’ lines. Each diagnostic is a line of the form ‘d NAME value’, where NAME is a sequence of letters describing the diagnostic, and value is a sequence of characters defining its value.
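
To illustrate the ‘d NAME value’ format, the following sketch collects diagnostics into a dictionary. The function name `parse_diagnostics` is hypothetical, and two behaviours are assumptions of this sketch rather than documented rules: a ‘d ’ line without a value is skipped, and a later diagnostic with the same name overrides an earlier one.

```python
def parse_diagnostics(lines):
    """Collect 'd NAME value' lines into a {name: value} dictionary."""
    diagnostics = {}
    for line in lines:
        if not line.startswith("d "):
            continue
        # Split once: the name is the first word, the value is the rest.
        fields = line[2:].strip().split(None, 1)
        if len(fields) == 2:
            name, value = fields
            diagnostics[name] = value  # assumption: the last value wins
    return diagnostics
```

Values are kept as strings because the format only says that a value is a sequence of characters; converting them to numbers is left to the caller.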

Specific rules for satisfaction solvers

A CSP solver must output exactly one ‘s ’ line. This line is not necessarily the first one in the output, since the solver can output ‘c ’ and ‘d ’ lines in any order. If the solver does not output an ‘s ’ line, or if the ‘s ’ line is misspelled, then UNKNOWN will be assumed.
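
The fallback to UNKNOWN can be sketched as follows. The names `SATISFACTION_ANSWERS` and `satisfaction_answer` are hypothetical; the accepted answers are the ones marked "all categories" in the list of answers above. Treating an output with several ‘s ’ lines as UNKNOWN is an assumption of this sketch, since the text only requires exactly one such line.

```python
# Answers listed above as valid for all categories.
SATISFACTION_ANSWERS = ("SAT", "UNSAT", "UNKNOWN", "TIMEOUT", "UNSUPPORTED", "ERROR")


def satisfaction_answer(lines, known_answers=SATISFACTION_ANSWERS):
    """Return the solver's answer, or UNKNOWN if it is missing or invalid."""
    answers = [line[2:].strip() for line in lines if line.startswith("s ")]
    # Exactly one correctly spelled 's ' line is required.
    if len(answers) == 1 and answers[0] in known_answers:
        return answers[0]
    return "UNKNOWN"
```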

Specific rules for enumeration solvers

Specific rules for optimization solvers

Since an optimization solver will not stop as soon as it finds a solution but instead will try to find a better solution, it must be given a way to output the best solution it found even when it reaches the time limit.

Here, we do not assume that the solver can intercept signals from the evaluation environment. There are two ways to proceed.

First, you can configure the solver time limit so that it is compatible with the time limit of the evaluation environment (oar walltime). This can save some time, as the solver avoids outputting a certificate for each solution it finds. It only outputs a certificate for the best solution which it was able to find.

Second, you can output an ‘s ’ line with SATISFIABLE when the first solution is found, and a certificate ‘v ’ line each time you find a solution which is better than the previous ones, accompanied (this is mandatory) by an ‘o ’ line. Only the last complete certificate will be taken into account. If, eventually, your solver proves that the last solution that was output is optimal, then it must output ‘s OPTIMUM FOUND’.

A solver which is aware of the oar walltime can output:

     o 19
     o 16
     o 1
     s OPTIMUM FOUND
     v 1 4 7 8 3 4

A solver which ignores the oar walltime may output for the same problem:

     c Got a first solution !
     s SATISFIABLE
     o 19
     v 1 1 1 1 1 1
     c Found a better solution
     o 16
     v 1 2 1 1 1 1
     c Found a better solution
     o 1
     v 1 4 7 8 3 4
     s OPTIMUM FOUND
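
The rule that only the last complete certificate counts can be sketched as below. This is an illustration, not the grigrid implementation: the function name `last_certificate` is hypothetical, and the sketch assumes, as in both examples above, that the ‘o ’ line of a solution is printed before its ‘v ’ line.

```python
def last_certificate(lines):
    """Return (cost, values) for the last complete certificate, or None."""
    best = None
    pending_cost = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("o "):
            # Remember the cost until the matching 'v ' line arrives.
            pending_cost = int(line[2:].strip())
        elif line.startswith("v ") and pending_cost is not None:
            # The certificate is complete only once both lines were seen.
            best = (pending_cost, line[2:].split())
            pending_cost = None
    return best
```

If the solver is stopped right after printing ‘o 1’ in the second example, the trailing cost has no ‘v ’ line, so the sketch falls back to the previous certificate with cost 16.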

Timestamp

The evaluation environment will automatically timestamp each of these lines so that it is possible to know when the solver has found a better solution and the cost of the solution. The goal is to analyse how solvers progress toward the best solution. The timestamped output will be for example:

     o 19 0.57
     o 16 1.23
     o 1 2.7
     s OPTIMUM FOUND 10.5
     v 1 4 7 8 3 4

The last column in this example is the time at which the line was output by the solver (expressed in seconds of wall clock time since the beginning of the program).
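
To analyse how a solver progresses toward the best solution, the timestamped ‘o ’ lines can be turned into a list of (time, cost) points. The sketch below assumes the layout shown in the example, with the timestamp appended as a last column; the function name `objective_trace` is hypothetical.

```python
def objective_trace(lines):
    """Return [(seconds, cost), ...] from timestamped 'o ' lines."""
    trace = []
    for line in lines:
        fields = line.split()
        # A timestamped objective line has three fields: o COST TIME.
        if len(fields) == 3 and fields[0] == "o":
            trace.append((float(fields[2]), int(fields[1])))
    return trace
```

Applied to the example above, the trace contains three points, one for each of the costs 19, 16, and 1, which can then be plotted or compared between solvers.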

Workflow

Benchmark directory

The benchmark directory must contain the following files and directories:

  • solver.sh: a shell script compatible with the execution environment.
  • instances: a directory tree which contains all instances.
  • algorithms: a flat directory which contains all configuration files for the solver.

Execution

The script gridjobs:

  1. Submits the script solver.sh for each pair (ALGOPATH, BENCHPATH).
  2. Writes the standard output to a file in a new directory results (results/ALGO/BENCHPATH, where the BENCHPATH extension is replaced by ‘.o’).
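
The naming scheme of the result files can be sketched as follows. Several points are assumptions of this sketch and are not confirmed by the description above: ALGO is taken to be the configuration file name without its directory and extension, BENCHPATH is taken to be a relative path, and the function name `result_path` is hypothetical.

```python
import os


def result_path(algopath, benchpath, results_dir="results"):
    """Return results/ALGO/BENCHPATH with the BENCHPATH extension set to '.o'."""
    # Assumption: ALGO is the configuration file name without its extension.
    algo = os.path.splitext(os.path.basename(algopath))[0]
    # Replace the extension of the instance path by '.o'.
    bench = os.path.splitext(benchpath)[0] + ".o"
    return os.path.join(results_dir, algo, bench)
```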

Reporting

The script gridres aggregates the results:

  • .sol files contain all ‘v’ lines.
  • .res files contain a table with a subset of the ‘i’, ‘s’, ‘d’, and even ‘c’ lines, specified by the -k argument.

Use case

cd test
gridjobs -l
gridres -k keys.txt

About

License: GNU General Public License v3.0


Languages

Shell 63.2%, Python 29.5%, Makefile 7.3%