izycheva / pine

Pine tool. Detailed description of the technique can be found in the paper "Counterexample- and Simulation-Guided Floating-Point Loop Invariant Synthesis" A.Izycheva, E.Darulova, H.Seidl, SAS'20

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Pine is a tool for synthesizing sound inductive invariants for floating-point numerical loops. Pine generates invariants of the form of polynomial inequalities, which tightly bound the values of loop variables.

The README contains the following parts:

  • System requirements

  • Getting started

  • How to run Pine

    • Reproducing Table 1 (takes between approx. 20 minutes (for precomputed volumes of SMT-AI,Pilat) and 1.5 hours (if all volumes are recomputed))
    • Run Pine on a single benchmark (est. time < 2 minutes)
    • Reproducing Figure 4 and Table 5 (takes > 6 hours)

Installation

  • Pine requires Python3 and Java (has been tested with Python 3.7, 3.9, 3.10 and Java 1.8).

  • To install the dependencies run pip3 install matplotlib numpy scipy scikit-learn z3-solver. In the requirements.txt file you can see the versions of the libraries that were used originally during the development.

  • For running the experiments using our scripts you will need to install GNU Parallel

  • To run the tool using our scripts, follow the instructions below. All scripts must be executed from the repo home directory (to be changed).

Goals of the artifact (for SAS 2020 AEC)

In our paper, we evaluated the following about the tool Pine (Section 5):

  1. Comparison with State-of-the-Art: we generated invariants using Pine and compared volumes of generated invariants' shapes. The results are in Table 1 (columns 4,6 for Pine).

  2. Efficiency: we evaluated Pine's running time and presented the results in Table 1 (column 5).

  3. Sensitivity: we evaluated the usefulness of Pine's parameters (Section 5.5).

In support of these results, this artifact reproduces Table 1, Figure 4, and Table 5. We provide parameter sensitivity experiment for completeness, however, note that reproducing full results for Figure 4 and Table 5 takes over 6 hours.

Running Pine

Run Pine On Your Program (WIP to improve)

Copy your program to the benchmarks directory. To use the default configurations of Pine, run Pine by typing: python3 src/exp_use_default.py filename-without-dot-txt

Reproducing Table 1

Other tools (columns 2-3)

We provide precomputed volumes for the tools SMT-AI and Pilat. To use these simply proceed to the next step (reproducing columns 4-6).

If you want to compute the volumes from the invariants, run the script ./other_tools.sh [N] from the repository home directory. [N] is an optional parameter for the number of samples used in volume computations. By default N=10^6 and expected running time is 80 minutes. You may use a smaller number for N, which will decrease both running time and volume accuracy.
NOTE:

  • The script will recompute the volume for all inductive invariants generated by SMT-AI and Pilat. The entries undef, PF and TO will be replaced by - as the script does not distinguish different reasons of why there is no inductive invariant.
  • Precomputed volumes can be found in the file out/src_tools/compare_volumes.csv.
  • Invariants generated by SMT-AI and Pilat are saved in the files out/src_tools/smtai_invariants.csv and out/src_tools/pilat_invariants.csv.
  • For large volumes (> 100) 10^6 samples are insufficient for the accuracy up to 2 digits after the decimal point. However, the volumes of such magnitude are outliers and only observed for few invariants (mostly generated by SMT-AI). Therefore the reduced accuracy does not affect the qualitative result of comparison.

Pine (columns 4-6)

In the repo home directory run the script ./pine_table1.sh [N]. Use N to change a number of samples used in the volume computation. By default the script computes an invariant's volume with 10^6 samples, and the volume is accurate to 2 decimal points. The script should finish in about 20 minutes. This will reproduce columns 4-6 of the Table 1 from the paper. The result will be printed in the terminal with the following header:

------------------------------------------------------------------------------------------------------------------------
|                              |          |         || Average |   Variation    |  Average invariant  ||  # invariants |
|          Benchmark           |  SMT-AI  |  Pilat  || volume  |   in volume    |  generation time, s ||  generated    |
------------------------------------------------------------------------------------------------------------------------

NOTE:

  • The script will run Pine on all benchmarks 4 times and compute average volume, its variation ((max-min)/avg), and average time (in seconds). Last column presents how many times (out of three runs) Pine successfully computed a floating-point invariant. This last column is not a part of the Table 1 in the paper.
  • To view the results in a new tab of the terminal navigate to the Pine directory and type python3 src/parse_results.py. Use the same command if you want to see intermediate results (before the script terminates).
  • Due to the randomness of simulations and counterexamples returned by Z3 the results will not match exactly the numbers in the paper. However, the magnitude of volumes should be comparable.
  • Additionaly, due to the randomness, it is normal that for some benchmarks Pine may not generate an invariant in the 4 runs we make.
  • The script uses GNU Parallel to parallelize the experiment by running Pine on several benchmarks at the same time. Increasing the amount of CPUs in the VM settings will speed up the computations.

Running Pine on a single benchmark

To generate an invariant for a single benchmark go to the Pine directory and type ./pine_single.sh. This will print the list of available benchmarks. Give a benchmark name as an argument to the script. For example, ./pine_single.sh harmonic 1000 will find an invariant for the harmonic oscillator (in file benchmarks/harmonic.txt) and will compute the invariant's volume using 1000 samples. The result will be printed in the terminal and recorded in the file out/{benchmark-name}_res.csv. NOTE:

  • The script does not print any intermediate information. Once you execute the script, wait approx. 1-2 minute(s) to see the results printed.
  • Example output:
$ ./pine_single.sh harmonic 1000
RESULTS for harmonic
Volume: 3.91
Took 6.11 seconds to generate the invariant. Confirmed for FP: OK.
Ranges: {'x1': (-0.5, 1.8), 'x2': (-1.0, 1.0)}
Invariant: -1.0*x1 + -0.49*x2 + 0.78*x1^2 + 0.44*x1*x2 + 0.76*x2^2 <= 0.57

Reproducing Figure 4 and Table 5

NOTE: Due to the running time longer than 6 hours we do not recommend to reproduce full results! In order to reduce the running time comment out some benchmark names in the script sensitivity.sh before starting the experiment.

To perform sensitivity experiment from the repository home directory run ./sensitivity.sh [N], where [N] is an optional parameter. This will start an exploration of different parameters. Pine will run on each benchmark 1296 times (once per configuration) and compute volume of generated invariants. By default the volume is computed with 10^6 samples, to reduce the running time (and accuracy of the volume) use a smaller number as an argument [N] when starting the script.

The script will automatically parse results of the sensitivity experiment. First, it creates the Figure 4. The following line indicates that the figure has been created: ========== Figure 4 is ready! Type `evince out/results_sensitivity/figure4.pdf` ========== To see the figure, type evince out/results_sensitivity/figure4.pdf. Due to the randomness, the exact proportion of parameters might vary.

NOTE: If you are using an incomplete set of benchmarks, the proportions might shift towards what is best for the benchmarks you ran the experiment for.

When the script is finished it creates Table 5. To see the results type cat out/results_sensitivity/table5.csv | column -t -s,. The numbers in the generated table will not match exactly those presented in the paper in Table 5. However, orders of magnitude of volumes should be comparable.

Notes and remarks

As mentioned in Section 5 of the paper, for estimate of the shape's volume it uses Monte-Carlo Simulation.

If you want to try out different parameters of Pine, you can change them directly in the python script src/exp_use_default.py (lines 10-18). Then use the scripts ./pine_table1.sh and ./pine_single.sh to run Pine with the new parameters.

*Icon made by Freepik taken from Flaticon.

About

Pine tool. Detailed description of the technique can be found in the paper "Counterexample- and Simulation-Guided Floating-Point Loop Invariant Synthesis" A.Izycheva, E.Darulova, H.Seidl, SAS'20


Languages

Language:Python 96.3%Language:Shell 3.7%