Amocy-Wang / ProbDD

Here is the replication of the paper Probabilistic Delta Debugging, which has been accepted by ESEC/FSE 2021. More details can be found in README.

                        Probabilistic Delta Debugging (ProbDD)
                            --- an instance of delta debugging
                            
          Guancheng Wang, Ruobing Shen, Junjie Chen, Yingfei Xiong and Lu Zhang
         
            Department of Computer Science and Technology, Peking University
                College of Intelligence and Computing, Tianjin University

Please send issues and suggestions to guancheng.wang@pku.edu.cn.

Please see our original paper for most of the details about ProbDD.

Please see the Appendix section for proofs of the theorems in our paper.

ProbDD

Here is the artifact of the paper Probabilistic Delta Debugging, which has been accepted by ESEC/FSE 2021. The following table lists the important files in this artifact and their purposes, which should help you find your way around it.

File name     Purpose
LICENSE       description of the distribution rights
README        guidance on how to read the documentation
STATUS        the badges we are applying for and the reasons
CHANGES       code changes compared with the original tools
REQUIREMENTS  requirements for building from source
INSTALL       installation guidance for building from source

The artifact has two main purposes: reproducing the main results of our paper, and being used for delta debugging tasks. The STATUS file states the badges we are applying for as well as the reasons why we believe that the artifact deserves them.

Reproducing the main results


In this section, we show how to reproduce the main results in our paper step by step: preparing, running the tools, and analyzing the results. For a quick run, jump directly to the last part of this section, An example workflow.

Preparing

There are two ways to run ProbDD and reproduce the results on your machines.

  • (Recommended) Downloading the Docker image. On macOS or Linux, run the commands below to get into the container. On Windows, first install Docker Desktop and Cygwin, and make sure that Docker Desktop shows no error message when it starts for the first time; then open Cygwin and run the commands below. If an error message appears, follow the pop-up link to fix the problem. Usually the cause is that CPU virtualization is disabled in the BIOS; enter the BIOS, enable it, and restart.
# Option 1: download the split files of the image package (about 61G) and concatenate them into ProbDD.tar. The split files are available at <https://pan.baidu.com/s/1DEmnbNJdqpvuvvMSZecF3A> with password "pj9b".
cat ProbDD.tar.part-* > ProbDD.tar
# Option 2: You can also directly download the Image package ProbDD.tar.zip (about 18G) from the shared link <https://pan.baidu.com/s/1GPgwrV6HNF8v4zXEUfbB-A> with password "uhyi" or the shared link <https://www.icloud.com/iclouddrive/0DmUGMLPbyl1RxS6v0FlgWKZQ#ProbDD.tar>.
unzip ProbDD.tar.zip
# load as a Docker Image
docker load < ProbDD.tar # This process may take about 10 minutes depending on the machine performance. When the process finishes, the image id will show on the screen.
# get into the container
docker run --privileged -it [image id] /bin/bash 
  • (Not recommended) Building from source code. The installation guidance has been tested on a machine running Ubuntu 16.04.7 LTS. The requirements are listed in REQUIREMENTS, and you can build the tools and reproduce the evaluation by following INSTALL. We DO NOT recommend building from source because the tools and the benchmark subjects have many dependencies.
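
For the Docker route above, the image id printed by docker load has to be copied by hand into the docker run command. The sketch below shows one way to capture it instead. The helper name image_ref_from_load_output is hypothetical, and the sketch assumes that docker load ends its output with a line of the form "Loaded image: NAME" or "Loaded image ID: ID", which may differ across Docker versions.

```shell
# image_ref_from_load_output (hypothetical helper): read the output of
# `docker load` on stdin and print the image name or ID taken from the
# last "Loaded image..." line (assumed output format, see above).
image_ref_from_load_output() {
    sed -n -e 's/^Loaded image ID: //p' -e 's/^Loaded image: //p' | tail -n 1
}

# Intended use (left as comments because it needs Docker and ProbDD.tar):
#   image=$(docker load < ProbDD.tar | image_ref_from_load_output)
#   docker run --privileged -it "$image" /bin/bash
```

If the helper prints nothing, fall back to reading the image id from the screen as described above.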

Running tools

We also provide several scripts for running our tools. The datasets used in our paper and in this artifact can be found at compilerbugs and debloating, respectively. The following table lists these scripts and their functions.

Script name                           Function
compilerbugs/runChisel_ProbDD         evaluating Chisel with ProbDD on dataset compilerbugs
compilerbugs/runChisel_activecoarsen  evaluating Chisel with ActiveCoarsen on dataset compilerbugs
compilerbugs/runChisel_ddmin          evaluating the original Chisel on dataset compilerbugs
compilerbugs/runHDD_ProbDD            evaluating HDD with ProbDD on dataset compilerbugs
compilerbugs/runHDD_activecoarsen     evaluating HDD with ActiveCoarsen on dataset compilerbugs
compilerbugs/runHDD_ddmin             evaluating the original HDD on dataset compilerbugs
chisel-bench/runChisel_ProbDD         generating a clean working space and evaluating Chisel with ProbDD on dataset chisel-bench
chisel-bench/runChisel_activecoarsen  generating a clean working space and evaluating Chisel with ActiveCoarsen on dataset chisel-bench
chisel-bench/runChisel_ddmin          generating a clean working space and evaluating the original Chisel on dataset chisel-bench

To process subjects, you need to:

  • Go to the working directory where the scripts are located. If you use the Docker image, the working directories are /benchmarks/compilerbugs and /benchmarks/chisel-bench for the datasets compilerbugs and chisel-bench, respectively.
  • Run the scripts.
# Option 1: process subjects in compilerbugs
cd /benchmarks/compilerbugs
# To run the Chisel-based tools; this example uses Chisel with ProbDD.
./runChisel_ProbDD.sh
# To run the HDD-based tools; this example uses HDD with ProbDD.
./runHDD_ProbDD.sh

# Option 2: process subjects in chisel-bench. The default tool is chisel_ProbDD; to change it, modify the tool path in /benchmarks/chisel-bench/benchmark/target.mk (line 9).
cd /benchmarks/chisel-bench
# Run the tool; this example uses Chisel with ProbDD.
./runChisel_ProbDD.sh

Analyzing results

We provide several scripts that collect the results and generate a report for each subject processed by the tools. The scripts are written for Python 2.7, so you need to run them with a Python 2.7 interpreter. The following table lists these scripts and their functions. The first and the third scripts take one parameter, the name of the log file.

Script name                         Function
compilerbugs/generate_ChiselReport  run after compilerbugs/runChisel_XXX; collects the token number and processing time of each subject in compilerbugs processed by the Chisel-based tools
compilerbugs/generate_HDDReport     run after compilerbugs/runHDD_XXX; collects the token number and processing time of each subject in compilerbugs processed by the HDD-based tools
chisel-bench/generate_ChiselReport  run after chisel-bench/runChisel_XXX; collects the token number and processing time of each subject in chisel-bench processed by the Chisel-based tools

An example workflow

In this part, we use ProbDD to process a subset of the subjects in both datasets and generate a report on the returned number of tokens and the processing time for each processed subject. For demonstration, the corresponding scripts carry the prefix 'workflow-'. For a complete validation and verification (which may take a very long time), follow the same steps using the scripts without this prefix.

# Step1: process subjects in compilerbugs by using Chisel with ProbDD
cd /benchmarks/compilerbugs/
./workflow-runChisel_ProbDD.sh # A set of four subjects will be processed. 
python workflow-generate_ChiselReport.py workflow-chisel_ProbDD.out # A file named resfile.csv will be generated in the working directory after executing this command.

# Step2: process subjects in compilerbugs by using HDD with ProbDD
./workflow-runHDD_ProbDD.sh
python workflow-generate_HDDReport.py # The results are printed to the screen. If a subject times out, its time field is empty.

# Step3: process subjects in chisel-bench by using Chisel with ProbDD
cd /benchmarks/chisel-bench/
./workflow-runChisel_ProbDD.sh
python workflow-generate_ChiselReport.py workflow-chisel_ProbDD.out # The results are printed to the screen. If a subject times out, its time field is empty.

Being used for delta debugging tasks


Next, we show how ProbDD can be used for delta debugging tasks beyond the evaluation dataset. Given a set of elements and a test function, it reduces the elements to a smaller set that still satisfies the test function. We also provide a set of subjects (not contained in the evaluation dataset) as examples.
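
To make the "elements plus test function" interface concrete, here is a minimal sketch of a reduction loop. It is a naive greedy reducer that removes one line at a time, not ProbDD's probabilistic algorithm and not ddmin; the function name reduce_lines and the choice of lines as elements are assumptions made for illustration only.

```shell
# reduce_lines FILE TEST_CMD [ARGS...]   (hypothetical helper)
# Elements are the lines of FILE. The test function is TEST_CMD, which is
# called with a candidate file appended as its last argument and must exit
# with status 0 when the candidate still exhibits the property.
# FILE is overwritten with the reduced result.
reduce_lines() {
    local file="$1"; shift
    local tmp changed n i
    tmp=$(mktemp)
    changed=1
    while [ "$changed" -eq 1 ]; do        # repeat until a full pass removes nothing
        changed=0
        n=$(($(wc -l < "$file")))
        i=$n
        while [ "$i" -ge 1 ]; do          # go backwards so earlier line numbers stay valid
            sed "${i}d" "$file" > "$tmp"  # candidate: FILE without line i
            if "$@" "$tmp"; then          # property still holds, keep the removal
                cp "$tmp" "$file"
                changed=1
            fi
            i=$((i - 1))
        done
    done
    rm -f "$tmp"
}
```

For example, with a file containing the lines a, b, BUG and c, running reduce_lines on it with the test command grep -q BUG leaves only the line BUG. This loop needs at least one test per element per pass; delta debugging algorithms such as ProbDD aim to reach a small result with fewer tests.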

To add a new subject, you need the following:

  • a directory in which to put your subject (for example, /examples/new/)
  • a C program (for example, small.origin.c)
  • a script that specifies the properties the given C program exhibits (for example, test.sh)
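
As an illustration of the third item, the sketch below writes a hypothetical test.sh. The property it checks (small.c compiles with gcc and the resulting program prints "result: 42") is invented for the example, as are the output file name small.out and the 5-second limit; replace them with the property your own program exhibits. The only convention taken from this document is that the script exits with 0 when the program exhibits the property.

```shell
# Write a hypothetical test.sh into the current directory.
# The quoted 'EOF' keeps the script body from being expanded here.
cat > test.sh <<'EOF'
#!/bin/bash
# Exit 0 if small.c still exhibits the property, non-zero otherwise.
# Assumed property: small.c compiles and the program prints "result: 42".
gcc -w -o small.out small.c >/dev/null 2>&1 || exit 1
# Bound the run time so that a reduced program cannot hang the reduction.
timeout 5 ./small.out 2>/dev/null | grep -q "result: 42"
EOF
chmod +x test.sh
```

The exit status of the script is that of the final grep, so it is 0 only when the expected text appears in the output.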

Do the following to run the new subject with Chisel with ProbDD and HDD with ProbDD:

# First, check whether the given program exhibits the properties in the script.
cd /examples/new
cp small.origin.c small.c # keep small.origin.c as a backup; the reduction is run on small.c
./test.sh
echo $? # If the given program does exhibit the properties, 0 should be returned.

# Step1: Run with the tool Chisel with ProbDD
/chisel_ProbDD/build/bin/chisel --skip_local_dep --skip_global_dep --skip_dce test.sh small.c # The returned file is small.c, and the returned number of tokens and the processing time are shown at the end of the log.

# Step2: Run with the tool HDD with ProbDD
/hdd_ProbDD/anaconda2/bin/picireny -i small.c --test test.sh --srcml:language C --grammar C.g4 --start compilationUnit --disable-cleanup # The returned file is small.c, and the processing time is shown at the end of the log. A directory whose name starts with small.c.2021 will be generated, and the returned C program is saved in it.

# Step2-1: Get the returned number of tokens by HDD with ProbDD
/countTokens/build/bin/chisel small.c.2021-XXX/small.c small.c.2021-XXX/small.c # The directory small.c.2021-XXX is generated in Step2.
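
The directory name in Step2-1 has to be filled in by hand. A small helper along the following lines can locate it; the name latest_result_dir is hypothetical, and the sketch assumes only what is stated above, namely that the generated directory name starts with the input file name followed by a dot.

```shell
# latest_result_dir INPUT   (hypothetical helper)
# Print the most recently modified directory whose name starts with
# "INPUT." in the current directory, with a trailing slash.
# Prints nothing if there is no such directory.
latest_result_dir() {
    ls -dt "$1".*/ 2>/dev/null | head -n 1
}

# Intended use after Step2 (left as comments because it needs the tools):
#   dir=$(latest_result_dir small.c)
#   /countTokens/build/bin/chisel "${dir}small.c" "${dir}small.c"
```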

Acknowledgement


We thank all the reviewers for their thoughtful comments and efforts towards improving our paper.

We benefited a lot from the following projects when building ProbDD.

  • Delta Debugging: a homepage of Delta Debugging project maintained by Prof. Andreas Zeller.
  • Chisel: a system for Debloating C/C++ Programs. Proposed in the paper Effective Program Debloating via Reinforcement Learning.
  • Picireny: an instance of Hierarchical Delta Debugging Framework (HDD).

License: GNU General Public License v3.0
