RUB-SysSec / cupid

Cupid: Automatic Fuzzer Selection for Collaborative Fuzzing

Cupid

This is the code for our ACSAC 2020 paper: "Cupid: Automatic Fuzzer Selection for Collaborative Fuzzing".

About

The idea behind Cupid is to automatically collect data on how well different fuzzers perform on a diverse set of binaries and use this data to predict which combination of fuzzers will perform well when executed in collaboration (i.e. in parallel - also called ensemble fuzzing).

In prior research, EnFuzz has shown that in collaborative fuzzing scenarios there is a difference in performance between running multiple instances of the same fuzzer and running a diverse set of fuzzers. We expand on this idea by removing the human expert guidance that was necessary to select the diverse set of fuzzers. Instead, we use an automatic, data-driven approach to predict which fuzzers will be complementary.

In Cupid, we basically:

  • Build docker images for every fuzzer (e.g. AFL, FairFuzz, Honggfuzz, etc.)

  • Let all of them run in isolation (i.e. not in parallel) on a set of binaries, for a limited time period

  • Use different seeds to explore more of the program space of every binary

  • As randomness is an inherent property of fuzzing, we need to do the above step many times (e.g. 30 runs for every fuzzer+binary+seed combination)

  • Collect data on which branches the fuzzers were able to solve and how often

  • Use our complementarity metric as outlined in the paper to calculate which fuzzers would profit from working with any of the other fuzzers in a collaborative run, i.e., how well a combination of fuzzers would complement each other

  • Make a prediction on which combination of fuzzers should be used in future collaborative runs on any binary - where the quality of the prediction depends on the quality of the training data and how representative the binaries are of unknown real-world binaries
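The last three steps can be illustrated with a small sketch. It does not implement the complementarity metric from the paper. Instead it assumes a simplified stand-in: each fuzzer gets an empirical probability of covering a branch (the fraction of training runs that hit it), fuzzers are treated as independent, and a combination is scored by the expected number of branches covered by at least one of its members. All function names and the sample data are hypothetical.

```python
from itertools import combinations


def branch_probabilities(runs):
    """Map each branch ID to the fraction of runs that covered it.

    `runs` is a list of sets, one set of covered branch IDs per run.
    """
    counts = {}
    for covered in runs:
        for branch in covered:
            counts[branch] = counts.get(branch, 0) + 1
    return {branch: n / len(runs) for branch, n in counts.items()}


def expected_coverage(combo, probs):
    """Expected number of branches hit by at least one fuzzer in `combo`.

    Assumes fuzzers cover branches independently (a simplification).
    """
    branches = set().union(*(probs[name].keys() for name in combo))
    total = 0.0
    for branch in branches:
        miss = 1.0  # probability that every fuzzer in the combo misses it
        for name in combo:
            miss *= 1.0 - probs[name].get(branch, 0.0)
        total += 1.0 - miss
    return total


def best_combination(probs, size):
    """Return the combination of `size` fuzzers with the highest score."""
    return max(
        combinations(sorted(probs), size),
        key=lambda combo: expected_coverage(combo, probs),
    )


# Hypothetical training data: covered branch IDs per run, per fuzzer.
training = {
    "afl": [{1, 2, 3}, {1, 2}],
    "aflfast": [{1, 2, 3}, {1, 2, 3}],
    "honggfuzz": [{1, 4, 5}, {4, 5}],
}
probs = {name: branch_probabilities(runs) for name, runs in training.items()}
print(best_combination(probs, 2))
```

In this toy data, afl and aflfast mostly solve the same branches, so pairing either of them with honggfuzz scores higher than pairing them with each other. That is the intuition behind selecting complementary rather than individually strong fuzzers.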

By default, Cupid comes with these fuzzers: AFL, AFLFast, FairFuzz, QSYM, lafintel¹, LibFuzzer², Honggfuzz², and Radamsa³.

¹It's not really lafintel, as Google's fuzzer-test-suite did not build with the LLVM passes, so it's just an old AFL++ version with compcov instead.

²We have forked and extended LibFuzzer and Honggfuzz to support AFL-style exchange of corpus seeds. To our knowledge, this is the first cross-fuzzer implementation of corpus synchronisation between AFL-based fuzzers and LibFuzzer/Honggfuzz in both directions.

³This is AFL++ in Radamsa mode.

For specific version numbers, please refer to our paper.

Usage

Attention: On rare occasions, Cupid has to forcibly terminate fuzzers, delete their directories, and force-remove files, so please use Cupid only on a machine where no important data can be lost (e.g. a test machine or a virtual machine).

Cupid builds a docker image for every fuzzer and lets each fuzzer build its own binaries (to avoid problems with different instrumentation methods), so building all images can take up to 100GB of disk space. There is room for improvement here (e.g., some of the fuzzers could share the same binaries), but no such fix is currently planned, so this is the only way to build the images right now. If you don't have enough space, jump to our artifact evaluation section below to find out how to remove some of the fuzzers and binaries.

Build

You need Python 3. We've tested the code with v3.6.9; you should have at least that version, because we use some Python features that are unavailable in older versions. You also need to install screen (Ubuntu 18.04 example):

$ sudo apt install python3 screen

We have to install some Python packages:

$ python3 -m pip install python-ptrace oslo_concurrency

And then we need to build and install our custom Python package that is used to quickly track branch coverage (QuickCov):

$ git clone https://github.com/egueler/quickcov.git
$ cd quickcov
$ ./build.sh
# check if it works:
$ python3 -c "import quickcov"
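Before building the images, the prerequisites listed so far can be checked in one go. The following is a minimal sketch, not part of Cupid: the helper name `missing_prerequisites` is hypothetical, and it assumes that python-ptrace is imported as `ptrace` and that the `docker` command is needed because the next step builds docker images.

```python
import importlib.util
import shutil
import sys


def missing_prerequisites(modules, tools, min_python=(3, 6, 9)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:3] < min_python:
        problems.append("Python %d.%d.%d or newer is required" % min_python)
    for module in modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(module) is None:
            problems.append("missing Python module: " + module)
    for tool in tools:
        # shutil.which returns None when the command is not on PATH.
        if shutil.which(tool) is None:
            problems.append("missing command: " + tool)
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites(
        modules=["ptrace", "oslo_concurrency", "quickcov"],
        tools=["screen", "docker"],
    ):
        print(problem)
```

If the script prints nothing, all checked modules and commands were found.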

Now go back to Cupid.

In the first step, build all the necessary images by calling:

$ ./docker/build.sh

The script should abort in case of error. This may take several hours to complete.

Run

If everything works, you can start the fuzzing process by calling control.py:

$ python3 control.py --timeout 60 --fuzzers afl,libfuzzer --binary base64

This starts two fuzzers in parallel (AFL and LibFuzzer) and lets them fuzz LAVA-M's base64 for sixty seconds. Note that the fuzzing output directory is set to /dev/shm/sync{random_name}, so make sure that you have enough memory for longer runs. When the run is done, it dumps a pickle file containing all relevant information (mode, binary, plot information, branch coverage, etc.). If you want the output directory to be removed automatically (i.e. you don't want to keep the corpus), you can set CLEANUP=1 as an environment variable before running control.py.
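The two environment-related points above (output lands in /dev/shm, and CLEANUP=1 removes it afterwards) can be handled by a small wrapper. This is a sketch under stated assumptions: the helper names and the 1 GiB threshold are hypothetical, and only the CLEANUP variable and the control.py invocation come from this README.

```python
import os
import shutil
import subprocess


def free_gib(path="/dev/shm"):
    """Free space at `path` in GiB."""
    return shutil.disk_usage(path).free / 2**30


def cupid_env(cleanup=True, base=None):
    """Copy of the environment with CLEANUP set to 1 or removed."""
    env = dict(os.environ if base is None else base)
    if cleanup:
        env["CLEANUP"] = "1"
    else:
        env.pop("CLEANUP", None)
    return env


def run_control(args, cleanup=True, min_free_gib=1.0):
    """Run control.py with `args` if /dev/shm has enough free space.

    `min_free_gib` is an arbitrary example threshold, not a Cupid setting.
    """
    if free_gib("/dev/shm") < min_free_gib:
        raise RuntimeError("not enough free space in /dev/shm")
    return subprocess.run(
        ["python3", "control.py"] + list(args),
        env=cupid_env(cleanup),
        check=True,
    )
```

For example, `run_control(["--timeout", "60", "--fuzzers", "afl,libfuzzer", "--binary", "base64"])` would repeat the run above with automatic cleanup, when called from the Cupid directory.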

Explanation for the parameters:

--fuzzers can be either a comma-separated list of fuzzers (allowed values are qsym, afl, aflfast, fairfuzz, libfuzzer, honggfuzz, radamsa, lafintel), which we call custom mode, or any of our other modes: enfuzz, enfuzz-q, cupid, which pre-selects the four fuzzers described in the paper.

--binary can be any of these values: base64, md5sum, who, uniq, boringssl, c-ares, freetype2, guetzli, harfbuzz, json, lcms, libarchive, libjpeg-turbo, libpng, libssh, libxml2, llvm-libcxxabi, openssl-1.0.1f, openssl-1.0.2d, openssl-1.1.0c, openthread, pcre2, proj4, re2, sqlite, vorbis, woff2, wpantund, where the first four binaries are from LAVA-M (base64, md5sum, who, uniq) and the rest are from Google's fuzzer-test-suite.
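When scripting many runs, it can help to validate these parameters before anything is launched. The sketch below is not part of Cupid: `control_command` is a hypothetical helper that checks its inputs against the values listed above and returns the argument list for control.py without running it.

```python
FUZZERS = {
    "qsym", "afl", "aflfast", "fairfuzz",
    "libfuzzer", "honggfuzz", "radamsa", "lafintel",
}
MODES = {"enfuzz", "enfuzz-q", "cupid"}
BINARIES = {
    # LAVA-M
    "base64", "md5sum", "who", "uniq",
    # Google's fuzzer-test-suite
    "boringssl", "c-ares", "freetype2", "guetzli", "harfbuzz", "json",
    "lcms", "libarchive", "libjpeg-turbo", "libpng", "libssh", "libxml2",
    "llvm-libcxxabi", "openssl-1.0.1f", "openssl-1.0.2d", "openssl-1.1.0c",
    "openthread", "pcre2", "proj4", "re2", "sqlite", "vorbis", "woff2",
    "wpantund",
}


def control_command(fuzzers, binary, timeout):
    """Validate the inputs and build the argument list for control.py.

    `fuzzers` is either a mode name or a comma-separated fuzzer list;
    `timeout` is given in seconds.
    """
    if fuzzers not in MODES:
        unknown = [name for name in fuzzers.split(",") if name not in FUZZERS]
        if unknown:
            raise ValueError("unknown fuzzers: " + ", ".join(unknown))
    if binary not in BINARIES:
        raise ValueError("unknown binary: " + binary)
    return [
        "python3", "control.py",
        "--timeout", str(timeout),
        "--fuzzers", fuzzers,
        "--binary", binary,
    ]


print(" ".join(control_command("afl,libfuzzer", "base64", 60)))
```

The returned list can be passed directly to `subprocess.run`.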

Parse Pickle file

Now you can parse the plot-[rand].pickle file that was dumped to the Cupid directory and use the data however you like, e.g. to generate a plot. There is also a timestmaps-[rand].pickle file, which contains the creation dates of all queue files. We've provided an example script, dump.py, to display the content of these files:

$ python3 dump.py plot-123.pickle
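If you prefer to explore the data from your own script, the file can be loaded with Python's standard pickle module. The sketch below is not a replacement for dump.py and makes no assumption about the layout of Cupid's pickle files: it only prints the keys and value types of whatever object it finds. The helper names are hypothetical. Since unpickling can execute arbitrary code, only load files that you produced yourself.

```python
import pickle
import sys


def load_pickle(path):
    """Load and return the object stored in a pickle file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def summarize(obj, depth=0, max_depth=2):
    """Return lines describing the nested structure of `obj`."""
    indent = "  " * depth
    if isinstance(obj, dict):
        lines = []
        for key, value in obj.items():
            lines.append("%s%r: %s" % (indent, key, type(value).__name__))
            if depth < max_depth:
                # Scalars add nothing here; containers add nested lines.
                lines.extend(summarize(value, depth + 1, max_depth))
        return lines
    if isinstance(obj, (list, tuple)):
        return ["%s[%d items]" % (indent, len(obj))]
    return []


if __name__ == "__main__" and len(sys.argv) == 2:
    for line in summarize(load_pickle(sys.argv[1])):
        print(line)
```

Saved under a name of your choice, the script takes the pickle file as its only argument, just like dump.py.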

Stop

Cupid stops automatically once the timeout is reached. If you abort the control.py run prematurely, the cleanup stage is skipped. To clean up all docker containers and screen sessions that might still be running in the background, run:

$ python3 stop.py

Artifact Evaluation

Please refer to our artifact evaluation page for more information.

Citation

@inproceedings{cupid,
  title={Cupid: Automatic Fuzzer Selection for Collaborative Fuzzing},
  author={G{\"u}ler, Emre and G{\"o}rz, Philipp and Geretto, Elia and Jemmett, Andrea and {\"O}sterlund, Sebastian and Bos, Herbert and Giuffrida, Cristiano and Holz, Thorsten},
  booktitle = {Annual Computer Security Applications Conference (ACSAC)},
  doi = {10.1145/3427228.3427266},
  year = {2020}
}

License

GNU Affero General Public License v3.0

