================== american fuzzy lop ================== Written and maintained by Michal Zalewski <lcamtuf@google.com> Copyright 2013, 2014 Google Inc. All rights reserved. Released under terms and conditions of Apache License, Version 2.0. For new versions and additional information, check out: http://code.google.com/p/american-fuzzy-lop/ 1) Background ------------- Fuzzing is one of the most powerful strategies for identifying security issues in real-world software. Unfortunately, it also offers fairly shallow coverage: it is impractical to exhaustively cycle through all possible inputs, so even something as simple as setting three separate bytes to a specific value to reach a chunk of unsafe code can be an insurmountable obstacle to a typical fuzzer. There have been numerous attempts to solve this problem by augmenting the process with additional information about the behavior of the tested code. These techniques can be divided into three broad groups: - Simple coverage maximization. This approach boils down to trying to isolate initial test cases that offer diverse code coverage in the targeted application - and them fuzzing them using conventional techniques. - Control flow analysis. A more sophisticated technique that leverages instrumented binaries to focus the fuzzing efforts on mutations that generate distinctive sequences of conditional branches within the instrumented binary. - Static analysis. An approach that attempts to reason about potentially interesting states within the tested program and then make educated guesses about the input values that could possibly trigger them. The first technique is surprisingly powerful when used to pre-select initial test cases from a massive corpus of valid data - say, the result of a large-scale web crawl. Unfortunately, coverage measurements provide only a very simplistic view of the internal state of the program, making them less suited for creatively guiding the fuzzing process later on. The latter two techniques are extremely promising in experimental settings. That said, in real-world applications, they frequently lead to irreducible complexity: most of the high-value targets will have a vast number of internal states and possible execution paths, and deciding which ones are interesting and substantially different from the rest is an extremely difficult challenge that, if not solved, usually causes the "smart" fuzzer to perform no better than a traditional one. 2) About AFL ------------ American Fuzzy Lop uses edge coverage to detect subtle, local-scale changes to program control flow without having to perform a complex comparisons between series of distinctive traces (a common failure point for other tools). In almost-plain English, the fuzzer does this by instrumenting the code to record tuples in the following format: <ID of current code location>, <ID of previously-executed code location> The ordering for each tuple is discarded and hit count is tracked only coarsely; the main signal used by the fuzzer is the appearance of a previously-unseen tuple in the output dataset. This method combines the self-limiting nature of simple coverage measurements with the sensitivity of control flow analysis. The instrumentation is used as a part of a simple queue-based algorithm: 1) Load user-supplied initial test cases into the queue, 2) Take input file from the queue, 3) Repeatedly mutate the file using a balanced variety of traditional fuzzing strategies, 4) If any of the generated mutations resulted in a new tuple being recorded by the instrumentation, add mutated output as a new entry in the queue. 5) If queue not empty, go to 2. The discovered test cases are also periodically culled to eliminate ones that have been made obsolete by more inclusive finds discovered later in the fuzzing process. Because of this, the fuzzer is useful not only for identifying crashes, but is exceptionally effective at turning a single valid input file into a reasonably-sized corpus of interesting test cases that can be manually investigated for non-crashing problems, or used to stress-test applications that are harder to instrument or too slow to fuzz efficiently. In particular, it can be extremely useful for generating small test sets that may be programatically or manually examined for anomalies in a browser environment. In real-world testing with libraries such as libjpeg, libpng, libtiff, or giflib, the fuzzer requires no fine-tuning, and significantly outperforms blind fuzzing or coverage-only tools. Using a common set of strategies against libjpeg, the tool toggles twice as many branches as non-instrumented fuzzing (IGNORE_FINDS in config.h) and identifies several times as many distinctive test cases than coverage-based algorithms (COVERAGE_ONLY in config.h). 3) Instrumenting programs for use with AFL ------------------------------------------ Instrumentation is injected by a companion tool called afl-gcc. It is meant to be used as a drop-in replacement for GCC, directly pluggable into the standard build process for any third-party code. The instrumentation has a relatively modest performance impact; in conjunction with other optimizations implemented by the tool, many instrumented programs can be fuzzed as fast or even faster than possible with traditional tools. The injected probes are designed to work with C and C++ code compiled on 32-bit x86 platforms. You can edit config.h and uncomment USE_64BIT to switch to 64-bit instrumentation. Porting to other platforms should not be difficult; in fact, there is an early-stake ARM port in experimental/arm_support/. The correct way to recompile the target program will vary depending on the specifics of the build process, but a common approach may be: $ export AFL_PATH=/path/to/afl/ $ CC=$AFL_PATH/afl-gcc ./configure $ make clean all For C++ programs, it may be necessary to specify this instead: $ CXX=$AFL_PATH/afl-g++ ./configure When testing libraries, it is essential to either link the tested executable against a static version of the instrumented library (./configure --disable-shared may be useful), or to use a proper binary (beware of shell script wrappers generated by the build system) with the right LD_LIBRARY_PATH. Setting AFL_HARDEN in the environment will cause afl-gcc to automatically enable several GCC hardening features that may make it easier to detect memory bugs; with GCC 4.8 and above, this includes ASAN / MSAN. 4) Choosing initial test cases ------------------------------ To operate correctly, the fuzzer requires one or more input file containing a normal, typical input normally processed by the targeted application. Whenever possible, the file should be reasonably small; under 1 kB is ideal, although not strictly necessary. There is limited value in providing multiple files that are not fundamentally different from each other, and exercise the same set of features. When in doubt, use fewer samples, not more. One test case is perfectly fine in most scenarios. If a large corpus of data is available for screening, the afl-showmap utility can be employed to compare the instrumentation data recorded for various inputs. Files that not produce any previously-unseen tuples can be usually rejected. The fuzzer also performs basic internal de-duplication on its own. 5) Fuzzing instrumented binaries -------------------------------- The fuzzing process itself is carried out by the afl-fuzz utility. The program requires an input directory containing one or more initial test cases, plus a path to the binary to test. For tested programs that accept data on stdin, the usual syntax may be: $ ./afl-fuzz -i input_dir -o output_dir /path/to/program [...params...] For programs that need to read data from a specific file, the appropriate path can be specified via the -f flag, e.g.: $ ./afl-fuzz [...] -f testme.txt /path/to/program testme.txt It is possible to fuzz non-instrumented code using the -n flag. This gives you a fairly traditional fuzzer with a couple of nice testing strategies. You can use -t and -m to override the default timeout and memory limit for the executed process, although this is seldom necessary. The fuzzing process itself is relatively simple. It consists of several types of sequential, deterministic operations (bitflips, injection of interesting integers) followed by a "havoc" stage with multiple stacked, random modifications - block deletion, cloning, random bitflips, etc. The deterministic stage takes time proportional to the size of the input file; once this is done, the havoc stages continue for every discovered input until Ctrl-C is hit. For large inputs, you can use -d to skip the deterministic stages and proceed straight to random tweaks. Some other tips for optimizing the performance of the process are discussed in the perf_tips.txt file included with the source code of AFL. 6) Interpreting output ---------------------- The fuzzer keeps going until aborted with Ctrl-C or killed with SIGINT or SIGTERM. The progress screen provides various useful stats, including the number of distinctive execution paths discovered, the current queue cycle, the number of crashes recorded, and the number of execve() calls per second. For more info about the displayed data, visit this URL: https://code.google.com/p/american-fuzzy-lop/wiki/StatusScreen There are three subdirectories created within the output directory: - queue/ - test cases for every distinctive execution path, along with hard links to any initial input files. The data is essentially a corpus of interesting test cases valuable for seeding other, more resource-intensive testing steps. The directory can be also used to resume aborted jobs; simply do: ./afl-fuzz -i old_output_dir/queue -o new_output_dir [...] - hangs/ - test cases that cause the tested program to time out. The entries are grouped by a 32-bit hash of the execution path. - crashes/ - test cases that caused the tested program to receive a fatal signal (e.g., SIGSEGV, SIGILL, SIGABRT). The entries are grouped by the received signal, followed by the hash of the execution path. In all three directories, the first segment of every file name is a sequential ID of the generated test case, and the second one corresponds to the "parent" queue entry that the entry is derived from; there's some other fairly self-explanatory metadata appended, too. Although the fuzzer does not perform any additional analysis of the discovered crashes, the path-based grouping makes it easy to triage new finds manually - or to examine them with a simple GDB script. One such script is provided in experimental/crash_triage/. 7) Parallelized fuzzing ----------------------- For tips on how to fuzz a common target on multiple cores or multiple networked machines, please refer to the parallel_fuzzing.txt file included with the source code of American Fuzzy Lop. 8) Known limitations & areas for improvement -------------------------------------------- Here are some of the most important caveats for AFL: - The fuzzer is optimized for compact binary data formats, such as images and other multimedia. It is not well-suited for verbose, human-readable formats such as XHTML or JavaScript. In such cases, template- or ABNF-based generators tend to fare better. (The fuzzer can still give a good workout to the first line of XML or JS parsing, just won't be able to guess higher-order syntax particularly well.) If you want to modify the code to generate syntax-aware mutations, you'd need to start with fuzz_one() in afl-fuzz.c. - The fuzzer offers limited coverage if encryption, checksums, cryptographic signatures, or compression are used to wholly wrap the actual data format to be tested. Good solutions include manually commenting out the relevant checks in the instrumented application, or using a wrapper that postprocesses the fuzzer-generated data before feeding it to the target program. As an example, a patch for libpng to bypass CRC checksums is provided in experimental/libpng_no_checksum/libpng-nocrc.patch. - The included instrumentation (afl-as.h) currently supports 32-bit x86 code by default, with the option to go 64-bit by uncommenting USE_64BIT in config.h. If you are feeling adventurous, an experimental ARM port can be found in experimental/arm_support/, too. For other CPUs, rewriting the assembly code in afl-as.h should be a very simple task. Note that minor tweaks to fuzz_one() in afl-fuzz.c may be necessary to accommodate architectures that require aligned memory access. - The approach is theoretically compatible with any GCC-supported language, but only C and C++ were confirmed to work. Objective C is very likely to work; reports for other languages (e.g. GCJ Java) are welcome. - Instrumentation of binary-only code is theoretically possible, but not supported at this point. Leveraging pin or DynamoRIO may be a good approach. 9) Contact ---------- Questions? Concerns? Bug reports? The author can be usually reached at <lcamtuf@google.com>.