afl fuzzing for tree-sitter.
This project fuzzes the tree-sitter runtime and the parsers for each language tree-sitter supports. It does this through a small set of test harnesses: C programs, one per language, that take an input file and try to parse it. The test harness, tree-sitter, and the language parsers are all compiled with afl-clang and hardening enabled, after which fuzzing is performed with afl-fuzz.
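To make the harness idea concrete, here is a minimal sketch of the shape such a program might take. It is not the project's actual harness: `fuzz_one_file` and `parse_buffer` are hypothetical names, and `parse_buffer` is a toy stand-in (a parenthesis balance check) for the step where a real harness would hand the bytes to the tree-sitter runtime and a language parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real parse step. An actual harness would
 * pass the buffer to the tree-sitter runtime and a language parser here. */
bool parse_buffer(const char *buf, size_t len) {
    long depth = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '(') {
            depth++;
        } else if (buf[i] == ')' && --depth < 0) {
            return false;
        }
    }
    return depth == 0;
}

/* Read the whole file at `path` and try to parse it.
 * Returns 0 when the input was read and handed to the parser (whether or
 * not it parsed cleanly), 1 when the file could not be read. */
int fuzz_one_file(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) return 1;

    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return 1; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return 1; }
    rewind(f);

    char *buf = malloc((size_t)size + 1);
    if (!buf) { fclose(f); return 1; }

    size_t n = fread(buf, 1, (size_t)size, f);
    fclose(f);
    assert(n <= (size_t)size);
    buf[n] = '\0';

    /* A parse failure is not an error for the harness: the fuzzer is
     * looking for crashes and hangs, not for rejected inputs. */
    (void)parse_buffer(buf, n);

    free(buf);
    return 0;
}
```

A real harness would call something like this from `main` with the path given on the command line, which is where afl-fuzz substitutes the current test case. The exact invocation this project uses lives in its own scripts.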
script/bootstrap
script/setup-ramdisk # Optional, but recommended b/c afl is hard on SSDs.
cd /Volumes/ramdisk/
./fuzz javascript
An incomplete list of interesting bugs found using afl-fuzz.
- Handling of invalid UTF-8 characters. The return value of utf8proc_iterate was not checked; on failure it sets codepoint_ref to -1, which caused the parser to hang. Fixed in tree-sitter/tree-sitter@f394a48, with a test added in tree-sitter/tree-sitter@7092d45.
- Infinite loop caused by the way stack versions are selected for halting. Fixed in tree-sitter/tree-sitter@a94742a.
- Infinite loop in an external scanner when the closing quote is never found, because the scanner did not also check for EOF. Fixed in tree-sitter/tree-sitter-ruby@d5ed995.
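The external scanner bug in the list above follows a common pattern: a loop that waits for a terminator but never asks whether the input has run out. The sketch below illustrates that pattern and its fix under stated assumptions. `Lexer`, `at_eof`, `lookahead`, `advance`, and `scan_string` are hypothetical names for a simplified in-memory lexer; this is not tree-sitter's lexer interface and not the code from the Ruby scanner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal lexer: a cursor over an in-memory buffer. It only
 * mimics the shape of a scanner's input; it is not tree-sitter's TSLexer. */
typedef struct {
    const char *input;
    size_t length;
    size_t position;
} Lexer;

bool at_eof(const Lexer *lexer) {
    return lexer->position >= lexer->length;
}

/* Current character, or 0 once the input is exhausted. */
char lookahead(const Lexer *lexer) {
    return at_eof(lexer) ? 0 : lexer->input[lexer->position];
}

/* Advancing at EOF is a no-op in this sketch, so a loop that never checks
 * for EOF and never sees its terminator would spin forever. */
void advance(Lexer *lexer) {
    if (!at_eof(lexer)) lexer->position++;
}

/* Scan a double-quoted string starting at the opening quote.
 * Returns true only if a closing quote is found. */
bool scan_string(Lexer *lexer) {
    assert(lexer != NULL);
    if (lookahead(lexer) != '"') return false;
    advance(lexer); /* consume the opening quote */

    for (;;) {
        if (at_eof(lexer)) return false; /* the check the buggy loop lacked */
        if (lookahead(lexer) == '"') {
            advance(lexer); /* consume the closing quote */
            return true;
        }
        advance(lexer);
    }
}
```

Without the `at_eof` check, an input such as an opening quote followed by text and no closing quote would leave the loop calling `advance` forever, which is exactly the kind of hang afl-fuzz reports as a timeout.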
You can fuzz in parallel to take full advantage of multi-core systems. See script/fuzz for the specific options passed to afl-fuzz and for language setup.
# Fuzz in parallel with 1 primary and 3 secondary fuzzers.
./fuzz -p -n 3 javascript