kspalaiologos / bzip3

A better and stronger spiritual successor to BZip2.

Fuzzing with afl quickly reveals dozens of bugs

skeeto opened this issue

The existing interface makes the program fuzzable without changes, and a few minutes of fuzzing reveals dozens of unique bugs in bzip3. Rather than report them individually, here's how you can find them yourself:

$ afl-gcc -m32 -g -fsanitize=address,undefined -O -Iinclude src/*.c
$ mkdir in
$ echo -n BZ3v1 >in/a
$ afl-fuzz -m 800 -i in -o out -- ./a.out -d

Once it starts reporting crashes, those inputs are in out/crashes/.

Thanks! I've used AFL a few times, and it has indeed revealed a few bugs. I'm planning to start another fuzzing session soon.

c6d619b seems to have fixed a fair number of segfaults. The remaining UB lies in unaligned LZP accesses, which I won't fix, and in libsais; we'd have to pester Ilya to fix those: IlyaGrebnov/libsais#10

Having run AFL for a while now, I can't find any more segfaults; if you find some, please send me the crashing data.

Also, -m 800 is going to run out of memory fairly often, making AFL think that the program crashed. That's because the max block size is 511MB, so the maximum memory usage is somewhere around 2.7 GiB.

Since it's been a while, and this is an interesting project, I thought I'd revisit. I ran it like so on 539278b:

$ afl-gcc -g3 -fsanitize=address,undefined -DVERSION='""' -Iinclude src/*.c
$ mkdir i
$ echo hello | ./a.out >i/x
$ afl-fuzz -m32T -ii -oo ./a.out -d

While dead simple, this is a quick-and-dirty and highly inefficient way to fuzz the decompressor. Despite this, the following buffer overflow popped out after a few seconds:

$ echo QlozdjEAAAABDgAAAAYAAEAtzyeN/////2hlbGxvCg== | base64 -d | ./bzip3 -d >/dev/null
Segmentation fault
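
Decoding the base64 input above by hand hints at why it crashes. The sketch below assumes a layout inferred from those bytes rather than taken from the bzip3 source: the `BZ3v1` signature, a little-endian 32-bit block size, then a 32-bit compressed size and a 32-bit original size for the first block. The names `parse_header`, `sizes_plausible`, and `crash_prefix` are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian u32 one byte at a time (no alignment requirement). */
uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* ASSUMED layout, inferred from the crash input, not from the bzip3 source:
   "BZ3v1", u32 block size, then u32 compressed size and u32 original size. */
struct header {
    uint32_t block_size;
    uint32_t comp_size;
    uint32_t orig_size;
};

/* Returns 0 on success, -1 if the input is too short or the signature is wrong. */
int parse_header(const uint8_t *buf, size_t len, struct header *h) {
    assert(buf != NULL && h != NULL);
    if (len < 17 || memcmp(buf, "BZ3v1", 5) != 0) return -1;
    h->block_size = read_le32(buf + 5);
    h->comp_size  = read_le32(buf + 9);
    h->orig_size  = read_le32(buf + 13);
    return 0;
}

/* A block should not claim to decompress to more than the stream's block size. */
int sizes_plausible(const struct header *h) {
    return h->orig_size <= h->block_size;
}

/* First 17 bytes of the base64-decoded crash input. */
const uint8_t crash_prefix[17] = {
    'B', 'Z', '3', 'v', '1',
    0x00, 0x00, 0x00, 0x01,   /* block size    = 0x01000000 (16 MiB) */
    0x0e, 0x00, 0x00, 0x00,   /* compressed    = 14                  */
    0x06, 0x00, 0x00, 0x40    /* original size = 0x40000006          */
};
```

Under that assumed layout, the input declares a 16 MiB block size but an original size of 0x40000006 (1073741830) for its first block, so `sizes_plausible` rejects it. The same number appears as `size` in the GDB session further down the thread.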

Findings in the other direction (these flood the crash set with variations and make it difficult to find more):

$ echo JQQ8bwCMNv/wdjb//m9v////gElvAG8jtyUAPG8ApSUANv/wdjb//m9v////gElvAG8jtyUAPG8ApSUAPG8AnA== | base64 -d | ./a.out >/dev/null
include/libsais.h:1360:87: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
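
For readers unfamiliar with this class of UBSan report, here is a minimal sketch of the general pattern and its usual fix. It does not reproduce the expression at libsais.h:1360; `bit_signed` and `bit_unsigned` are hypothetical names. With a 32-bit `int`, the literal `1` is signed, so shifting it by 31 places produces a value that `int` cannot represent. Doing the shift in an unsigned type avoids the problem.

```c
#include <assert.h>
#include <stdint.h>

/* Undefined behavior for n == 31 when int is 32 bits wide: the literal 1
   is a signed int and 1 << 31 is not representable in int.
   Shown for contrast only; do not call it with n == 31. */
int bit_signed(int n) {
    return 1 << n;
}

/* Well defined for 0 <= n <= 31: the shift is carried out in an
   unsigned 32-bit type, where bit 31 is an ordinary value bit. */
uint32_t bit_unsigned(unsigned n) {
    assert(n < 32);
    return UINT32_C(1) << n;
}
```

`bit_unsigned(31)` yields 0x80000000 without tripping the sanitizer.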

$ echo AkxEQ5c+gD4+0z4+Pj4A/+XuRjU+Pj4+Pj4+WT8+ZFYfACk+/////z5OKTRk4+X/RDU+ZFZWPj5vVgEAKT7/////Pg== | base64 -d | ./a.out >/dev/null
src/libbz3.c:108:25: runtime error: load of misaligned address 0x7fc6ce8fe816 for type 'u32', which requires 4 byte alignment
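
The misaligned-load report comes from reading a `u32` through a pointer that is not 4-byte aligned. A common way to silence it without changing the result is to copy the bytes with `memcpy`. This is a generic sketch, not a patch against libbz3.c; `load_u32` is a hypothetical helper and the `u32` typedef here merely stands in for the type named in the report.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t u32;  /* stand-in for the u32 type named in the report */

/* Alignment-safe load in host byte order. memcpy has no alignment
   requirement on its source, so p may point anywhere inside a byte buffer. */
u32 load_u32(const void *p) {
    u32 v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers typically lower a fixed-size `memcpy` like this to a single load on targets where unaligned access is cheap, so the cost is usually negligible.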

@skeeto Unaligned accesses are present in the code and I don't consider them a serious problem. The remaining batch of issues is primarily caused by libsais (#59), which the author refuses to fix.
As for the example you point out, I can't reproduce it, so I would be thankful if you investigated it on your machine:

 1 [18:42] ~/workspace/bzip3@master % gcc src/*.c -Iinclude -g3 -o bzip3 "-DVERSION=\"0.0.0\"" -fsanitize=address -fsanitize=undefined
src/libbz3.c: In function ‘lzp_decode_block’:
src/libbz3.c:183:38: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  183 |                 if (oe > out_end) oe = out_end;
      |                                      ^
 0 [18:43] ~/workspace/bzip3@master % ./bzip3 -df data.txt.bz3
Write error: Bad address

In some cases fwrite will convert the buffer overflow into an error because it detects the issue (in the kernel?) before dereferencing. I see the error instead of the segfault when run under GDB. If you put a breakpoint on xwrite you can see the bogus write size.

$ gdb ./bzip3 
Reading symbols from ./bzip3...
(gdb) b xwrite
Breakpoint 1 at 0x1841ec: file src/main.c, line 79.
(gdb) r -d <crash.bz3 >/dev/null
Starting program: ./bzip3 -d <crash.bz3 >/dev/null
Breakpoint 1, xwrite (data=0x7fffed9ed800, size=1073741830, len=1, des=0x7ffff6c536a0 <_IO_2_1_stdout_>) at src/main.c:79
79          if (fwrite(data, size, len, des) != len) {
(gdb) p/x malloc_usable_size(data)
$1 = 0x1051ed8
(gdb) p/x size
$2 = 0x40000006
(gdb) p/x len
$3 = 0x1
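
The numbers in the GDB session show the mismatch directly: the requested write size, 0x40000006, is far larger than the 0x1051ed8 bytes the allocation can hold. As a sketch only, a writer that knows the buffer's capacity could refuse such a request before it reaches `fwrite`. `checked_write` is a hypothetical helper, not bzip3's `xwrite`, and the `capacity` parameter is an assumption about what the caller can supply.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical guard: refuse to write more bytes than the buffer is
   known to hold. Returns 0 on success, -1 on refusal or a short write. */
int checked_write(const void *data, size_t size, size_t capacity, FILE *des) {
    assert(data != NULL && des != NULL);
    if (size > capacity) return -1;  /* bogus size: never touch the buffer */
    if (size != 0 && fwrite(data, 1, size, des) != size) return -1;
    return 0;
}
```

With the values from the session, `checked_write(data, 0x40000006, 0x1051ed8, stdout)` returns -1 without reading past the allocation. The more robust fix is to reject the bogus size when it is first read from the input, before any buffer is sized or written.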