sile / libflate

A Rust implementation of DEFLATE algorithm and related formats (ZLIB, GZIP)

Home Page:https://docs.rs/libflate

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Buffer overflow when encoding (detected by address sanitizer)

kennytm opened this issue · comments

Repro steps:

[dependencies]
libflate = "0.1.19"
use libflate::lz77::*;
fn main() {
    let mut enc = DefaultLz77Encoder::new();
    let mut sink = Vec::new();
    enc.encode(b"aaaaa", &mut sink);
    enc.flush(&mut sink);
}
$ cargo +nightly rustc -- -Zsanitizer=address
$ target/debug/a
Expected to finish without any messages. However, got an ASAN error instead.
=================================================================
==45039==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000155 at pc 0x00010334529b bp 0x7ffeec8ea850 sp 0x7ffeec8ea848
READ of size 1 at 0x602000000155 thread T0
    #0 0x10334529a in libflate::lz77::default::prefix::hbc520ed13fc89540 default.rs:114
    #1 0x10333a510 in _$LT$libflate..lz77..default..DefaultLz77Encoder$u20$as$u20$libflate..lz77..Lz77Encode$GT$::flush::hccf4bdcba320d36a default.rs:89
    #2 0x103342b34 in a::main::h655e7f278e236686 main.rs:6
    #3 0x10332498d in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hba64913d4b77ef20 rt.rs:64
    #4 0x103360c47 in std::panicking::try::do_call::h6e8c55b5404ca92b panicking.rs:297
    #5 0x10336322e in __rust_maybe_catch_panic lib.rs:92
    #6 0x10336160d in std::rt::lang_start_internal::h616fb8704166427f rt.rs:48
    #7 0x1033248fe in std::rt::lang_start::h75c0523eec08eff1 rt.rs:64
    #8 0x103342d11 in main (a:x86_64+0x10002ed11)
    #9 0x7fff6363eed8 in start (libdyld.dylib:x86_64+0x16ed8)

0x602000000155 is located 0 bytes to the right of 5-byte region [0x602000000150,0x602000000155)
allocated by thread T0 here:
    #0 0x1034384c3 in wrap_malloc (lib__rustc__clang_rt.asan_osx_dynamic.dylib:x86_64+0x594c3)
    #1 0x103350076 in alloc::alloc::alloc::h53fc6441db79abe3 alloc.rs:72
    #2 0x10334ff60 in _$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Alloc$GT$::alloc::hd3d2d4d4a708d0da alloc.rs:148
    #3 0x10334ec7b in _$LT$alloc..raw_vec..RawVec$LT$T$C$$u20$A$GT$$GT$::reserve_internal::h53e5cbdb4cc68b7c raw_vec.rs:668
    #4 0x10334fc58 in _$LT$alloc..raw_vec..RawVec$LT$T$C$$u20$A$GT$$GT$::reserve::h9bfdae6bf8258c89 raw_vec.rs:491
    #5 0x10334cf8f in _$LT$alloc..vec..Vec$LT$T$GT$$GT$::reserve::h6fc500c442273c7e vec.rs:457
    #6 0x10334c930 in _$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$alloc..vec..SpecExtend$LT$$RF$$u27$a$u20$T$C$$u20$core..slice..Iter$LT$$u27$a$C$$u20$T$GT$$GT$$GT$::spec_extend::h022b22ee3d0d61a3 vec.rs:1906
    #7 0x10334cc84 in _$LT$alloc..vec..Vec$LT$T$GT$$GT$::extend_from_slice::h1dfb6096c32a99e1 vec.rs:1351
    #8 0x10333b177 in _$LT$libflate..lz77..default..DefaultLz77Encoder$u20$as$u20$libflate..lz77..Lz77Encode$GT$::encode::hba9253edc2c258af default.rs:65
    #9 0x103342b17 in a::main::h655e7f278e236686 main.rs:5
    #10 0x10332498d in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hba64913d4b77ef20 rt.rs:64
    #11 0x103360c47 in std::panicking::try::do_call::h6e8c55b5404ca92b panicking.rs:297
    #12 0x10336322e in __rust_maybe_catch_panic lib.rs:92
    #13 0x10336160d in std::rt::lang_start_internal::h616fb8704166427f rt.rs:48
    #14 0x1033248fe in std::rt::lang_start::h75c0523eec08eff1 rt.rs:64
    #15 0x103342d11 in main (a:x86_64+0x10002ed11)
    #16 0x7fff6363eed8 in start (libdyld.dylib:x86_64+0x16ed8)

SUMMARY: AddressSanitizer: heap-buffer-overflow default.rs:114 in libflate::lz77::default::prefix::hbc520ed13fc89540
Shadow bytes around the buggy address:
  0x1c03ffffffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1c03ffffffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1c03fffffff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1c0400000000: fa fa fd fd fa fa fd fd fa fa 00 00 fa fa 00 07
  0x1c0400000010: fa fa 00 04 fa fa 00 00 fa fa 00 06 fa fa fd fa
=>0x1c0400000020: fa fa 05 fa fa fa 00 00 fa fa[05]fa fa fa fd fd
  0x1c0400000030: fa fa fd fd fa fa fd fa fa fa 00 04 fa fa fa fa
  0x1c0400000040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c0400000050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c0400000060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c0400000070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==45039==ABORTING
Abort trap: 6

The relevant function contains unchecked read, which is totally unsafe (there is no guarantee that buf.len() >= 3; in this issue the buf.len() == 2).

#[inline]
fn prefix(buf: &[u8]) -> [u8; 3] {
unsafe {
[
*buf.get_unchecked(0),
*buf.get_unchecked(1),
*buf.get_unchecked(2),
]
}
}


A longer, equivalent version:

use libflate::gzip::Encoder;
use std::error::Error;
use std::io::Write;

fn main() -> Result<(), Box<Error>> {
    let mut encoder = Encoder::new(Vec::new())?;
    encoder.write(b"aaaaa")?;
    encoder.finish();
    Ok(())
}

Thank you for reporting this issue. I fixed this bug at the commit a14390c.

@sile Since you fixed this bug, should we close this issue? I am not sure if older versions of the crate should be patched as well?

By the way, fn prefix is definitely unsound: it does not validate that the length of the input is greater than 3. If left as-is, it should be marked unsafe fn. And this is not the first time this results in a real bug, see #16.

A better approach would be to subslice the input [0..2] up front and then index into that using the regular slice[x] syntax, without get_unchecked(). The optimizer is smart enough to figure out that the bounds checks for individual lookups are not needed.

Even this one bounds check per function can often be eliminated if the bounds are checked up front before calling this function in a tight loop, or by using iterators.

@DevQps I will close this issue. Thank you for your suggestion.

I am not sure if older versions of the crate should be patched as well?

If users of the older versions are using encoding functionality, I think it is better to upgrade libflate to up-to-date (but I believe this bug does not cause actual problems because it is only related to the handling of the end of buffers that will never be accessed anymore).

@Shnatsel Thank you for your advice. If the modification you suggested does not cause a drop in performance, I would like to adopt it. I will try the idea when I have a time for implementing and benchmarking it (of course, PR is very welcome).