google / bloaty

Bloaty: a size profiler for binaries

Rust rlib: Premature EOF in AR data

tweksteen opened this issue

Some rlibs (Rust static archives, which use the AR format) trigger a "Premature EOF in AR data" error when parsed by bloaty. This does not occur for all archives, only specific ones.

Reproduction steps:

$ rustc --version
rustc 1.51.0-nightly (c5a96fb79 2021-01-19)
$ mkdir mylib; cd mylib
$ echo 'fn abaaaab() { println!("test");}' > lib.rs
$ cargo init --lib
     Created binary (application) package
$ cargo build 
  [...]
$ bloaty ./target/debug/libmylib.rlib 
bloaty: Premature EOF in AR data

When replacing the library content, the parsing is successful:

$ echo 'fn aaaaaab() { println!("test");}' > lib.rs
$ cargo build
$ bloaty ./target/debug/libmylib.rlib
    FILE SIZE        VM SIZE
 --------------  --------------
  56.9%  1.43Ki   0.0%       0    [AR Non-ELF Member File]
  12.0%     308   0.0%       0    [AR Headers]
  10.0%     256   0.0%       0    [ELF Headers]
   7.5%     192   0.0%       0    .strtab
   6.6%     168 100.0%      34    .debug_gdb_scripts
   5.3%     136   0.0%       0    .symtab
   1.7%      44   0.0%       0    [AR Symbol Table]
 100.0%  2.50Ki 100.0%      34    TOTAL

In both cases, the archive is correctly parsed by llvm-ar-11, without warnings.

When running through gdb, ArFile::MemberReader::ReadMember is hit 5 times. I believe the first four hits are one for the symbol table, one for the long filename table, and two for the files (2 files in the archive). On the fifth hit, remaining_ contains only one byte:

Thread 2 "bloaty" hit Breakpoint 1, bloaty::(anonymous namespace)::ArFile::MemberReader::ReadMember (this=0x7ffff7744600, file=0x7ffff7744620) at /src/bloaty/src/elf.cc:689
689       if (remaining_.size() == 0) {
(gdb) p remaining_
$1 = {static npos = 18446744073709551615, static kMaxSize = 9223372036854775807, ptr_ = 0x7ffff7ffb9dd "\n", length_ = 1}

There seems to be some alignment required by the format. From Wikipedia:

Each data section is 2 byte aligned. If it would end on an odd offset, a newline ('\n', 0x0A) is used as filler.

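I don't think bloaty handles this padding byte yet. Presumably the single '\n' left over after the last member is treated as the start of another (truncated) member header, which produces the premature EOF error.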
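
To illustrate the padding rule quoted above, here is a minimal, self-contained sketch of walking AR members and skipping the filler byte after odd-sized data. This is not bloaty's actual code: the names CountArMembers, ParseArSize, BuildArMember and PadField are hypothetical and exist only for this example. It assumes the common 60-byte member header, with the decimal size field at offset 48 and the "`\n" terminator at offset 58.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helpers for illustration only; not taken from bloaty.
constexpr std::size_t kArHeaderSize = 60;

// Parses the decimal, space-padded size field of a member header.
std::optional<std::size_t> ParseArSize(std::string_view field) {
  std::size_t value = 0;
  bool seen_digit = false;
  for (char c : field) {
    if (c == ' ') break;  // the field is right-padded with spaces
    if (c < '0' || c > '9') return std::nullopt;
    value = value * 10 + static_cast<std::size_t>(c - '0');
    seen_digit = true;
  }
  if (!seen_digit) return std::nullopt;
  return value;
}

// Counts the members of an archive, or returns nullopt if it is malformed.
std::optional<std::size_t> CountArMembers(std::string_view archive) {
  constexpr std::string_view kMagic = "!<arch>\n";
  if (archive.substr(0, kMagic.size()) != kMagic) return std::nullopt;
  std::string_view remaining = archive.substr(kMagic.size());

  std::size_t count = 0;
  while (!remaining.empty()) {
    // Fewer bytes than a full header: this is the "premature EOF" case.
    if (remaining.size() < kArHeaderSize) return std::nullopt;
    if (remaining.substr(58, 2) != "`\n") return std::nullopt;

    std::optional<std::size_t> size = ParseArSize(remaining.substr(48, 10));
    if (!size || *size > remaining.size() - kArHeaderSize) return std::nullopt;
    remaining.remove_prefix(kArHeaderSize + *size);

    // Data is 2-byte aligned: odd-sized data is followed by one filler byte.
    // Without this step, a trailing '\n' is mistaken for a truncated header.
    if (*size % 2 == 1 && !remaining.empty()) remaining.remove_prefix(1);
    ++count;
  }
  return count;
}

// Pads or truncates a header field to a fixed width with spaces.
std::string PadField(std::string value, std::size_t width) {
  value.resize(width, ' ');
  return value;
}

// Builds one member (header, data, optional filler) for testing purposes.
std::string BuildArMember(const std::string& name, const std::string& data) {
  std::string out = PadField(name, 16) + PadField("0", 12) + PadField("0", 6) +
                    PadField("0", 6) + PadField("644", 8) +
                    PadField(std::to_string(data.size()), 10) + "`\n" + data;
  if (data.size() % 2 == 1) out += '\n';  // filler for odd-sized data
  return out;
}
```

With an archive built from two odd-sized members, CountArMembers should report two members. If the filler-skipping line is removed, the same input ends with a single leftover '\n' and is rejected, which matches the behaviour seen in the gdb session above.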