evmar / n2

n2 ("into"), a ninja compatible build system

Oversized WriteBuf panic

Colecf opened this issue

When building Android, we get this panic:

thread 'main' panicked at src/db.rs:70:17:
oversized WriteBuf
stack backtrace:
   0: std::panicking::begin_panic
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/panicking.rs:638:12
   1: n2::db::Writer::write_build
   2: n2::work::Work::record_finished
             at /ssd/n2/src/work.rs:500:9
   3: n2::work::Work::run
             at /ssd/n2/src/work.rs:771:21
   4: n2::run::build::{{closure}}
             at /ssd/n2/src/run.rs:83:45
   5: n2::trace::scope
             at /ssd/n2/src/trace.rs:116:21
   6: n2::run::build
             at /ssd/n2/src/run.rs:83:17
   7: n2::run::run_impl
             at /ssd/n2/src/run.rs:216:11
   8: n2::run::run
             at /ssd/n2/src/run.rs:238:15
   9: n2::main
             at /ssd/n2/src/main.rs:2:27
  10: core::ops::function::FnOnce::call_once
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Android splits its build into two ninja invocations: a "bootstrap" invocation and a main invocation. The bootstrap invocation generates the main build.ninja file, and it has a depfile that discovers all the Android.bp files. That depfile has 11842 entries, which is enough to overflow the WriteBuf.
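To make the failure mode concrete, here is a minimal sketch of a fixed-capacity, stack-allocated buffer that panics when a record does not fit. This is not n2's actual WriteBuf; the name FixedBuf, the capacity parameter, and the panic message are assumptions for illustration only.

```rust
/// Hypothetical stand-in for a fixed-size write buffer (not n2's real type).
/// The backing array is part of the struct, so it lives wherever the struct
/// does, typically on the stack.
struct FixedBuf<const N: usize> {
    buf: [u8; N],
    len: usize,
}

impl<const N: usize> FixedBuf<N> {
    fn new() -> Self {
        FixedBuf { buf: [0; N], len: 0 }
    }

    /// Appends bytes, panicking if they would exceed the fixed capacity.
    fn write(&mut self, bytes: &[u8]) {
        let end = self.len + bytes.len();
        if end > N {
            // The capacity cannot grow, so an oversized record is fatal here.
            panic!("oversized buffer");
        }
        self.buf[self.len..end].copy_from_slice(bytes);
        self.len = end;
    }

    /// Returns the bytes written so far.
    fn as_slice(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}
```

With a buffer like this, a record whose size depends on the number of depfile entries works until one build happens to have enough entries to exceed N.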

I see there's a comment on WriteBuf about how we could use a BufWriter instead, but that it might be slightly less efficient. I think we should benchmark whether that's actually the case. BufWriter does have a with_capacity constructor, so if we used the same buffer size I'm not sure how it would be any different from WriteBuf.
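As a rough sketch of the BufWriter idea, the function below pushes a long list of ids through std::io::BufWriter::with_capacity. The write_ids helper and the little-endian u32 encoding are assumptions made up for this example, not n2's db format. The point is only that writes beyond the chosen capacity are flushed to the underlying sink instead of panicking.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical helper: serializes each id as a little-endian u32 through a
/// BufWriter with the given capacity, then returns the underlying sink.
fn write_ids<W: Write>(sink: W, capacity: usize, ids: &[u32]) -> io::Result<W> {
    let mut w = BufWriter::with_capacity(capacity, sink);
    for id in ids {
        // When the internal buffer fills up, BufWriter flushes it to `sink`
        // and keeps going, so the record size is not bounded by `capacity`.
        w.write_all(&id.to_le_bytes())?;
    }
    // into_inner flushes whatever is still buffered before handing back the sink.
    w.into_inner().map_err(|e| e.into_error())
}
```

Calling this with a Vec<u8> as the sink, a small capacity such as 64, and 11842 ids should yield 11842 * 4 bytes of output, which is the case the fixed-size buffer could not handle.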

I think you are right. I also kind of suspect that a BufWriter constructed without setting the capacity ahead of time might be enough. I think the main difference is that my WriteBuf allocates wholly on the stack, which means the Writer impl is free to create one in an inner function. Meanwhile, a BufWriter is implemented using a Vec, so it needs to allocate its buffer on the heap, which means it might be worth stashing it as a member on the Writer struct for reuse.
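One way to read the "stash it as a member for reuse" idea is sketched below with a plain Vec<u8> scratch buffer rather than a BufWriter, to keep the example short. The RecordWriter type, its field names, the 4096-byte initial capacity, and the u32 encoding are all hypothetical and not taken from n2.

```rust
use std::io::{self, Write};

/// Hypothetical writer that owns one heap buffer and reuses it per record.
struct RecordWriter<W: Write> {
    sink: W,
    scratch: Vec<u8>,
}

impl<W: Write> RecordWriter<W> {
    fn new(sink: W) -> Self {
        // Initial capacity is an arbitrary guess; the Vec grows if needed.
        RecordWriter { sink, scratch: Vec::with_capacity(4096) }
    }

    /// Serializes one record into the scratch buffer, then writes it out
    /// with a single write_all call.
    fn write_record(&mut self, ids: &[u32]) -> io::Result<()> {
        // clear() drops the contents but keeps the allocation, so later
        // records of similar size do not need to allocate again.
        self.scratch.clear();
        for id in ids {
            self.scratch.extend_from_slice(&id.to_le_bytes());
        }
        self.sink.write_all(&self.scratch)
    }
}
```

The trade-off in this sketch is that the buffer stays at the size of the largest record seen so far, in exchange for never panicking on a large record and paying for heap growth only occasionally.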

In general, if you have a big .n2_db from an Android build, I'd love to see it; we could use it to benchmark db reading/writing in isolation from the rest of n2. Right now I have no real picture of whether db read/write is even a bottleneck.

Oh I also just made https://github.com/evmar/n2/compare/main...Colecf:n2:fix_writebuf_panic?expand=1, but either solution works.

Here's the n2_db that triggers the issue (generated after fixing the issue). It's just the n2_db from the bootstrap phase of the build, since I haven't gotten the main build working yet. The main build's n2_db will be way bigger. This one is also from AOSP; the Google-internal build graph is bigger.

n2_db.zip