rust-lang / backtrace-rs

Backtraces in Rust

Home Page:https://docs.rs/backtrace

Need to be able to reliably get symbol addrs

jswrenn opened this issue · comments

The documentation for Frame::symbol_address warns:

This will attempt to rewind the instruction pointer returned by ip to the start of the function, returning that value. In some cases, however, backends will just return ip from this function.

Consequently, the following code 'works' on x86_64-unknown-linux-gnu, but not on aarch64-apple-darwin:

use backtrace;
use std::{hint::black_box, ptr, ffi::c_void};

fn main() {
    black_box(function());
}

#[inline(never)]
fn function() {
    let function = function as *const c_void;
    println!("searching for symbol_address={:?}", function);

    backtrace::trace(|frame| {
        println!("unwound to {:?}", frame);
        if ptr::eq(frame.symbol_address(), function) {
            println!("found it!"); // not reached on aarch64-apple-darwin :(
            return false;
        }
        true
    });
}

Is this expected behavior on this platform? If so, is there any way to work around this discrepancy?

In the scoped-trace crate, I use symbol address equality to capture backtraces with limited upper and lower unwinding bounds. I'm hoping to get this crate working on aarch64-apple-darwin.

commented

On macOS, the function that returns the start address of the function enclosing an instruction pointer (_Unwind_FindEnclosingFunction) is unreliable, because compact unwind info collapses multiple functions with identical unwind info into one entry:

// The macOS linker emits a "compact" unwind table that only includes an

If the executable is not stripped you can try parsing the executable itself using eg the object crate and finding the last symbol before the ip address.
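
The "last symbol before the ip" lookup suggested above can be sketched with the standard library alone. This is a minimal illustration, not backtrace-rs or object crate code: the `Sym` type, the `enclosing_symbol` function, and the addresses are all hypothetical, and a real implementation would first read the symbols out of the executable.

```rust
/// A symbol table entry: start address and name.
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Sym {
    addr: u64,
    name: &'static str,
}

/// Returns the last symbol whose start address is <= `ip`.
/// `symbols` must be sorted by `addr` in ascending order.
fn enclosing_symbol(symbols: &[Sym], ip: u64) -> Option<&Sym> {
    // `partition_point` yields the index of the first entry with addr > ip,
    // so the entry just before it (if any) is the candidate.
    let idx = symbols.partition_point(|s| s.addr <= ip);
    idx.checked_sub(1).map(|i| &symbols[i])
}

fn main() {
    // Made-up addresses; a real table would come from the executable.
    let mut symbols = vec![
        Sym { addr: 0x2000, name: "function" },
        Sym { addr: 0x1000, name: "main" },
        Sym { addr: 0x3000, name: "helper" },
    ];
    symbols.sort_by_key(|s| s.addr);

    let ip = 0x2010;
    match enclosing_symbol(&symbols, ip) {
        Some(sym) => println!("ip {:#x} is inside {} (starts at {:#x})", ip, sym.name, sym.addr),
        None => println!("ip {:#x} precedes every known symbol", ip),
    }
}
```

Without symbol sizes, an ip past the end of the last function still maps to that last symbol, so a real lookup would probably also want to check an upper bound.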

If the executable is not stripped you can try parsing the executable itself using eg the object crate and finding the last symbol before the ip address.

That's not too bad! Would backtrace-rs accept a PR implementing this?

The symbolization already does that. I wonder why Symbol::addr doesn't return the address for symtab entries.

If I had to guess, it's because it looks like backtrace-rs currently uses information from DWARF xor symtab entries — not both. In a situation where DWARF debuginfo was completely unavailable, Frame::symbol_address might behave as expected.

backtrace-rs falls back to symtab entries if it can't find a DWARF entry. But both of those are only used in the symbolizer. Frame::symbol_address only uses the unwinder. It doesn't and shouldn't use DWARF or symbol table entries. You need to resolve the frame if you want to use those.

It seems like everything is working as intended, then? Shall we close this?

@workingjubilee Maaaybe? The comment here:

// The macOS linker emits a "compact" unwind table that only includes an
// entry for a function if that function either has an LSDA or its
// encoding differs from that of the previous entry. Consequently, on
// macOS, `_Unwind_FindEnclosingFunction` is unreliable (it can return a
// pointer to some totally unrelated function). Instead, we just always
// return the ip.
//
// https://github.com/rust-lang/rust/issues/74771#issuecomment-664056788
//
// Note the `skip_inner_frames.rs` test is skipped on macOS due to this
// clause, and if this is fixed that test in theory can be run on macOS!

...uses the phrase "if this is fixed" — which suggests that something unwelcome (albeit not unknown) is happening here.
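
To make the quoted comment concrete, here is a toy model of why a collapsed table gives misleading answers. It is a deliberately simplified sketch: `Func`, `compact`, and `find_enclosing` are hypothetical names, and the real Mach-O unwind format and the real `_Unwind_FindEnclosingFunction` are more involved than this.

```rust
/// Hypothetical, simplified model of one function's unwind info.
#[derive(Debug, Clone, Copy)]
struct Func {
    start: u64,
    encoding: u32, // stand-in for the compact unwind encoding
    has_lsda: bool,
}

/// Keeps an entry only if the function has an LSDA or its encoding differs
/// from that of the previous entry, following the rule in the quoted comment.
fn compact(funcs: &[Func]) -> Vec<Func> {
    let mut table: Vec<Func> = Vec::new();
    for f in funcs {
        let differs = table.last().map_or(true, |prev| prev.encoding != f.encoding);
        if f.has_lsda || differs {
            table.push(*f);
        }
    }
    table
}

/// Start address of the last table entry at or before `ip`, i.e. what a
/// lookup against the collapsed table would report as the function start.
fn find_enclosing(table: &[Func], ip: u64) -> Option<u64> {
    let idx = table.partition_point(|f| f.start <= ip);
    idx.checked_sub(1).map(|i| table[i].start)
}

fn main() {
    let funcs = [
        Func { start: 0x1000, encoding: 1, has_lsda: false }, // a
        Func { start: 0x1100, encoding: 1, has_lsda: false }, // b: same encoding as a
        Func { start: 0x1200, encoding: 2, has_lsda: false }, // c
    ];
    let table = compact(&funcs);
    println!("{} functions, {} table entries", funcs.len(), table.len());

    let ip = 0x1110; // an address inside b
    if let Some(start) = find_enclosing(&table, ip) {
        // b has no entry of its own, so the lookup lands on a's start.
        println!("lookup for ip {:#x} reports start {:#x}", ip, start);
    }
}
```

In this model `b` is merged into `a`'s entry, so a lookup for an address inside `b` reports the start of `a`, which is the kind of unrelated result the comment warns about.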

Could we document this shortcoming? Or even make it explicit in the API by making ip an Option? Could we even eliminate this shortcoming? E.g.:

  • can compact unwinding be disabled?
  • can the DWARF unwinding tables be used instead?

I would almost rather backtrace-rs used the unreliable output of _Unwind_FindEnclosingFunction; then at least symbol_address would produce sometimes-useful results on macOS, rather than always-useless (i.e., no more useful than ip) results.

commented

Compact unwinding can be disabled when linking a binary or library, but if it was enabled at link time (as it is for all system libraries and, by default, for user code), no DWARF unwinding tables remain.

I almost would rather if backtrace-rs used the unreliable output _Unwind_FindEnclosingFunction — then at least ip would produce sometimes useful results on macOS, rather than always-useless (i.e., not more useful than sp) results.

I did expect the current output to be useful for looking up in the symbol table which should always give the correct result if the symbol table exists at all. The result of _Unwind_FindEnclosingFunction may result in the wrong function without any option to get the correct result using the symbol table.

(Whoops, edited my last comment because I got my function names mixed up.)

I did expect the current output to be useful for looking up in the symbol table which should always give the correct result if the symbol table exists at all.

Am I right to think that you could instead use ip in this case?

Alternatively, could backtrace-rs do that look-up into the symbol table?

commented

Am I right to think that you could instead use ip in this case?

Right, you could.

Alternatively, could backtrace-rs do that look-up into the symbol table?

I think that would make sense.

The result of _Unwind_FindEnclosingFunction may result in the wrong function without any option to get the correct result using the symbol table.

Yeah, I think that kills the idea of using that on macOS dead. Any guess that might be wrong seems like it kinda breaks with what symbol_address says it does: it says it rewinds to the start of the function (implicit: correctly) or stays equal to ip, allowing you to detect which happens. It's better to simply return a value equal to ip if we're not going to produce a guaranteed-useful answer.
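
The contract described above (either a rewound address or a value equal to ip) lets a caller tell the two cases apart. A minimal sketch, assuming the two values have already been read from `Frame::ip` and `Frame::symbol_address` and converted to integers; the `SymbolAddress` enum and `classify` function are hypothetical and not part of backtrace-rs:

```rust
/// Hypothetical classification of what a backend returned.
#[derive(Debug, PartialEq)]
enum SymbolAddress {
    /// The backend rewound to what it believes is the function start.
    Rewound(usize),
    /// The backend just returned ip; no function start is known.
    SameAsIp,
}

/// Compares the two values a frame reports and classifies the result.
fn classify(ip: usize, symbol_address: usize) -> SymbolAddress {
    if symbol_address == ip {
        SymbolAddress::SameAsIp
    } else {
        SymbolAddress::Rewound(symbol_address)
    }
}

fn main() {
    // Made-up addresses standing in for values read from a frame.
    println!("{:?}", classify(0x2010, 0x2000));
    println!("{:?}", classify(0x2010, 0x2010));
}
```

This only works if a backend never returns a wrong guess, which is the argument made above for returning ip rather than an unreliable address.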

Regarding doing the table lookup implicitly, I don't think it should be completely off the (heh) table, but I'm slightly concerned about, and would like to hear an elaboration on, @philipc's perspective, namely:

It doesn't and shouldn't use DWARF or symbol table entries. You need to resolve the frame if you want to use those.

I can guess why this was said, but it's likely there's a nuance that hasn't been stated explicitly and that might be missing from the conversation so far.

I don't see any technical reason why the unwinder couldn't use the symbol table, but from a design perspective, this is something that the resolver is intended to do and already has code for, so I don't think it should be duplicated in the unwinder. I haven't seen a reason why the resolver can't be used in this case, but I haven't looked into the motivating use case (scoped-trace) at all.

While I think the resolver should be used for this purpose, I don't think it works correctly currently. Symbol::addr is documented to return the starting address of the function, and appears to do this for dbghelp, but it returns the unrelocated IP minus one for DWARF, and None for symbol tables.

Returning None seems okay, at least, in the sense that it's useless but not wrong. But the DWARF response seems simply incorrect.

This issue is no longer about Frame::symbol_address, which should probably remain untouched. Rather, it is about having a function that answers the desired use-case at all, is correct across platforms, and tries its alternatives until it succeeds or fails.
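
One possible shape for such a function is a chain of strategies tried in order until one yields an answer. The sketch below is only an assumption about how that could look: `function_start`, `Strategy`, and the three stand-in strategies are hypothetical, and the stand-ins return canned values instead of consulting a real unwinder, debuginfo, or symbol table.

```rust
/// One strategy for finding the start of the function containing `ip`.
type Strategy = fn(usize) -> Option<usize>;

/// Tries each strategy in order and returns the first answer, if any.
fn function_start(ip: usize, strategies: &[Strategy]) -> Option<usize> {
    strategies.iter().find_map(|strategy| strategy(ip))
}

// Hypothetical stand-ins for real lookups.
fn from_unwinder(_ip: usize) -> Option<usize> {
    None // e.g. a platform where the unwinder cannot answer reliably
}

fn from_debuginfo(_ip: usize) -> Option<usize> {
    None // e.g. no DWARF debuginfo available
}

fn from_symbol_table(ip: usize) -> Option<usize> {
    // Pretend every function starts on a 4 KiB boundary.
    Some(ip & !0xfff)
}

fn main() {
    let strategies: [Strategy; 3] = [from_unwinder, from_debuginfo, from_symbol_table];
    match function_start(0x2010, &strategies) {
        Some(start) => println!("function starts at {:#x}", start),
        None => println!("no strategy could find the function start"),
    }
}
```

Returning `Option` keeps the "useless but not wrong" property discussed earlier: if every strategy fails, the caller gets `None` rather than a guess.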

Yes, that sounds great. Again, for context: In the scoped-trace crate, I use symbol address equality to capture backtraces with limited upper and lower unwinding bounds. So I'd like to be able to call this function without doing full symbol resolution, or in situations where only symbol tables are available and not full DWARF debuginfo.
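
For readers unfamiliar with the use case, the idea of bounding a backtrace by symbol address equality can be sketched roughly as follows. This is not taken from scoped-trace: the `bounded` function is hypothetical, and each frame is assumed to be represented by the start address of its function, ordered innermost first.

```rust
/// Returns the frames strictly between the `inner` and `outer` bounds.
/// `frames` holds one function start address per frame, innermost first.
/// A bound that is not found leaves that side of the trace unbounded.
fn bounded(frames: &[usize], inner: usize, outer: usize) -> &[usize] {
    // Skip everything up to and including the inner bound.
    let start = frames
        .iter()
        .position(|&f| f == inner)
        .map_or(0, |i| i + 1);
    // Stop just before the outer bound.
    let end = frames[start..]
        .iter()
        .position(|&f| f == outer)
        .map_or(frames.len(), |i| start + i);
    &frames[start..end]
}

fn main() {
    // Made-up function start addresses, innermost frame first.
    let frames = [0x10, 0x20, 0x30, 0x40, 0x50];
    println!("{:x?}", bounded(&frames, 0x20, 0x50));
}
```

If a platform reports ip instead of the function start, neither bound ever compares equal and the whole trace is returned, which matches the failure described at the top of this issue.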