Fetching structs from compiled C
rtkaratekid opened this issue · comments
Hi! This is more of a question than an issue, so I'm happy to close this if requested.
I'm looking for ways to fetch all structs from an ELF file with DWARF symbols (well, one specific struct per ELF file; maybe I can put those structs into their own ELF section to make it easier, or maybe I'll just try to find the one I need by some sort of naming convention).
I've successfully done this using pahole (pahole -S <elf file>), but I want to try to bake this functionality directly into my software. I'm pretty new to parsing ELF and DWARF, though. Is this library suited to something like this, or should I be spending my time looking elsewhere? I looked at the examples and wasn't able to fetch struct information with any of them.
Thanks in advance for your consideration.
Edit
Here's an example of pahole output that demonstrates roughly what I'm looking to do:
root@client:/vagrant/bpf-library/bpf_objs# pahole open -C op_data_t
struct op_data_t {
    u32  id;          /*   0    4 */
    u32  pid;         /*   4    4 */
    u32  tgid;        /*   8    4 */
    char comm[16];    /*  12   16 */
    char file[255];   /*  28  255 */
    /* --- cacheline 4 boundary (256 bytes) was 27 bytes ago --- */
    u32  namespace;   /* 283    4 */
    u64  time;        /* 287    8 */

    /* size: 295, cachelines: 5, members: 7 */
    /* last cacheline: 39 bytes */
} __attribute__((__packed__));
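For reference, the offsets and the total size in the pahole output above follow directly from the packed layout: each member starts where the previous one ends, with no padding. Below is a minimal Rust sketch that recomputes them. The `Member` type and the `layout` and `op_data_members` functions are hypothetical names for illustration only (they are not part of gimli or ddbug), and the member sizes are copied from the pahole output.

```rust
/// One struct member: its name and size in bytes (copied from the pahole output).
struct Member {
    name: &'static str,
    size: usize,
}

/// Computes (name, offset, size) for each member of a packed struct,
/// and returns the rows together with the total struct size.
fn layout(members: &[Member]) -> (Vec<(&'static str, usize, usize)>, usize) {
    let mut offset = 0;
    let mut rows = Vec::new();
    for m in members {
        rows.push((m.name, offset, m.size));
        offset += m.size; // packed: no padding between members
    }
    (rows, offset)
}

/// The members of op_data_t, in declaration order.
fn op_data_members() -> Vec<Member> {
    vec![
        Member { name: "id", size: 4 },
        Member { name: "pid", size: 4 },
        Member { name: "tgid", size: 4 },
        Member { name: "comm", size: 16 },
        Member { name: "file", size: 255 },
        Member { name: "namespace", size: 4 },
        Member { name: "time", size: 8 },
    ]
}

fn main() {
    let (rows, total) = layout(&op_data_members());
    for (name, offset, size) in &rows {
        println!("{:<10} offset {:>3} size {:>3}", name, offset, size);
    }
    println!("size: {}", total);
}
```

A real implementation would read the member offsets from the DWARF data rather than summing sizes, since non-packed structs contain padding.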
Have you seen https://github.com/gimli-rs/ddbug? It uses gimli to among other things print the exact layout of types.
Thanks for the quick response! I took ddbug and tested it, but I'm getting a lot of weird output. I wonder if it's because I'm targeting files that have been compiled by clang instead of gcc? I have this issue with both ddbug and gimli. It starts with warnings about unsupported relocations (only in ddbug, although maybe that also affects gimli) and then puts clang versions all over the place. Here's a sample:
Unsupported relocation for section .debug_info at offset 0x000070c4
Unsupported relocation for section .debug_info at offset 0x000070d0
Unsupported relocation for section .debug_info at offset 0x000070db
Unsupported relocation for section .debug_info at offset 0x000070e6
Unsupported relocation for section .debug_info at offset 0x000070f2
Unsupported relocation for section .debug_info at offset 0x000070fe
Unsupported relocation for section .debug_info at offset 0x00007109
Unsupported relocation for section .debug_info at offset 0x00007114
...
> snip
...
struct clang version 10.0.0-4ubuntu1
size: 20
members:
0[4] clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
4[4] clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
8[4] clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
12[4] clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
16[4] clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
I couldn't figure out why this is, so I wrote a short C program to sanity check, compiled it with gcc -g and clang -g, and then ran ddbug a.out and tested the gimli example programs.
I was able to fetch everything without the weird clang spam, so it must be something about how my target files are being built that is preventing me from dumping this information. This may or may not be out of the scope of this library, but it's interesting to try and understand. Here are the flags I'm using from my Makefile:
COMPILERFLAGS := -D__KERNEL__ -D__ASM_SYSREG_H \
-D__BPF_TRACING__ -D__TARGET_ARCH_$(ARCH) \
-Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member \
-Wno-tautological-compare \
-Wno-unknown-warning-option \
The context is that I'm writing BPF programs and trying to pull their member structs out of them before loading them into the kernel. I want to do this by parsing the compiled file, but if this clang thing is an issue I may have to stick with some sort of config file solution instead.
Edit
Just to mention, I'm also using -O3 -emit-llvm -g -c as compiler flags. So yes, the debug info should be included. We do the linking separately. I don't think that would cause the issue, but I'm pretty new to the details of debugging symbols.
> Unsupported relocation for section .debug_info at offset 0x000070c4
> The context of this is that I'm writing BPF programs
We'll need to add support for BPF relocations to the object crate.
Thanks for commenting. I'm actually happy to do that, but it might take someone who has the expertise a lot less time than me.
I should be able to do that later today.
Can you show me the readelf -r output for .debug_info in your file? I'm getting this:
Relocation section '.rel.debug_info' at offset 0x380 contains 9 entries:
Offset Info Type Sym. Value Sym. Name
000000000006 00080000000a R_BPF_INSN_DISP32 0000000000000000 .debug_abbrev
00000000000c 00020000000a R_BPF_INSN_DISP32 0000000000000000 <null>
000000000012 00030000000a R_BPF_INSN_DISP32 000000000000001f <null>
000000000016 000a0000000a R_BPF_INSN_DISP32 0000000000000000 .debug_line
00000000001a 00040000000a R_BPF_INSN_DISP32 0000000000000025 <null>
00000000001e 000700000001 R_BPF_INSN_64 0000000000000000 .text
00000000002b 000700000001 R_BPF_INSN_64 0000000000000000 .text
000000000039 00050000000a R_BPF_INSN_DISP32 0000000000000044 <null>
000000000044 00060000000a R_BPF_INSN_DISP32 0000000000000049 <null>
This looks weird to me, because R_BPF_INSN_DISP32 seems to be a PC-relative relocation, but that doesn't make much sense in debuginfo. Additionally, many of them are missing the associated section (probably .debug_str). For comparison, here's the same source compiled for x86-64:
Relocation section '.rela.debug_info' at offset 0x250 contains 9 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000006 00040000000a R_X86_64_32 0000000000000000 .debug_abbrev + 0
00000000000c 00030000000a R_X86_64_32 0000000000000000 .debug_str + 0
000000000012 00030000000a R_X86_64_32 0000000000000000 .debug_str + 1f
000000000016 00050000000a R_X86_64_32 0000000000000000 .debug_line + 0
00000000001a 00030000000a R_X86_64_32 0000000000000000 .debug_str + 25
00000000001e 000200000001 R_X86_64_64 0000000000000000 .text + 0
00000000002b 000200000001 R_X86_64_64 0000000000000000 .text + 0
000000000039 00030000000a R_X86_64_32 0000000000000000 .debug_str + 44
000000000044 00030000000a R_X86_64_32 0000000000000000 .debug_str + 49
I could hack this to treat R_BPF_INSN_DISP32 like R_BPF_DATA_32 instead, but it seems strange and I might be missing something. Similarly, I would expect R_BPF_INSN_64 to be R_BPF_DATA_64 instead.
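As an aside for readers comparing the two dumps: the Info column packs the symbol index and the relocation type into one 64-bit value, and the name readelf prints is only its interpretation of the type number. The sketch below splits the value the same way the ELF64_R_SYM and ELF64_R_TYPE macros in <elf.h> do. `split_r_info` is an illustrative name, not a function from any crate, and the sample values are copied from the dumps above.

```rust
/// Splits an ELF64 r_info field into (symbol index, relocation type).
/// Equivalent to ELF64_R_SYM(info) and ELF64_R_TYPE(info).
fn split_r_info(info: u64) -> (u32, u32) {
    ((info >> 32) as u32, (info & 0xffff_ffff) as u32)
}

fn main() {
    // Info values copied from the readelf output for the BPF and x86-64 objects.
    let samples: [(&str, u64); 4] = [
        ("bpf    .debug_abbrev", 0x0008_0000_000a),
        ("bpf    .text", 0x0007_0000_0001),
        ("x86-64 .debug_abbrev", 0x0004_0000_000a),
        ("x86-64 .text", 0x0002_0000_0001),
    ];
    for &(label, info) in samples.iter() {
        let (sym, ty) = split_r_info(info);
        println!("{:<22} symbol index {:>2}, type {:>2}", label, sym, ty);
    }
}
```

In these samples the numeric types are 10 and 1 in both files. Relocation type numbers are defined per architecture, so the names attached to them depend on the tables the tool was built with.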
Looks like the relocation definitions changed and my readelf wasn't updated yet. gimli-rs/object#279 should fix the relocation support. You'll need to patch parser/Cargo.toml in ddbug to use it.
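To illustrate what applying such a relocation involves, here is a generic sketch, not the code from the object crate. It assumes a 32-bit absolute relocation (value = S + A) in a REL-style section, where the addend A is stored in the section bytes at the relocation offset, and it assumes little-endian data. The function name `apply_abs32_rel` and the sample values are hypothetical.

```rust
/// Applies a 32-bit absolute relocation (S + A) in place.
/// `symbol_value` is S. The implicit addend A is read from the section
/// itself, as is the convention for REL (as opposed to RELA) sections.
/// Returns None if the 4-byte field does not fit inside the section.
fn apply_abs32_rel(section: &mut [u8], offset: usize, symbol_value: u32) -> Option<()> {
    let end = offset.checked_add(4)?;
    let field = section.get_mut(offset..end)?;
    // Read the implicit addend (little-endian is an assumption here).
    let addend = u32::from_le_bytes([field[0], field[1], field[2], field[3]]);
    let value = symbol_value.wrapping_add(addend);
    field.copy_from_slice(&value.to_le_bytes());
    Some(())
}

fn main() {
    // Hypothetical section contents: all zeros, so the implicit addend is 0.
    let mut section = vec![0u8; 0x20];
    // Hypothetical relocation at offset 0x12 against a symbol with value 0x1f.
    match apply_abs32_rel(&mut section, 0x12, 0x1f) {
        Some(()) => println!("patched bytes: {:02x?}", &section[0x12..0x16]),
        None => println!("relocation offset out of bounds"),
    }
}
```

A parser that does not recognize a relocation type cannot safely do this patching, which is one plausible reason for the "Unsupported relocation" warnings and for offsets that all resolve to the same string.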
Thanks for the support! I'll be testing out the patched ddbug a bit later today. In the meantime, here's the .rel.debug_info section from readelf -r. It might not be relevant now that you've added a fix, but I thought I'd at least document that my output is similar to yours above. I'm going to snip it for brevity, but let me know if you want the whole section.
Relocation section '.rel.debug_info' at offset 0x19258 contains 2008 entries:
Offset Info Type Sym. Value Sym. Name
000000000006 06b70000000a R_BPF_INSN_DISP32 0000000000000000 .debug_abbrev
00000000000c 00020000000a R_BPF_INSN_DISP32 0000000000000000 <null>
000000000012 00030000000a R_BPF_INSN_DISP32 000000000000001f <null>
000000000016 06b90000000a R_BPF_INSN_DISP32 0000000000000000 .debug_line
00000000001a 00040000000a R_BPF_INSN_DISP32 000000000000002f <null>
000000000026 06b80000000a R_BPF_INSN_DISP32 0000000000000000 .debug_ranges
00000000002b 00050000000a R_BPF_INSN_DISP32 0000000000000044 <null>
000000000037 06ba00000001 R_BPF_INSN_64 0000000000000000 GUID_OPENSNOOP
000000000044 00080000000a R_BPF_INSN_DISP32 0000000000000066 <null>
00000000004f 00070000000a R_BPF_INSN_DISP32 0000000000000060 <null>
000000000056 00060000000a R_BPF_INSN_DISP32 0000000000000053 <null>
00000000005d 00090000000a R_BPF_INSN_DISP32 000000000000006a <null>
000000000069 06bd00000001 R_BPF_INSN_64 0000000000000000 opensnoop_events
000000000072 000f0000000a R_BPF_INSN_DISP32 00000000000000aa <null>
00000000007a 000a0000000a R_BPF_INSN_DISP32 000000000000007b <null>
000000000086 000b0000000a R_BPF_INSN_DISP32 0000000000000080 <null>
000000000092 000c0000000a R_BPF_INSN_DISP32 0000000000000089 <null>
00000000009e 000d0000000a R_BPF_INSN_DISP32 0000000000000094 <null>
0000000000aa 000e0000000a R_BPF_INSN_DISP32 00000000000000a0 <null>
0000000000b7 00100000000a R_BPF_INSN_DISP32 00000000000000b6 <null>
>snip
Tested out ddbug on a pretty standard BPF program with the new object patch (thanks philipc!). When compiled in debug mode it panics (I can include the full trace if needed).
$ RUST_BACKTRACE=1 ./ddbug /vagrant/bpf-library/bpf_objs/opensnoop.bpf.o
thread 'main' panicked at 'attempt to subtract with overflow', main/src/print/file.rs:151:34
stack backtrace:
0: rust_begin_unwind
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:495:5
1: core::panicking::panic_fmt
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/panicking.rs:92:14
2: core::panicking::panic
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/panicking.rs:50:5
3: ddbug::print::file::print::{{closure}}
at /home/vagrant/ddbug/main/src/print/file.rs:151:34
4: ddbug::print::PrintState::indent_impl::{{closure}}
at /home/vagrant/ddbug/main/src/print/mod.rs:150:13
5: <ddbug::print::text::TextPrinter as ddbug::print::Printer>::indent_body
at /home/vagrant/ddbug/main/src/print/text.rs:102:9
6: ddbug::print::PrintState::indent_impl
at /home/vagrant/ddbug/main/src/print/mod.rs:148:9
7: ddbug::print::PrintState::collapsed
at /home/vagrant/ddbug/main/src/print/mod.rs:171:9
8: ddbug::print::file::print
at /home/vagrant/ddbug/main/src/print/file.rs:139:9
9: ddbug::print_file::{{closure}}
at /home/vagrant/ddbug/main/src/main.rs:437:31
10: ddbug::format
at /home/vagrant/ddbug/main/src/main.rs:453:9
11: ddbug::print_file
at /home/vagrant/ddbug/main/src/main.rs:437:5
12: ddbug::main::{{closure}}
at /home/vagrant/ddbug/main/src/main.rs:417:57
13: ddbug_parser::file::File::parse_object::{{closure}}
at /home/vagrant/ddbug/parser/src/file/mod.rs:271:13
14: ddbug_parser::file::dwarf::parse
at /home/vagrant/ddbug/parser/src/file/dwarf.rs:453:5
15: ddbug_parser::file::File::parse_object
at /home/vagrant/ddbug/parser/src/file/mod.rs:259:9
16: ddbug_parser::file::File::parse
at /home/vagrant/ddbug/parser/src/file/mod.rs:157:9
17: ddbug::main
at /home/vagrant/ddbug/main/src/main.rs:417:25
18: core::ops::function::FnOnce::call_once
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
However, when compiled with --release it works as expected, which is fantastic. If I can figure out why the debug build panics, I'll try to fix it and submit a patch (unless someone beats me to it, of course). Otherwise I think this issue is resolved, and I want to express my gratitude for the help!
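The difference between the two builds is consistent with how Rust handles integer overflow: Cargo's default debug profile enables overflow checks, so an out-of-range subtraction panics, while the default release profile disables them and the result wraps silently. The sketch below shows the explicit alternatives available on unsigned integers. The `gap_*` functions are hypothetical and are not taken from ddbug's source.

```rust
/// Hypothetical size calculation: bytes between the end of one item and
/// the start of the next. Returns None when the inputs are inconsistent.
fn gap_checked(next_start: u64, prev_end: u64) -> Option<u64> {
    next_start.checked_sub(prev_end)
}

/// Same calculation, clamped to zero instead of failing.
fn gap_saturating(next_start: u64, prev_end: u64) -> u64 {
    next_start.saturating_sub(prev_end)
}

/// Same calculation with explicit wrapping, which is what a plain `-`
/// does when overflow checks are disabled.
fn gap_wrapping(next_start: u64, prev_end: u64) -> u64 {
    next_start.wrapping_sub(prev_end)
}

fn main() {
    // A consistent pair and an inconsistent pair (start before previous end).
    for &(next_start, prev_end) in [(10u64, 4u64), (4, 10)].iter() {
        println!(
            "start={} end={} checked={:?} saturating={} wrapping={}",
            next_start,
            prev_end,
            gap_checked(next_start, prev_end),
            gap_saturating(next_start, prev_end),
            gap_wrapping(next_start, prev_end),
        );
    }
}
```

For informational numbers that come from heuristics, clamping or returning `None` is usually a safer choice than relying on wrapping behavior that differs between build profiles.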
Looks like llvm-readobj is probably better than readelf, but even that isn't great for viewing these relocations, because there appears to be an unnamed symbol for every string in .debug_str. That is still strange, but at least things work...
I pushed a ddbug commit to fix that overflow. It's in some informational size calculations that rely on heuristics, so they aren't quite right sometimes, but you probably don't care about them. Be aware that ddbug is still experimental and not well maintained.