gimli-rs / gimli

A library for reading and writing the DWARF debugging format

Home Page: https://docs.rs/gimli/

Fetching structs from compiled C

rtkaratekid opened this issue

Hi! This is more of a question than an issue, so I'm happy to close this if requested.

I'm looking for a way to fetch all structs from an ELF file that contains DWARF debug info. (Really just one specific struct per ELF file: maybe I can put those structs into their own ELF section to make it easier, or maybe I'll just find the one I need by some sort of naming convention.)

I've successfully done this using pahole (pahole -S <elf file>), but I want to bake this functionality directly into my software. I'm pretty new to parsing ELF and DWARF, though. Is this library suited to something like this, or should I be spending my time looking elsewhere? I looked at the examples and wasn't able to fetch struct information with any of them.

Thanks ahead for the consideration.

Edit
Here's an example of pahole output that demonstrates roughly what I'm looking to do:

root@client:/vagrant/bpf-library/bpf_objs# pahole open -C op_data_t
struct op_data_t {
	u32                        id;                   /*     0     4 */
	u32                        pid;                  /*     4     4 */
	u32                        tgid;                 /*     8     4 */
	char                       comm[16];             /*    12    16 */
	char                       file[255];            /*    28   255 */
	/* --- cacheline 4 boundary (256 bytes) was 27 bytes ago --- */
	u32                        namespace;            /*   283     4 */
	u64                        time;                 /*   287     8 */

	/* size: 295, cachelines: 5, members: 7 */
	/* last cacheline: 39 bytes */
} __attribute__((__packed__));
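As a cross-check on the offsets pahole reports above, here is a minimal Rust sketch, assuming the C types map to `u32`, `u64`, and `u8` arrays. `OpData`, `packed_layout`, and `op_data_members` are hypothetical names made up for illustration; they are not part of gimli, ddbug, or pahole. In a packed struct each member starts exactly where the previous one ends, so the offsets are just a running sum of the member sizes.

```rust
use std::mem::size_of;

/// Hypothetical Rust mirror of `op_data_t` (an assumption, not generated code).
/// `repr(C, packed)` removes all padding, like `__attribute__((__packed__))`.
#[allow(dead_code)]
#[repr(C, packed)]
struct OpData {
    id: u32,
    pid: u32,
    tgid: u32,
    comm: [u8; 16],
    file: [u8; 255],
    namespace: u32,
    time: u64,
}

/// Compute (name, offset, size) for each member of a packed struct,
/// plus the total size. With no padding, offsets are a running sum.
fn packed_layout(
    members: &[(&'static str, usize)],
) -> (Vec<(&'static str, usize, usize)>, usize) {
    let mut offset = 0;
    let mut layout = Vec::new();
    for &(name, size) in members {
        layout.push((name, offset, size));
        offset += size;
    }
    (layout, offset)
}

/// Member names and sizes in declaration order, taken from the pahole listing.
fn op_data_members() -> Vec<(&'static str, usize)> {
    vec![
        ("id", size_of::<u32>()),
        ("pid", size_of::<u32>()),
        ("tgid", size_of::<u32>()),
        ("comm", size_of::<[u8; 16]>()),
        ("file", size_of::<[u8; 255]>()),
        ("namespace", size_of::<u32>()),
        ("time", size_of::<u64>()),
    ]
}
```

Calling `packed_layout(&op_data_members())` gives the same offset and size pairs as the comments in the pahole listing, and the total matches `size_of::<OpData>()`. A DWARF-based tool does not need to compute any of this, since the producer records each member's offset in the debug info; the sketch only shows what the numbers mean.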

Have you seen https://github.com/gimli-rs/ddbug? It uses gimli to among other things print the exact layout of types.

Thanks for the quick response! I took ddbug and tested it, but I'm getting a lot of weird output. I wonder if it's because I'm targeting files that were compiled by clang instead of gcc? I have this issue with both ddbug and gimli. It starts with warnings about unsupported relocations (only ddbug prints them, although maybe the same problem also affects gimli) and then puts clang version strings all over the place. Here's a sample:

Unsupported relocation for section .debug_info at offset 0x000070c4
Unsupported relocation for section .debug_info at offset 0x000070d0
Unsupported relocation for section .debug_info at offset 0x000070db
Unsupported relocation for section .debug_info at offset 0x000070e6
Unsupported relocation for section .debug_info at offset 0x000070f2
Unsupported relocation for section .debug_info at offset 0x000070fe
Unsupported relocation for section .debug_info at offset 0x00007109
Unsupported relocation for section .debug_info at offset 0x00007114
...

> snip

...
struct clang version 10.0.0-4ubuntu1
	size: 20
	members:
		0[4]	clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
		4[4]	clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
		8[4]	clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
		12[4]	clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1
		16[4]	clang version 10.0.0-4ubuntu1 : clang version 10.0.0-4ubuntu1

I couldn't figure out why this is, so I wrote a short C program to sanity check, compiled with gcc -g and clang -g, and then did ddbug a.out and tested the gimli example programs.
I was able to fetch everything without the weird clang spam, so it must be something about how my target files are being built that is preventing me from dumping this information. This may or may not be out of the scope of this library, but it's interesting to try and understand. Here are the flags I'm using from my Makefile:

COMPILERFLAGS := -D__KERNEL__ -D__ASM_SYSREG_H \
                                -D__BPF_TRACING__ -D__TARGET_ARCH_$(ARCH) \
                                -Wno-unused-value -Wno-pointer-sign \
                                -Wno-compare-distinct-pointer-types \
                                -Wno-gnu-variable-sized-type-not-at-end \
                                -Wno-address-of-packed-member \
                                -Wno-tautological-compare \
                                -Wno-unknown-warning-option \

The context of this is that I'm writing BPF programs and trying to pull their member structs out of them prior to loading them into the kernel. I want to do this by parsing the compiled file, but if this clang thing is an issue I may have to stick with some sort of config file solution instead.

Edit
Just to mention, I'm also using -O3 -emit-llvm -g -c as compiler flags, so the debug info should be included. We do the linking separately. I don't think that would cause the issue, but I'm pretty much a novice when it comes to the details of debugging symbols.

> Unsupported relocation for section .debug_info at offset 0x000070c4
> The context of this is that I'm writing BPF programs

We'll need to add support for BPF relocations to the object crate.

Thanks for commenting. I'm actually happy to do that, but it might take someone who has the expertise a lot less time than me.

I should be able to do that later today.

Can you show me the readelf -r output for .debug_info in your file? I'm getting this:

Relocation section '.rel.debug_info' at offset 0x380 contains 9 entries:
  Offset          Info           Type           Sym. Value    Sym. Name
000000000006  00080000000a R_BPF_INSN_DISP32 0000000000000000 .debug_abbrev
00000000000c  00020000000a R_BPF_INSN_DISP32 0000000000000000 <null>
000000000012  00030000000a R_BPF_INSN_DISP32 000000000000001f <null>
000000000016  000a0000000a R_BPF_INSN_DISP32 0000000000000000 .debug_line
00000000001a  00040000000a R_BPF_INSN_DISP32 0000000000000025 <null>
00000000001e  000700000001 R_BPF_INSN_64     0000000000000000 .text
00000000002b  000700000001 R_BPF_INSN_64     0000000000000000 .text
000000000039  00050000000a R_BPF_INSN_DISP32 0000000000000044 <null>
000000000044  00060000000a R_BPF_INSN_DISP32 0000000000000049 <null>

This looks weird to me, because R_BPF_INSN_DISP32 seems to be a PC-relative relocation, but that doesn't make much sense in debuginfo. Additionally, many of them are missing the associated section (probably .debug_str). For comparison, here's the same source compiled for x86-64:

Relocation section '.rela.debug_info' at offset 0x250 contains 9 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000006  00040000000a R_X86_64_32       0000000000000000 .debug_abbrev + 0
00000000000c  00030000000a R_X86_64_32       0000000000000000 .debug_str + 0
000000000012  00030000000a R_X86_64_32       0000000000000000 .debug_str + 1f
000000000016  00050000000a R_X86_64_32       0000000000000000 .debug_line + 0
00000000001a  00030000000a R_X86_64_32       0000000000000000 .debug_str + 25
00000000001e  000200000001 R_X86_64_64       0000000000000000 .text + 0
00000000002b  000200000001 R_X86_64_64       0000000000000000 .text + 0
000000000039  00030000000a R_X86_64_32       0000000000000000 .debug_str + 44
000000000044  00030000000a R_X86_64_32       0000000000000000 .debug_str + 49

I could hack this to treat R_BPF_INSN_DISP32 like R_BPF_DATA_32 instead, but it seems strange and I might be missing something. Similarly I would expect R_BPF_INSN_64 to be R_BPF_DATA_64 instead.
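To compare the two readelf listings above without relying on the relocation names readelf prints, the Info column can be decoded by hand. In ELF64 the upper 32 bits of `r_info` hold the symbol index and the lower 32 bits hold the relocation type. A minimal sketch (`split_r_info` and `parse_info` are hypothetical helper names, not functions from the object crate):

```rust
/// Split an ELF64 `r_info` value into (symbol index, relocation type),
/// mirroring the ELF64_R_SYM and ELF64_R_TYPE macros from the ELF spec.
fn split_r_info(r_info: u64) -> (u32, u32) {
    ((r_info >> 32) as u32, (r_info & 0xffff_ffff) as u32)
}

/// Parse the hexadecimal "Info" column as printed by `readelf -r`.
/// Returns None if the column is not valid hex.
fn parse_info(column: &str) -> Option<(u32, u32)> {
    u64::from_str_radix(column, 16).ok().map(split_r_info)
}
```

For example, `parse_info("00080000000a")` (the first BPF entry) yields symbol 8 and type 10, and `parse_info("00040000000a")` (the first x86-64 entry) yields symbol 4 and type 10. The numeric types are the same in both listings (10 and 1); only the names readelf assigns to them differ. Decoding the BPF entries shown as `<null>` also shows that each one references a different symbol index (2, 3, 4, ...), whereas the x86-64 `.debug_str` entries all reference symbol 3.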

Looks like the relocation definitions changed and my readelf wasn't updated yet. gimli-rs/object#279 should fix the relocation support. You'll need to patch parser/Cargo.toml in ddbug to use it.

Thanks for the support! I'll be testing out the patched ddbug a bit later today. In the meantime, here's the .rel.debug_info section from readelf -r. It might not be relevant now that you've added a fix, but I thought I'd at least document that my output is similar to what you posted above. I'm going to snip it for brevity, but let me know if you want the whole section.

Relocation section '.rel.debug_info' at offset 0x19258 contains 2008 entries:
  Offset          Info           Type           Sym. Value    Sym. Name
000000000006  06b70000000a R_BPF_INSN_DISP32 0000000000000000 .debug_abbrev
00000000000c  00020000000a R_BPF_INSN_DISP32 0000000000000000 <null>
000000000012  00030000000a R_BPF_INSN_DISP32 000000000000001f <null>
000000000016  06b90000000a R_BPF_INSN_DISP32 0000000000000000 .debug_line
00000000001a  00040000000a R_BPF_INSN_DISP32 000000000000002f <null>
000000000026  06b80000000a R_BPF_INSN_DISP32 0000000000000000 .debug_ranges
00000000002b  00050000000a R_BPF_INSN_DISP32 0000000000000044 <null>
000000000037  06ba00000001 R_BPF_INSN_64     0000000000000000 GUID_OPENSNOOP
000000000044  00080000000a R_BPF_INSN_DISP32 0000000000000066 <null>
00000000004f  00070000000a R_BPF_INSN_DISP32 0000000000000060 <null>
000000000056  00060000000a R_BPF_INSN_DISP32 0000000000000053 <null>
00000000005d  00090000000a R_BPF_INSN_DISP32 000000000000006a <null>
000000000069  06bd00000001 R_BPF_INSN_64     0000000000000000 opensnoop_events
000000000072  000f0000000a R_BPF_INSN_DISP32 00000000000000aa <null>
00000000007a  000a0000000a R_BPF_INSN_DISP32 000000000000007b <null>
000000000086  000b0000000a R_BPF_INSN_DISP32 0000000000000080 <null>
000000000092  000c0000000a R_BPF_INSN_DISP32 0000000000000089 <null>
00000000009e  000d0000000a R_BPF_INSN_DISP32 0000000000000094 <null>
0000000000aa  000e0000000a R_BPF_INSN_DISP32 00000000000000a0 <null>
0000000000b7  00100000000a R_BPF_INSN_DISP32 00000000000000b6 <null>
>snip

Tested out ddbug on a pretty standard BPF program with the new object patch (thanks philipc!). When compiled in debug mode it panics (can include the full trace if needed).

$ RUST_BACKTRACE=1 ./ddbug /vagrant/bpf-library/bpf_objs/opensnoop.bpf.o
thread 'main' panicked at 'attempt to subtract with overflow', main/src/print/file.rs:151:34
stack backtrace:
   0: rust_begin_unwind
             at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:495:5
   1: core::panicking::panic_fmt
             at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/panicking.rs:92:14
   2: core::panicking::panic
             at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/panicking.rs:50:5
   3: ddbug::print::file::print::{{closure}}
             at /home/vagrant/ddbug/main/src/print/file.rs:151:34
   4: ddbug::print::PrintState::indent_impl::{{closure}}
             at /home/vagrant/ddbug/main/src/print/mod.rs:150:13
   5: <ddbug::print::text::TextPrinter as ddbug::print::Printer>::indent_body
             at /home/vagrant/ddbug/main/src/print/text.rs:102:9
   6: ddbug::print::PrintState::indent_impl
             at /home/vagrant/ddbug/main/src/print/mod.rs:148:9
   7: ddbug::print::PrintState::collapsed
             at /home/vagrant/ddbug/main/src/print/mod.rs:171:9
   8: ddbug::print::file::print
             at /home/vagrant/ddbug/main/src/print/file.rs:139:9
   9: ddbug::print_file::{{closure}}
             at /home/vagrant/ddbug/main/src/main.rs:437:31
  10: ddbug::format
             at /home/vagrant/ddbug/main/src/main.rs:453:9
  11: ddbug::print_file
             at /home/vagrant/ddbug/main/src/main.rs:437:5
  12: ddbug::main::{{closure}}
             at /home/vagrant/ddbug/main/src/main.rs:417:57
  13: ddbug_parser::file::File::parse_object::{{closure}}
             at /home/vagrant/ddbug/parser/src/file/mod.rs:271:13
  14: ddbug_parser::file::dwarf::parse
             at /home/vagrant/ddbug/parser/src/file/dwarf.rs:453:5
  15: ddbug_parser::file::File::parse_object
             at /home/vagrant/ddbug/parser/src/file/mod.rs:259:9
  16: ddbug_parser::file::File::parse
             at /home/vagrant/ddbug/parser/src/file/mod.rs:157:9
  17: ddbug::main
             at /home/vagrant/ddbug/main/src/main.rs:417:25
  18: core::ops::function::FnOnce::call_once
             at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

However, when compiled with --release it works as expected, which is fantastic. If I can figure out why the debug build panics I'll try to fix and submit a patch (of course unless I'm beat to the punch). Otherwise I think this issue is resolved and want to express my gratitude for the help!
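The difference between the debug and release builds is consistent with how Rust treats integer overflow: a plain `a - b` on unsigned integers panics when overflow checks are enabled (the default for debug builds) and wraps around when they are disabled (the default for release builds). The sketch below illustrates the options with hypothetical helpers; it is not the code at main/src/print/file.rs:151 and makes no claim about how ddbug fixes it.

```rust
/// Hypothetical helper: gap in bytes between the end of one item and the
/// start of the next. Returns None instead of panicking when the inputs
/// are inconsistent (next_start < prev_end).
fn gap_checked(next_start: u64, prev_end: u64) -> Option<u64> {
    next_start.checked_sub(prev_end)
}

/// Same calculation, but clamps to zero on underflow. This can be a
/// reasonable choice for informational values that are only heuristics.
fn gap_saturating(next_start: u64, prev_end: u64) -> u64 {
    next_start.saturating_sub(prev_end)
}

/// Same calculation with explicit wraparound, which is what a plain `-`
/// produces when overflow checks are disabled.
fn gap_wrapping(next_start: u64, prev_end: u64) -> u64 {
    next_start.wrapping_sub(prev_end)
}
```

With `next_start = 4` and `prev_end = 8`, `gap_checked` returns `None`, `gap_saturating` returns `0`, and `gap_wrapping` returns `u64::MAX - 3`. That last value is the kind of silently wrong number a release build can carry along without any panic.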

Looks like llvm-readobj is probably better than readelf here, but even that isn't great for viewing these relocations, because there appears to be an unnamed symbol for every string in .debug_str. That is still strange, but at least things work...

I pushed a ddbug commit to fix that overflow. It's in some informational size calculations that need to use heuristics so they aren't quite right sometimes, but you probably don't care about them. Be aware that ddbug is still experimental and not well maintained.