mozilla / dump_syms

Rewrite of breakpad dump_syms tools in Rust

Don't emit PUBLIC symbols for areas covered by FUNC symbols when dumping ELF files

gabrielesvelto opened this issue

I was looking at a libxul.so dump of a GeckoView ARM build and stumbled into a number of entries that looked like this one:

FUNC 79b94a8 58 0 mdb_env_cthr_toggle
79b94a8 8 9135 16162
79b94b0 e 9136 16162
79b94be 2 9137 16162
79b94c0 4 9138 16162
79b94c4 4 9137 16162
79b94c8 6 9138 16162
79b94ce 8 9139 16162
79b94d6 a 9140 16162
79b94e0 6 9141 16162
79b94e6 c 9143 16162
79b94f2 6 9145 16162
79b94f8 8 9146 16162
PUBLIC 79b94a9 0 mdb_env_cthr_toggle

Note how we have a FUNC entry that covers a certain range and then a PUBLIC entry that falls inside that range. Our logic inserts a public symbol only if no entry already exists at that exact address. However, this check only compares addresses; it doesn't take into account the ranges of the functions we have already inserted into the symbol map. We should change this so that if an address is already covered by a FUNC, we don't emit a PUBLIC for it.
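To make the proposed check concrete, here is a minimal sketch of a range-aware lookup. It is not the actual dump_syms code: `FuncMap`, `is_covered_by_func` and `filter_publics` are hypothetical names, and the sketch assumes FUNC ranges do not overlap each other.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the symbol map: FUNC start address -> size in bytes.
/// Assumes FUNC ranges never overlap.
type FuncMap = BTreeMap<u64, u64>;

/// Returns true if `addr` lies inside the range of an already-inserted FUNC.
fn is_covered_by_func(funcs: &FuncMap, addr: u64) -> bool {
    // With non-overlapping ranges, the only candidate is the FUNC with the
    // greatest start address that is <= addr.
    funcs
        .range(..=addr)
        .next_back()
        .map_or(false, |(&start, &size)| addr - start < size)
}

/// Keeps only the PUBLIC addresses that no FUNC range covers.
fn filter_publics(funcs: &FuncMap, publics: &[u64]) -> Vec<u64> {
    publics
        .iter()
        .copied()
        .filter(|&addr| !is_covered_by_func(funcs, addr))
        .collect()
}
```

With the entry above, a map holding `0x79b94a8 -> 0x58` would report `0x79b94a9` as covered, so the PUBLIC for `mdb_env_cthr_toggle` would be dropped, while an address at or past `0x79b9500` would still be emitted.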

As a side note: this should make our .sym files smaller on Linux.

> I was looking at a libxul.so dump of a GeckoView ARM build and stumbled into a number of entries that looked like this one:

Do you have a link to this .sym file?

I've looked for one but couldn't find one immediately. I could reproduce the issue by generating one from my local build; maybe I'll push it to try and see whether I get something that looks like that.

Thanks for checking! I was just curious how a symbolic symcache generated from the same binary would treat this situation. I might take a closer look at it again once we get to that point.

See the libxul symbols in this archive from this task.

Thanks!

So it actually looks like this is the case for all functions. The FUNC is always at an even address, and the PUBLIC symbol is that address plus one.
Since this is ARM, instructions are at least two-byte aligned, which leaves the lowest address bit free. So I think what happens here is that the PUBLIC symbol really is the address plus the bit that says whether to execute in Thumb mode or not. (See also this comment.)
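If that reading is right, the two addresses can be reconciled by masking off the low bit before comparing. A minimal sketch, assuming the usual ARM ELF convention that bit 0 of a function symbol's value marks Thumb code; `split_thumb_bit` is a hypothetical helper, not something dump_syms is known to provide:

```rust
/// Splits an ARM ELF function symbol value into the address where the first
/// instruction actually lives and a flag telling whether the Thumb bit was set.
fn split_thumb_bit(st_value: u64) -> (u64, bool) {
    let is_thumb = (st_value & 1) == 1;
    // Clearing bit 0 yields the address a disassembler should start reading at.
    (st_value & !1, is_thumb)
}
```

Applied to the example, `split_thumb_bit(0x79b94a9)` gives `(0x79b94a8, true)`, which matches the FUNC start address exactly.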

It's interesting because it means that symbolication has to make a choice: Which address does it bless as the symbol address? The value that the instruction pointer register should be set to when calling this function, or the offset at which an instruction decoder / disassembler should begin reading?