eliben / pyelftools

Parsing ELF and DWARF in Python

How to get DW_OP_addr from DW_FORM_sec_offset

Samuel-Fipps opened this issue · comments

I have a script that I have been using for a long time, but now it is running into DW_FORM_sec_offset attributes. It was only written for DW_FORM_exprloc.

Before, I got the DW_OP_addr by doing:
DWDESC.describe_DWARF_expr(die.attributes['DW_AT_location'].value, die.cu.structs)

This doesn't work with DW_FORM_sec_offset, though.
I get this error:
return b''.join(bytes((b,)) for b in bytelist)
TypeError: 'int' object is not iterable
Which is expected, since the format is different: the attribute value is now a single integer offset rather than a list of expression bytes.

If I set the optimization level to zero, I don't get any DW_FORM_sec_offset attributes. Is there any way to handle these as well?

I presume describe_DWARF_expr is your script. I don't know how it works, and therefore can't debug it for you. In general, pyelftools has a general-purpose location list parser: the parse_from_attribute method of the LocationParser class.

One thing you might be missing: a variable may have several locations depending on where in the code you are. Was your code written with location lists in mind? It doesn't seem that way. DWARF accommodates this with a "location list", a collection of mappings from instruction address (IP) ranges to DWARF expressions. To know the location of a variable in this scenario, you need to know the current position in the code.
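To make the idea concrete, here is a minimal sketch of a location list lookup in plain Python. It does not use pyelftools; LocEntry and expr_at are hypothetical names, the sample list is made up, and the ranges are assumed to be absolute addresses (in real DWARF they are often relative to the compilation unit's base address).

```python
from typing import List, NamedTuple, Optional


class LocEntry(NamedTuple):
    begin: int   # first address where this entry applies (inclusive)
    end: int     # first address past the range (exclusive)
    expr: bytes  # raw DWARF expression valid within [begin, end)


def expr_at(loclist: List[LocEntry], pc: int) -> Optional[bytes]:
    """Return the expression covering pc, or None if no entry covers it."""
    for entry in loclist:
        if entry.begin <= pc < entry.end:
            return entry.expr
    return None


# Made-up list: the variable sits in a register first, then on the stack.
loclist = [
    LocEntry(0x1000, 0x1010, bytes([0x50])),        # DW_OP_reg0
    LocEntry(0x1010, 0x1040, bytes([0x91, 0x70])),  # DW_OP_fbreg -16
]
```

With this list, expr_at(loclist, 0x1008) yields the register expression, while expr_at(loclist, 0x1040) yields None: at that address the variable has no recorded location.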

Another thing of note: not all DWARF locations take the form of a single DW_OP_addr. A variable may be, say, in a register.
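As a rough illustration, the check below tells a plain DW_OP_addr expression apart from anything else. It is a sketch in plain Python, not pyelftools code: static_address is a hypothetical helper, and the address size and byte order are assumptions that should really come from the compilation unit and the ELF header.

```python
DW_OP_addr = 0x03  # opcode value from the DWARF specification


def static_address(expr, address_size=8, byteorder="little"):
    """Return the address if expr is exactly one DW_OP_addr, else None."""
    expr = bytes(expr)  # accepts a list of ints or a bytes object
    if len(expr) != 1 + address_size or expr[0] != DW_OP_addr:
        return None
    return int.from_bytes(expr[1:], byteorder)


# Made-up expressions for illustration.
global_var = [0x03, 0x40, 0x10, 0x60, 0, 0, 0, 0, 0]  # DW_OP_addr 0x601040
in_register = [0x50]                                  # DW_OP_reg0
```

Here static_address(global_var) gives 0x601040, and static_address(in_register) gives None because the variable has no fixed memory address.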

Here is a source of inspiration for you: https://github.com/eliben/pyelftools/blob/master/examples/dwarf_location_info.py . Notably, it recognizes that a DW_AT_location may contain either a DWARF expression, or a reference to a location list.
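A condensed sketch along the lines of that example follows. iter_locations is a hypothetical name. Passing the DIE as the third argument to parse_from_attribute assumes a reasonably recent pyelftools release; older releases take only the attribute and the DWARF version. The begin and end offsets are yielded as stored, so they may be relative to the CU base address.

```python
def iter_locations(path):
    """Yield (die_offset, begin, end, expr_bytes) for each location in an ELF file.

    begin and end are None when the attribute holds a single expression.
    """
    # Imported here so that pyelftools is only needed when the function is called.
    from elftools.elf.elffile import ELFFile
    from elftools.dwarf.locationlists import (
        LocationEntry, LocationExpr, LocationParser)

    with open(path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return
        dwarf = elf.get_dwarf_info()
        parser = LocationParser(dwarf.location_lists())
        for cu in dwarf.iter_CUs():
            version = cu["version"]
            for die in cu.iter_DIEs():
                for attr in die.attributes.values():
                    if not parser.attribute_has_location(attr, version):
                        continue
                    loc = parser.parse_from_attribute(attr, version, die)
                    if isinstance(loc, LocationExpr):
                        # DW_FORM_exprloc: one expression, valid everywhere
                        yield die.offset, None, None, bytes(loc.loc_expr)
                    elif isinstance(loc, list):
                        # DW_FORM_sec_offset: a location list
                        for entry in loc:
                            # Skip base address entries; keep address ranges
                            if isinstance(entry, LocationEntry):
                                yield (die.offset, entry.begin_offset,
                                       entry.end_offset, bytes(entry.loc_expr))
```

A caller could loop over iter_locations("a.out") and print each expression with expr_bytes.hex(), or hand the bytes to whatever expression decoder the script already uses.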

@Samuel-Fipps Is this still an issue?

I think I figured out that it can't be done, because the code was compiled with gcc -O3 (optimization level 3).

Optimized code is almost guaranteed to:

  • discard variables the moment they are no longer needed
  • store them in strange places and forms, with a plain old fixed memory location being a lucky exception
  • store them in different places depending on where in the code you are

DWARF is flexible enough to capture that. Why do you think DW_OP_* is an ever-growing enum of options? But parsing that kind of DWARF is tricky.

On a side note, I've seen location lists somewhat misgenerated - the variable is available at a certain code address, but DWARF claims it has no location expression there. It was off by just one CPU instruction though.

Either way, unless you decide to pursue this further, please close the issue.