eliben / pyelftools

Parsing ELF and DWARF in Python

elftools.common.exceptions.DWARFError: For this binary, "die" needs to be provided

msswamy11 opened this issue

I have cloned the latest pyelftools repo.
I am using Zephyr on a Synopsys ARC processor and trying to generate ram_report with the command below.

ninja -Coutput/custom/build/ ram_report
Error:
low, high = get_die_mapped_address(die, parser, dwarfinfo)
File "/home/trees/embedded/output/custom/scripts/footprint/size_report", line 84, in get_die_mapped_address
loc = parser.parse_from_attribute(loc_attr, die.cu['version'])
File "/home/.local/lib/python3.8/site-packages/elftools/dwarf/locationlists.py", line 309, in parse_from_attribute
return self.location_lists.get_location_list_at_offset(
File "/home/.local/lib/python3.8/site-packages/elftools/dwarf/locationlists.py", line 48, in get_location_list_at_offset
raise DWARFError("For this binary, \"die\" needs to be provided")
elftools.common.exceptions.DWARFError: For this binary, "die" needs to be provided

I am not allowed to share the binary with you; let me know what debug data you need.

Replace the line

loc = parser.parse_from_attribute(loc_attr, die.cu['version'])

with

loc = parser.parse_from_attribute(loc_attr, die.cu['version'], die)

In DWARFv5, parsing a loclist into usable values might require knowing which DIE the loclist came from.
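
A minimal sketch of the corrected call wrapped in a helper, assuming `parser` behaves like the elftools `LocationParser` used in the script above and `die` is a DIE from the same binary. The helper name `die_location` is made up for illustration and is not part of pyelftools or the Zephyr script.

```python
def die_location(die, parser):
    """Return the parsed DW_AT_location of `die`, or None if it has none.

    `parser` is assumed to expose parse_from_attribute(attr, version, die),
    as elftools' LocationParser does. Passing `die` as the third argument
    gives the parser the context it may need to resolve a loclist.
    """
    loc_attr = die.attributes.get('DW_AT_location')
    if loc_attr is None:
        return None
    return parser.parse_from_attribute(loc_attr, die.cu['version'], die)
```

The helper never imports pyelftools itself, so it can be exercised with stand-in objects before being pointed at a real binary.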

After the change, the following error appeared:

loc = parser.parse_from_attribute(loc_attr, die.cu['version'], die)
File "/home/.local/lib/python3.8/site-packages/elftools/dwarf/locationlists.py", line 309, in parse_from_attribute
return self.location_lists.get_location_list_at_offset(
File "/home/.local/lib/python3.8/site-packages/elftools/dwarf/locationlists.py", line 50, in get_location_list_at_offset
return section.get_location_list_at_offset(offset, die)
File "/home/.local/lib/python3.8/site-packages/elftools/dwarf/locationlists.py", line 98, in get_location_list_at_offset
return self._parse_location_list_from_stream_v5(die.cu) if self.version >= 5 else self._parse_location_list_from_stream()
File "/home/.local/lib/python3.8/site-packages/elftools/dwarf/locationlists.py", line 224, in _parse_location_list_from_stream
loc_expr = [struct_parse(self.structs.Dwarf_uint8(''),
File "/home/.local/lib/python3.8/site-packages/elftools/dwarf/locationlists.py", line 224, in <listcomp>
loc_expr = [struct_parse(self.structs.Dwarf_uint8(''),
File "/home/.local/lib/python3.8/site-packages/elftools/common/utils.py", line 45, in struct_parse
raise ELFParseError(str(e))
elftools.common.exceptions.ELFParseError: expected 1, found 0

This is the same error as #513. Same questions, then.

I used the command below to find the DWARF versions; I could see versions 2 and 5.

readelf --debug-dump=info zephyr.elf | grep -A 2 'Compilation Unit @'
--
Compilation Unit @ offset 0x1f23c6:
Length: 0x8da (32-bit)
Version: 2
--
Compilation Unit @ offset 0x1f2ca4:
Length: 0x2d (32-bit)
Version: 5
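
For a binary with many compilation units, the same check can be scripted. The sketch below only parses text shaped like the readelf output quoted above; the function name `cu_versions` and the regular expressions are illustrative assumptions, and the sample text is the two entries from this comment.

```python
import re
from collections import Counter

# Patterns assumed from the readelf lines quoted in this thread.
CU_RE = re.compile(r"Compilation Unit @ offset (0x[0-9a-fA-F]+):")
VER_RE = re.compile(r"^\s*Version:\s*(\d+)")

def cu_versions(readelf_text):
    """Return (cu_offset, dwarf_version) pairs found in readelf output."""
    result = []
    current = None  # offset of the CU header being read, if any
    for line in readelf_text.splitlines():
        m = CU_RE.search(line)
        if m:
            current = int(m.group(1), 16)
            continue
        m = VER_RE.match(line)
        if m and current is not None:
            result.append((current, int(m.group(1))))
            current = None
    return result

sample = """\
Compilation Unit @ offset 0x1f23c6:
Length: 0x8da (32-bit)
Version: 2
--
Compilation Unit @ offset 0x1f2ca4:
Length: 0x2d (32-bit)
Version: 5
"""
pairs = cu_versions(sample)
# Tally how many CUs use each DWARF version.
print(Counter(version for _, version in pairs))
```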

Attached is a file with the attr argument value printed in the function parse_from_attribute.
parse_from_attribute_attr.txt

Can you tell what's the DWARF version in the CU of the failing attribute? It's at offset 0x9F62. Find the compilation unit with the largest offset that is smaller than 0x9F62.
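
The lookup described here (the CU with the largest offset that does not exceed the attribute offset) can be sketched with a binary search. The function name `containing_cu` and the CU offsets in the usage lines are hypothetical; only 0x9F62 comes from this thread.

```python
from bisect import bisect_right

def containing_cu(cu_offsets, target):
    """Return the largest CU offset that is <= target, or None if there is none.

    An attribute never sits exactly at a CU offset (the CU header is there),
    so "<=" and "smaller than" pick the same CU in practice.
    """
    offsets = sorted(cu_offsets)
    i = bisect_right(offsets, target)
    return offsets[i - 1] if i else None

# Hypothetical CU offsets, for illustration only.
example_offsets = [0x0, 0x5a10, 0x9e00, 0xa400]
found = containing_cu(example_offsets, 0x9F62)
print(hex(found))
```

With real data, the list of offsets would come from the "Compilation Unit @ offset" lines of the readelf dump.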

Also, can you tell what kind of DIE the failing attribute is in?

You could use DWARF Explorer ( https://github.com/sevaa/dwex ) to check the DWARF tree visually. Open the binary, use menu/Navigate/"DIE by offset" for 0x9F62, and take it from there. The DWARF version of the selected DIE's CU can be seen via menu/View/"CU properties".

The reason I'm interested: in DWARFv2, a loclist reference of form DW_FORM_data4 would be expected, but in v5 there are special forms, DW_FORM_sec_offset and DW_FORM_loclistx, specifically for that purpose. I have never heard that there's a provision for DW_AT_location to be anything other than either an expression or a loclist reference, but what do you know.
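
The version-dependent reading of the attribute form can be sketched as a small classifier. This is a simplified illustration based on the forms named above plus the expression forms from the DWARF standard (block forms before v4, DW_FORM_exprloc from v4 on); the function name `location_kind` is made up and this is not the check pyelftools itself performs.

```python
def location_kind(dwarf_version, form):
    """Classify a DW_AT_location value as 'expression', 'loclist' or 'unknown'."""
    # Forms introduced in DWARF 4 and 5 are unambiguous.
    if form == 'DW_FORM_exprloc':
        return 'expression'
    if form in ('DW_FORM_sec_offset', 'DW_FORM_loclistx'):
        return 'loclist'
    # Before DWARF 4, blocks hold expressions and data4/data8 hold
    # offsets into the location list section.
    if dwarf_version < 4:
        if form.startswith('DW_FORM_block'):
            return 'expression'
        if form in ('DW_FORM_data4', 'DW_FORM_data8'):
            return 'loclist'
    return 'unknown'

print(location_kind(2, 'DW_FORM_data4'))
print(location_kind(5, 'DW_FORM_data4'))
```

The second call is the case that would be surprising: a DW_FORM_data4 location in a v5 compilation unit matches neither reading.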

Sorry, I don't have any knowledge of ELF file formats; I opened this issue out of interest, to learn.

Failing die:
DIE DW_TAG_variable, size=21, has_children=False
|DW_AT_name : AttributeValue(name='DW_AT_name', form='DW_FORM_strp', value=b'mbx', raw_value=8759205, offset=40790, indirection_length=0)
|DW_AT_decl_file : AttributeValue(name='DW_AT_decl_file', form='DW_FORM_data1', value=1, raw_value=1, offset=40794, indirection_length=0)
|DW_AT_decl_line : AttributeValue(name='DW_AT_decl_line', form='DW_FORM_data2', value=868, raw_value=868, offset=40795, indirection_length=0)
|DW_AT_decl_column : AttributeValue(name='DW_AT_decl_column', form='DW_FORM_data1', value=17, raw_value=17, offset=40797, indirection_length=0)
|DW_AT_type : AttributeValue(name='DW_AT_type', form='DW_FORM_ref4', value=17581, raw_value=17581, offset=40798, indirection_length=0)
|DW_AT_location : AttributeValue(name='DW_AT_location', form='DW_FORM_data4', value=5508, raw_value=5508, offset=40802, indirection_length=0)
|DW_AT_GNU_locviews: AttributeValue(name='DW_AT_GNU_locviews', form='DW_FORM_data4', value=5504, raw_value=5504, offset=40806, indirection_length=0)

Below is the failing attribute in the DIE:
AttributeValue(name='DW_AT_location', form='DW_FORM_data4', value=5508, raw_value=5508, offset=40802, indirection_length=0)

The DWARF version in the CU of the failing attribute is 2.

[image: 0x9f62_offset]

Now we are on to something. Hold on please...

UPDATE: get the latest DWARF Explorer please, see if that makes a difference.

I upgraded to dwex 3.22, but I still get the same parse error for DW_AT_location at DIE offset 0x9f62.

I've isolated what I think should be the crash behavior to a small snippet, and published it as a gist: https://gist.github.com/sevaa/af048829b87d093b2606daed6f7185c3

Can you please download it, save it as ll.py and run with two arguments - the path to your binary, and the offset of the offending DIE, in hex (0x9f62)? Does it crash and print a bunch of text? If so, then can you please:

  • share the output with me (it's the local variables from the crash point)
  • check if the locationlists.py that the crash output points at (the file path/name will be in the first line of crash output) matches the one in the master copy of pyelftools?

Because, from what I know about your binary so far, the loclist in question should parse fine. It's not malformed in any way that I can see. So I can't exclude the possibility that the copy of pyelftools that Zephyr/DWEX are using is old and/or somehow patched.

Sorry it's taking so long, but remote debugging by proxy is never pretty or fast.

No problem, and please bear with my late replies. I understand the pain of debugging by proxy.
The script crashed; the crash screenshot is below.
[image: ll_py_error]

Sorry, the DIE offset (as opposed to the attribute offset) is 0x9f55. Try with that please.
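
The two offsets can be checked against the dump of the failing DIE posted earlier. The sketch below uses only numbers from that dump (size=21, attribute offsets 40790, 40802 and 40806); the helper name `attr_belongs_to_die` is made up for illustration.

```python
def attr_belongs_to_die(die_offset, die_size, attr_offset):
    """True if attr_offset falls inside the DIE's bytes.

    The comparison with die_offset is strict because a DIE starts with
    its abbreviation code, so no attribute can sit at the DIE offset itself.
    """
    return die_offset < attr_offset < die_offset + die_size

die_offset = 0x9f55          # DIE offset from this comment
die_size = 21                # "size=21" in the DIE dump
name_offset = 40790          # DW_AT_name, the first attribute
location_offset = 40802      # DW_AT_location, the failing attribute
locviews_offset = 40806      # DW_AT_GNU_locviews, the last attribute (4 bytes)

print(hex(location_offset))                    # the attribute offset, in hex
print(name_offset - die_offset)                # bytes taken by the abbreviation code
print(locviews_offset + 4 == die_offset + die_size)
print(attr_belongs_to_die(die_offset, die_size, location_offset))
```

In other words, 0x9f62 is where the DW_AT_location value sits, while the DIE that owns it starts 13 bytes earlier, at 0x9f55.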

Great, this time it printed a dump:
ll_9f55.txt

Check whether the locationlists.py that the crash points at matches the master copy: yes, it matches, at line number 224.

Updated the gist. Please get latest and run it. I think I've found the issue.

murala@murala-VirtualBox:~/test$ python3 ll.py zephyr.elf 0x9f55
Parsed

Okay, good. I'll submit a PR for pyelftools, but in the meantime, you may use the following monkeypatch near the beginning of your script:

from elftools.dwarf.dwarfinfo import DWARFInfo
from elftools.dwarf.locationlists import LocationLists, LocationListsPair

def location_lists(self):
    """ Get a LocationLists object representing the .debug_loc/debug_loclists section of
        the DWARF data, or None if this section doesn't exist.
        If both sections exist, it returns a LocationListsPair.
    """
    if self.debug_loclists_sec and self.debug_loc_sec is None:
        return LocationLists(self.debug_loclists_sec.stream, self.structs, 5, self)
    elif self.debug_loc_sec and self.debug_loclists_sec is None:
        return LocationLists(self.debug_loc_sec.stream, self.structs, 4, self)
    elif self.debug_loc_sec and self.debug_loclists_sec:
        return LocationListsPair(self.debug_loc_sec.stream, self.debug_loclists_sec.stream, self.structs, self)
    else:
        return None
    
DWARFInfo.location_lists = location_lists
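
When both sections exist, the patched function returns a pair object, and the earlier "die needs to be provided" error hints at why such a pair needs context: it has to decide which section an offset refers to. The toy class below is one way to picture that dispatch. It is a hypothetical stand-in with made-up names (`ToyLoclistPair`, `get`), using plain dicts in place of section streams, and is not the pyelftools implementation.

```python
class ToyLoclistPair:
    """Hypothetical stand-in for a pair of loclist sections (not pyelftools code)."""

    def __init__(self, debug_loc, debug_loclists):
        self._loc = debug_loc            # lists for pre-v5 CUs, keyed by offset
        self._loclists = debug_loclists  # lists for v5 CUs, keyed by offset

    def get(self, offset, cu_version=None):
        # The same numeric offset can be valid in both sections, so the
        # caller has to say which DWARF version the referring CU uses.
        if cu_version is None:
            raise ValueError('the CU version is needed to pick a section')
        section = self._loclists if cu_version >= 5 else self._loc
        return section[offset]

# 5508 is the DW_AT_location value from the failing DIE; the list
# contents are placeholders.
pair = ToyLoclistPair({5508: 'list from .debug_loc'},
                      {5508: 'list from .debug_loclists'})
print(pair.get(5508, cu_version=2))
```

For the failing DIE, which lives in a version 2 CU, the lookup has to land in .debug_loc even though the binary also carries .debug_loclists for its version 5 CUs.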

Alternatively, you may download https://github.com/sevaa/dwex/blob/master/dwex/patch.py and call monkeypatch() from that file. It addresses a bunch of recently found issues, not just this one.

EDIT: updated dwex, submitted a PR.
EDIT: the PR landed.

Sorry for the late update; I was busy with other work.
I am able to generate ram_report with the latest pyelftools update.
Thanks for your support.

If the patch worked, close the issue.