Error while parsing Microchip ELF

Question

olimexsmart opened this issue 8 months ago · comments

Hi, I'm trying to use the library with the ELF file generated by the Microchip xc16 compiler.

For example the examine_dwarf_info.py preview crashes with

  File "C:\Python311\Lib\site-packages\elftools\dwarf\structs.py", line 106, in __new__
    assert address_size == 8 or address_size == 4, str(address_size)

In general, the out-of-the-box readelf also has some trouble with it; the only tool I've seen working is the one bundled alongside the compiler: xc16-readelf

Any suggestion in general? Thanks
test.elf.pdf

Edit: GitHub won't let me upload ELF file, so I appended "pdf" to the name

Seva Alekseyev (he/him) · Answer 1 · Fri Nov 10 2023 03:21:54 GMT+0800 (China Standard Time)

Somewhat known issue, see #473 near the end. XC16 emits a highly unusual flavor of DWARF, definitely not compliant with the standard (on the lowest possible level, like integer encoding). In general, it's not this project's ambition to venture beyond the scope of GNU/LLVM tools out there - for one thing, correctness checks become a pain.

Do you know if there is a doc for their flavor of DWARF out there?

Luca Olivieri · Answer 2 · Fri Nov 10 2023 03:37:14 GMT+0800 (China Standard Time)

Thanks for the quick response, sorry for the double posting but a couple hours of googling didn't find the mentioned issue

Sadly, I think we are about to run into some typical corporate nonsense: all their documentation (starting from the datasheet) claims compliance with the DWARF2 standard (here and here, for example).

I will try to pull some strings with an internal contact at Microchip and let you know.

David Anderson · Answer 3 · Fri Nov 10 2023 04:59:43 GMT+0800 (China Standard Time)

Well, .debug_aranges, .debug_frame, .debug_info,
and .debug_pubnames have non-standard content and are unreadable with libdwarf.
libdwarf was first written for DWARF2 and still correctly reads any standard version of DWARF.

.debug_str is formally standard (it is hard to make a .debug_str that is not standard-conformant)
but the content is a series of one-letter or empty strings. Very odd.
Edit: Looks like the content might be some 2-byte-per-character encoding?

libdwarf is not going to attempt to read such stuff either.

David Anderson, maintainer of libdwarf.

Seva Alekseyev (he/him) · Answer 4 · Fri Nov 10 2023 05:00:56 GMT+0800 (China Standard Time)

@davea42 : just read the linked issue. It's DWARF, Jim, but not as we know it.

David Anderson · Answer 5 · Fri Nov 10 2023 05:23:53 GMT+0800 (China Standard Time)

Yes. Should have read the link. Sorry. But calling it DWARF2 is just wrong.
Calling it a unique debug format based on DWARF2 might be accurate.

Beam me up, Scotty.

Seva Alekseyev (he/him) · Answer 6 · Tue Nov 14 2023 01:30:31 GMT+0800 (China Standard Time)

This structure starts making a bit of sense if you assume that the logical byte is 16 bits. That said, with normal DWARF datatypes, that would make for a file where every other (physical) byte is zero. In this issue's binary, there are no nonzero odd bytes in .debug_info, but in the larger binary in issue #473, there are some, always in the same context: the value of DW_AT_language in DW_TAG_compile_unit for CUs produced from assembly sources.
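A quick way to check the "every other byte is zero" observation on raw section data is sketched below. The function names and the sample bytes are made up for illustration; with pyelftools, the input would be something like the bytes returned by a section's data().

```python
def nonzero_odd_bytes(data: bytes) -> list[int]:
    """Return the offsets of odd-indexed bytes that are not zero."""
    return [i for i in range(1, len(data), 2) if data[i] != 0]


def strip_phantom_bytes(data: bytes) -> bytes:
    """Keep only the even-indexed bytes, dropping every odd one."""
    return data[::2]


# Hypothetical sample: four real bytes, each followed by a zero byte
padded = bytes([0x11, 0x00, 0x01, 0x00, 0x02, 0x00, 0x04, 0x00])
print(nonzero_odd_bytes(padded))          # []
print(strip_phantom_bytes(padded).hex())  # 11010204
```

A non-empty result from nonzero_odd_bytes would point at the kind of exception described above, where an odd byte actually carries data.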

Also, the ELF structures are not like that. ELF parsing in pyelftools doesn't break down on those binaries.

@olimexsmart: simply throwing away odd bytes in the DWARF sections makes for a passable parser :) In this issue's binary, there are no odd nonzero bytes. So with the following monkeypatch, pyelftools works:

from io import BytesIO
from elftools.common.exceptions import DWARFError
from elftools.elf.elffile import ELFFile
from elftools.elf.relocation import RelocationHandler
from elftools.dwarf.dwarfinfo import DWARFInfo, DebugSectionDescriptor

def _read_dwarf_section(self, section, relocate_dwarf_sections):
    # Patch for the XC16 compiler; see pyelftools' #518
    # Vendor flag EF_PIC30_NO_PHANTOM_BYTE: clear means drop every odd byte
    has_phantom_bytes = self['e_machine'] == 'EM_DSPIC30F' and (self['e_flags'] & 0x80000000) == 0

    # The section data is read into a new stream, for processing
    section_stream = BytesIO()
    section_data = section.data()
    section_stream.write(section_data[::2] if has_phantom_bytes else section_data)

    if relocate_dwarf_sections:
        reloc_handler = RelocationHandler(self)
        reloc_section = reloc_handler.find_relocations_for_section(section)
        if reloc_section is not None:
            if has_phantom_bytes:
                # No guidance on whether relocations should be applied before or after the odd byte skip
                raise DWARFError("This binary has relocations in the DWARF sections, currently not supported.")
            else:
                reloc_handler.apply_section_relocations(
                    section_stream, reloc_section)

    return DebugSectionDescriptor(
            stream=section_stream,
            name=section.name,
            global_offset=section['sh_offset'],
            size=section.data_size//2 if has_phantom_bytes else section.data_size,
            address=section['sh_addr'])

# Main

ELFFile._read_dwarf_section = _read_dwarf_section

filename = "test.elf"  # placeholder: path to the XC16-built ELF file

with open(filename, "rb") as f:
    e = ELFFile(f)
    di = e.get_dwarf_info()
    # Do what you want to that DWARFInfo

The monkeypatched version is still compatible with vanilla DWARF. There are no DWARF relocations in the binaries that we have, so I've shorted out the relocation logic. Can't sensibly debug.

@eliben: this could be a really small change to the library. :) No way to autotest though, short of introducing a whole other category of tests - against xc16-readelf - just for this case.

Luca Olivieri · Answer 7 · Wed Nov 15 2023 00:05:56 GMT+0800 (China Standard Time)

@sevaa I think you got it right. A PDF is attached; the TL;DR (from the last page) is the following:

To decode the data from our DWARF sections when compiled for PIC24, dsPIC30F, or dsPIC33C/E/F devices; skip
every other byte

and

If an external tool is used to interpret DWARF debugging information created by the XC16 and/or XC-DSC
compilers, the EF_PIC30_NO_PHANTOM_BYTE flag can be used to determine if phantom bytes in DWARF
sections must be recognized and discarded.
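For a tool that does not use pyelftools, the flag check might be sketched with the standard library alone, as below. The flag value 0x80000000 is the one used in the monkeypatch earlier in this thread, and 118 is the e_machine number assigned to the dsPIC30F in the ELF machine registry; the function name is hypothetical.

```python
import struct

EM_DSPIC30F = 118                      # e_machine for Microchip dsPIC30F
EF_PIC30_NO_PHANTOM_BYTE = 0x80000000  # value taken from the monkeypatch above


def has_phantom_bytes(header: bytes) -> bool:
    """Decide from the start of an ELF file whether DWARF sections carry phantom bytes."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                 # EI_CLASS: 1 = ELF32, 2 = ELF64
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little-endian, 2 = big-endian
    (e_machine,) = struct.unpack_from(endian + "H", header, 18)
    flags_offset = 48 if is_64bit else 36     # e_flags follows e_entry, e_phoff, e_shoff
    (e_flags,) = struct.unpack_from(endian + "I", header, flags_offset)
    # Flag clear on a dsPIC30F-family binary means the odd bytes must be discarded
    return e_machine == EM_DSPIC30F and (e_flags & EF_PIC30_NO_PHANTOM_BYTE) == 0
```

Passing the first 52 bytes of the file is enough in both the ELF32 and the ELF64 case, since e_flags ends at offset 40 and 52 respectively.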

MPLAB XC16 and XC-DSC DWARF Differences.pdf

Seva Alekseyev (he/him) · Answer 8 · Wed Nov 15 2023 00:35:49 GMT+0800 (China Standard Time)

@olimexsmart good find. The linked guide quietly assumes that no other compiler will ever generate code for PIC devices, or if it does, it will respect the EF_PIC30_NO_PHANTOM_BYTE bit in the E_FLAGS. Oh well, that's all we've got anyway.

The guide doesn't mention relocations. One wonders whether relocations in the DWARF section should be done before or after throwing away odd bytes. And we don't have any relocations against DWARF in the corpus. Any way to get that from the vendor, please?

If @eliben allows an (almost) automatically untestable piece of functionality, I can PR the even-bytes-only logic into the library.

Luca Olivieri · Answer 9 · Wed Nov 15 2023 23:00:45 GMT+0800 (China Standard Time)

The linked guide quietly assumes that no other compiler will ever generate code for PIC devices, or if it does, it will respect the EF_PIC30_NO_PHANTOM_BYTE bit in the E_FLAGS.

The thing with embedded programming is that everyone kinda builds their stuff on their own lonely island and doesn't expect much interaction with the outside world, so this kind of thing happens all the time, sadly.

I'm really not familiar with the concept of relocations. I will ask my contact anyway, but I would not expect anything more, since this document was already sent to me in a "that's all we got" kinda mood.

Thanks for the support. I briefly tested your patch and it seems to work OK.

Seva Alekseyev (he/him) · Answer 10 · Wed Nov 15 2023 23:39:13 GMT+0800 (China Standard Time)

https://en.wikipedia.org/wiki/Relocation_(computing)

Some CPU vendors (notably MIPS) go to great lengths to prevent the need for relocations in their binaries. I don't know what PIC does to that effect.

In theory, an ELF file may contain relocation instructions that point into the DWARF sections. If you are trying to debug a running process where your binary is not loaded at its preferred address, all the absolute addresses in the DWARF data (code locations, data locations) need to be relocated to accommodate the different starting address.

The primary use case of pyelftools, however, seems to be parsing DWARF from a binary at rest. In this scenario, the process of relocation can be safely omitted, since the binary is not technically loaded (that is, not loaded by the OS, for execution), so the notion of its starting address is not applicable.

Whether to apply the relocation entries is controlled by the optional relocate_dwarf_sections argument of ELFFile.get_dwarf_info(), which is True by default.
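To make the idea of relocating absolute addresses concrete, here is a deliberately simplified sketch. Real ELF relocations are computed per relocation type from a symbol value and an addend; this toy version only adds a load bias to 32-bit little-endian addresses at known offsets, and every name in it is hypothetical.

```python
import struct


def rebase_addresses(section: bytes, offsets: list[int], load_bias: int) -> bytes:
    """Add load_bias to each 32-bit little-endian address stored at the given offsets."""
    patched = bytearray(section)
    for off in offsets:
        (addr,) = struct.unpack_from("<I", patched, off)
        # Wrap around at 32 bits, as a fixed-width address field would
        struct.pack_into("<I", patched, off, (addr + load_bias) & 0xFFFFFFFF)
    return bytes(patched)


# Hypothetical section: one marker byte, an address 0x1000, another marker byte
section = struct.pack("<BIB", 0xAA, 0x1000, 0xBB)
rebased = rebase_addresses(section, offsets=[1], load_bias=0x400000)
print(hex(struct.unpack_from("<I", rebased, 1)[0]))  # 0x401000
```

For a binary parsed at rest, the bias is effectively zero and the section bytes can be read unchanged.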

If the patch works, close the issue.

Luca Olivieri · Answer 11 · Thu Nov 16 2023 16:59:31 GMT+0800 (China Standard Time)

I think that here we fall into the following case described by the Wikipedia article:

Prior to the advent of multiprocess systems, and still in many embedded systems, the addresses for objects were absolute starting at a known location, often zero

Seva Alekseyev (he/him) · Answer 12 · Fri Nov 17 2023 03:38:53 GMT+0800 (China Standard Time)

the addresses for objects were absolute

Could be.

Anyway, were we to roll this into pyelftools proper, can you please compile a "Hello world" type program with XC16 with -g for us to serve as a unit test, and share the binary?

OBTW, does XC16 support -gz=zlib? If it does, please rebuild with that and share that, too.

Luca Olivieri · Answer 13 · Fri Nov 17 2023 17:16:44 GMT+0800 (China Standard Time)

Could this work?
ProvaDemoBoard.X.zip

Seva Alekseyev (he/him) · Answer 14 · Fri Nov 17 2023 22:22:42 GMT+0800 (China Standard Time)

I expected a smaller test binary, but I guess it's an OS-less environment and every binary would kind of carry its own OS. This will work, thanks. Didn't need the whole project tree though, just the ELF file.

What about DWARF section compression, which is done with -gz=zlib in the compiler options? I wonder if decompression should be applied before or after stripping the odd bytes.

There are, potentially, up to three transforms that might happen to the DWARF section contents from an ELF file before the DWARF parser proper starts reading them:

1. Decompression driven by the SHF_COMPRESSED bit in the sh_flags section header field (this is more of an ELF feature)
2. Decompression driven by the leading .z in the section name (specific to DWARF)
3. Relocations

In that order.

The XC16 binaries that we've got so far feature none of those three. But since we are introducing a fourth transform - stripping of the odd numbered bytes - it would be nice to know where exactly it fits in the chain of transforms.
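Why the position of the stripping step matters can be seen with a small experiment. This is only an illustration under stated assumptions: it uses a bare zlib stream (a real SHF_COMPRESSED section also carries a compression header), the payload is made up, and it says nothing about which order XC16 would actually use.

```python
import zlib

original = b"DWARF-like payload"
# Emulate phantom bytes: insert a zero byte after every real byte
padded = bytes(b for byte in original for b in (byte, 0))
compressed = zlib.compress(padded)

# Order A: decompress first, then strip the odd bytes
order_a = zlib.decompress(compressed)[::2]

# Order B: strip the odd bytes first, then try to decompress
try:
    order_b = zlib.decompress(compressed[::2])
except zlib.error:
    # Dropping every other byte corrupts the compressed stream
    order_b = None

print(order_a == original)
print(order_b == original)
```

Only one of the two orders can recover the data for a given file, so a parser has to know which one the producer intended.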

That said, the ELF loader on PIC devices might not be as feature-rich as the one in Linux proper. Chances are, it doesn't even support section decompression during loading, on account of being rather memory- and compute-limited. But if that's the case, I'd like some kind of confirmation, e.g. an error message from the compiler that compression is not supported.

Luca Olivieri · Answer 15 · Tue Nov 21 2023 02:12:37 GMT+0800 (China Standard Time)

I think it is the latter case:

olli@olligram:~$ /opt/microchip/xc16/v2.10/bin/xc16-gcc -gz=zlib  testXC16.c
elf-cc1: error: unrecognised debug output level "z=zlib"

Let me know if you need any other insights, no problem.

Seva Alekseyev (he/him) · Answer 16 · Tue Nov 21 2023 02:25:28 GMT+0800 (China Standard Time)

This will do, thank you. If a binary with compressed sections ever surfaces, we'll revisit this. For the time being, I'll leave a comment.

EDIT: patch #522.

Seva Alekseyev (he/him) · Answer 17 · Wed Dec 13 2023 23:30:29 GMT+0800 (China Standard Time)

The PR has landed. Is this enough to close the issue?

Luca Olivieri · Answer 18 · Wed Dec 13 2023 23:43:20 GMT+0800 (China Standard Time)

Yes, it is indeed. Thanks for your support.