Importing executable files with auto-detected format exports corrupted binaries


rszibele opened this issue 5 years ago · comments

Describe the bug
If you import an ELF binary with the format set to Executable and Linking Format (ELF) and then export that binary, the result is a corrupted binary that segfaults.

However, if you import it as "Raw binary" and manually select the language, then the exported file works as expected.

To Reproduce
Steps to reproduce the behavior:

1. Import the cp ELF binary into your project (default settings).
2. Right-click it and click Export...
3. Select Binary as the format.
4. Export it.
5. Make the exported binary executable.
6. Run the exported binary.

Expected behavior
The exported binary should work instead of segfaulting (happens with multiple binaries that I've tested).

Screenshots
Default:

Import as Raw binary:

Environment (please complete the following information):

OS: Kubuntu 18.10
Ghidra Version 9.0

Additional context
Happens with both i386 and x86_64 binaries.

woachk · Answer 1 · Wed Mar 06 2019 23:45:05 GMT+0800 (China Standard Time)

Issue affects ARM binaries too.

Richard Szibele · Answer 2 · Thu Mar 07 2019 00:54:14 GMT+0800 (China Standard Time)

Just tested it on Windows and it's the same story with PE executables. They also need to be imported raw, else you can't run them.

Rafael Ristovski · Answer 3 · Thu Mar 07 2019 01:09:34 GMT+0800 (China Standard Time)

Can confirm. The exported ls becomes ~3kb larger than the original.
Some binaries also end up with missing headers.
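
A quick way to check reports like the size difference above is to compare the original file with the exported one byte for byte. The following is a minimal sketch, not part of Ghidra; the function name `compare_files` and the file paths in the usage note are made up for illustration.

```python
import hashlib
from pathlib import Path


def compare_files(original: str, exported: str) -> dict:
    """Summarize how an exported file differs from the original."""
    a = Path(original).read_bytes()
    b = Path(exported).read_bytes()

    # Index of the first byte that differs within the common prefix, if any.
    first_diff = next(
        (i for i, (x, y) in enumerate(zip(a, b)) if x != y), None
    )
    # If one file is a strict prefix of the other, they diverge where the
    # shorter one ends.
    if first_diff is None and len(a) != len(b):
        first_diff = min(len(a), len(b))

    return {
        "size_delta": len(b) - len(a),  # positive: the export is larger
        "identical": a == b,
        "first_diff": first_diff,
        "sha256_original": hashlib.sha256(a).hexdigest(),
        "sha256_exported": hashlib.sha256(b).hexdigest(),
    }
```

For example, `compare_files("/bin/ls", "ls_exported")` (hypothetical paths) would report a positive `size_delta` if the export grew, and a `first_diff` of 0 if even the header bytes no longer match.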

Rafael Ristovski · Answer 4 · Thu Mar 07 2019 03:36:01 GMT+0800 (China Standard Time)

Note: Loading as Raw Binary does not seem to produce the same analysis/disassembly output at all.

The disassembly after analysis is missing various things ranging from functions not being disassembled to missing Xrefs.

It also appears to analyze the binary much faster than when imported as ELF, indicating it is either unable to analyze it as thoroughly when loaded as a Raw Binary or just skips a bunch of analysis options which are enabled/supported only under ELF.

Thus, loading as Raw Binary should not be considered a workaround for this as the output differs.

Deleted user · Answer 5 · Thu Mar 07 2019 21:29:28 GMT+0800 (China Standard Time)

The binary export is not intended to create a valid executable - there is no export for doing this. It simply dumps the memory blocks that exist within Ghidra void of any address placement information.

Rafael Ristovski · Answer 6 · Thu Mar 07 2019 23:22:28 GMT+0800 (China Standard Time)

@gnooby22 Would be nice if there was a way to imitate the behavior when loading a binary as Raw Binary (which when exported creates a valid verbatim copy of the loaded program) but retaining all the analysis options when loading it as ELF.

One would expect that Export Program as Binary would create a valid executable when loaded as ELF as well.

This means that executables as of now can't be nicely patched like in IDA.

Bruno Cabral · Answer 7 · Fri Mar 08 2019 04:47:58 GMT+0800 (China Standard Time)

Yeah,
Working exported binaries is a very important feature for many workflows.

Angelo Delli Santi · Answer 8 · Sat Mar 09 2019 01:06:35 GMT+0800 (China Standard Time)

Once I have imported as Raw Binary, what can I do to produce the same binary analysis as with an ELF import? At least to automatically show assembly code.

Rafael Ristovski · Answer 9 · Sat Mar 09 2019 01:42:06 GMT+0800 (China Standard Time)

@Corallo It sadly does not seem to be possible as of now

Woodstock · Answer 10 · Wed Mar 20 2019 19:00:32 GMT+0800 (China Standard Time)

Also have this issue.

When I make patches to apps, I can't export a binary without seg fault, nor does it make the change to the underlying binary referenced by the project.

looter · Answer 11 · Mon Apr 01 2019 01:17:31 GMT+0800 (China Standard Time)

Same issue here with every binary I have tested.

Woodstock · Answer 12 · Mon Apr 01 2019 04:57:56 GMT+0800 (China Standard Time)

Does 9.0.1 fix this?

Naja Melan · Answer 13 · Mon Apr 01 2019 05:03:39 GMT+0800 (China Standard Time)

@johnalanwoods Don't worry, it won't be long. Imagine all these poor agents at the NSA who can't work because their hacking toy is broken. They won't let this linger long. LOL

Rafael Ristovski · Answer 14 · Mon Apr 01 2019 07:46:06 GMT+0800 (China Standard Time)

@najamelan it does not appear to be a bug

One of the devs (whose account is now deleted) mentioned this:

The binary export is not intended to create a valid executable - there is no export for doing this. It simply dumps the memory blocks that exist within Ghidra void of any address placement information.

I doubt this will be "fixed"

Woodstock · Answer 15 · Mon Apr 01 2019 14:52:54 GMT+0800 (China Standard Time)

@Ristovski interesting. To me this would seem like a basic feature. The ability to edit the binary and run that binary independently of Ghidra. I’m shocked this is seen as normal behaviour. Even IDA does this.

Rafael Ristovski · Answer 16 · Mon Apr 01 2019 18:01:19 GMT+0800 (China Standard Time)

@johnalanwoods I agree. I don't see why the binaries loaded as ELF should not export the same way they do when loaded as Raw Binary. I haven't looked into the bundled source code yet, but maybe this could be trivial to fix.

Ryan Kurtz · Answer 17 · Mon Apr 01 2019 20:48:49 GMT+0800 (China Standard Time)

A few things to add:

This was not addressed in 9.0.1. The effort is most likely too large to appear in a patch release.
I'm changing this from a bug to an enhancement because while the feature isn't working as requested, it's working as documented (see help page for exporting files).
#154 is related, because the file offset to memory address mapping is lost during import. Keeping track of that mapping would be required to properly go back to file-form.
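
To illustrate the file offset to memory address mapping mentioned above: for an ELF file, the program headers record where each loadable segment sits in the file and at which virtual address it is mapped. The sketch below reads those headers with Python's `struct` module and converts an address back to a file offset. It is not Ghidra code, it only handles little-endian ELF64, and the names `Segment`, `load_segments` and `vaddr_to_offset` are made up for this example. Addresses shown in Ghidra may also be shifted by an image base, in which case they would need adjusting first.

```python
import struct
from typing import List, NamedTuple, Optional

PT_LOAD = 1  # program header type for loadable segments


class Segment(NamedTuple):
    vaddr: int   # virtual address where the segment is mapped
    offset: int  # offset of the segment's bytes in the file
    filesz: int  # number of bytes the segment occupies in the file


def load_segments(data: bytes) -> List[Segment]:
    """Collect the PT_LOAD segments of a little-endian ELF64 file."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")

    # ELF64 file header (64 bytes); we need e_phoff, e_phentsize, e_phnum.
    fields = struct.unpack_from("<16sHHIQQQIHHHHHH", data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]

    segments = []
    for i in range(e_phnum):
        # ELF64 program header: type, flags, offset, vaddr, paddr,
        # filesz, memsz, align.
        p_type, _, p_offset, p_vaddr, _, p_filesz, _, _ = struct.unpack_from(
            "<IIQQQQQQ", data, e_phoff + i * e_phentsize
        )
        if p_type == PT_LOAD:
            segments.append(Segment(p_vaddr, p_offset, p_filesz))
    return segments


def vaddr_to_offset(segments: List[Segment], vaddr: int) -> Optional[int]:
    """Map a virtual address to a file offset, or None if not file-backed."""
    for seg in segments:
        if seg.vaddr <= vaddr < seg.vaddr + seg.filesz:
            return seg.offset + (vaddr - seg.vaddr)
    return None
```

An address that falls in memory with no file backing (for example zero-filled data past `filesz`) yields `None`, which is one reason a memory image cannot be turned back into a file without this table.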

Woodstock · Answer 18 · Mon Apr 01 2019 21:27:56 GMT+0800 (China Standard Time)

Thank you for the clarification @ryanmkurtz.

It seems unusual to me that, as sophisticated as Ghidra is, it doesn't include this feature.

How much utility can there be without being able to generate an edited executable?

Anyway, not to worry.

Regards,
John

valentinbreiz · Answer 19 · Fri Apr 05 2019 08:56:46 GMT+0800 (China Standard Time)

Same problem, works with raw binary, doesn't with auto detection.

Deleted user · Answer 20 · Tue Apr 09 2019 05:41:15 GMT+0800 (China Standard Time)

So what are people supposed to do without the ability to export a working binary for windows? Re-write the program or? Isn't this the point of reversing an EXE or am I missing something that people are doing better than patching a working binary?

Woodstock · Answer 21 · Fri Apr 12 2019 20:42:14 GMT+0800 (China Standard Time)

I agree with @bernky. There are things which can be done without export, such as exfiltration of data, and there is Python scripting etc., but the ability to export is still very important.

mohab · Answer 22 · Fri Apr 19 2019 06:56:19 GMT+0800 (China Standard Time)

alright then, i can't finish my root-me assignments using ghidra, what i should do without export, use other tools, what is the point, this is high priority bug.

Deleted user · Answer 23 · Sun May 05 2019 12:53:44 GMT+0800 (China Standard Time)

Great find, almost lost my mind trying to figure it out.

Deleted user · Answer 24 · Mon May 06 2019 11:18:32 GMT+0800 (China Standard Time)

alright then, i can't finish my root-me assignments using ghidra, what i should do without export, use other tools, what is the point, this is high priority bug.

they are considering this as an "Enhancement feature"

Woodstock · Answer 25 · Mon May 06 2019 15:15:20 GMT+0800 (China Standard Time)

I just went back to IDA, at least I can export from it easily and I don’t have to run it inside 4 nested VMs to avoid myself being backdoored ;)

Woodstock · Answer 26 · Wed May 15 2019 21:30:48 GMT+0800 (China Standard Time)

benjaminkoffel wrote: "As others have said export works fine when you import as Raw Binary."

Raw badly affects the quality of the disassembly. It makes it really hard to work in.

Woodstock · Answer 27 · Fri May 24 2019 01:14:06 GMT+0800 (China Standard Time)

Does 9.0.4 fix this?

Ryan Kurtz · Answer 28 · Fri May 24 2019 01:21:41 GMT+0800 (China Standard Time)

No, 9.0.4 is mostly a bug fix release.

Anıl Karagenç · Answer 29 · Wed Aug 14 2019 08:41:26 GMT+0800 (China Standard Time)

The problem still persists, are you gonna solve it?

Ryan Kurtz · Answer 30 · Wed Aug 14 2019 20:00:06 GMT+0800 (China Standard Time)

Ghidra 9.1 will add the ability to retain and access the original imported program bytes. This was a key requirement for this type of exporter to be written properly. However, Ghidra 9.1 will not introduce any new exporters. Writing an exporter to take a loaded memory image back to a runnable binary is a pretty sizeable task (to do it correctly and completely), and it is specific to each loader (there is no generic solution). However, now that the infrastructure is in place, you might start seeing the community take a stab at it for the more popular file formats (PE, ELF).

Brandon Ros · Answer 31 · Mon Sep 30 2019 12:00:21 GMT+0800 (China Standard Time)

Will it be possible to put a header on top of a raw binary file? Use case:

$ qemu-system-tricore -M tricore_testboard -kernel firmware.bin

firmware.bin won't work, but if you slap an ELF header on the raw instructions, they will parse.

schlafwandler · Answer 32 · Tue Dec 03 2019 01:01:03 GMT+0800 (China Standard Time)

I have written a python script to write back small patches to a copy of the original PE/ELF binary:
https://github.com/schlafwandler/ghidra_SavePatch

It's still experimental and far away from a complete export feature; but if you are only dealing with few and small modified locations it might be a good enough workaround.
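
The general idea of writing small patches back to a copy of the original file can be sketched as follows. This is not the code of the linked script, only an illustration under stated assumptions: the caller already knows which address ranges are backed by which file offsets, and the names `Region` and `apply_patches` are hypothetical.

```python
from typing import Dict, List, Tuple

# (start address, file offset, length in file) for each file-backed range.
Region = Tuple[int, int, int]


def apply_patches(
    original: bytes, regions: List[Region], patches: Dict[int, bytes]
) -> bytes:
    """Return a copy of `original` with each patch written at the file
    offset that corresponds to its memory address."""
    out = bytearray(original)
    for addr, new_bytes in patches.items():
        for start, offset, length in regions:
            # The whole patch must fit inside one file-backed range.
            if start <= addr and addr + len(new_bytes) <= start + length:
                pos = offset + (addr - start)
                out[pos:pos + len(new_bytes)] = new_bytes
                break
        else:
            raise ValueError(f"address {addr:#x} is not backed by file bytes")
    return bytes(out)
```

Because only the patched bytes change and the file length stays the same, headers and section layout are left untouched, which is why this approach works for small in-place modifications but not for edits that grow or move code.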

Ghidra1 · Answer 33 · Thu Feb 27 2020 03:18:36 GMT+0800 (China Standard Time)

Note that the Binary export is not broken, it is simply misunderstood. This exporter simply dumps the initialized memory blocks defined within Ghidra in binary form. The blocks are appended sequentially. It was never intended to recreate a loadable/executable binary. While this is certainly a desirable feature, it does not yet exist within Ghidra.

The binary exporter can provide a means of exporting a selected memory region as binary such that it can be subsequently added to another Ghidra program. This can be useful if a specific section needs to be unpacked and loaded to a specific memory address within Ghidra.
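
The behavior described above can be modeled with a toy example. This is a simplified sketch, not Ghidra's implementation: the `Block` type, the `dump_blocks` function and the sample bytes are all made up. It shows why appending initialized blocks sequentially reproduces the input for a Raw Binary import (one block holding the whole file) but not for a format-aware import (several blocks placed at addresses, with headers and padding not carried over).

```python
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    name: str
    start: int             # memory address where the block was placed
    data: Optional[bytes]  # None models an uninitialized block


def dump_blocks(blocks: List[Block]) -> bytes:
    """Append initialized blocks back to back, discarding their addresses."""
    return b"".join(b.data for b in blocks if b.data is not None)


# Toy "file": a 4-byte header, 4 bytes of padding, then code and data.
original_file = b"HDR!" + b"\x00" * 4 + b"CODE" + b"DATA"

# Format-aware import (toy model): contents are split into blocks at addresses.
loaded_blocks = [
    Block(".text", 0x401000, b"CODE"),
    Block(".data", 0x402000, b"DATA"),
    Block(".bss", 0x403000, None),
]

# Raw import (toy model): the whole file becomes one block at address 0.
raw_blocks = [Block("ram", 0x0, original_file)]

exported_loaded = dump_blocks(loaded_blocks)
exported_raw = dump_blocks(raw_blocks)
```

In this model `exported_raw` equals `original_file`, while `exported_loaded` does not, since nothing in the dump records where each block belongs in the file.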

Woodstock · Answer 34 · Thu Feb 27 2020 05:53:49 GMT+0800 (China Standard Time)

@ghidra1 understood thanks.

However this means Ghidra can’t be used to patch a binary. Which is the primary reason I use IDA.

Otherwise the use is restricted to inspecting methods and watching output.

Ghidra1 · Answer 35 · Sat Feb 29 2020 07:37:28 GMT+0800 (China Standard Time)

The request is a reasonable improvement, although as an analysis tool re-writing binaries has not been a high priority.

I am closing this ticket and deferring the issue to #1505 as an enhancement/improvement.

Ryan Kurtz · Answer 36 · Wed Mar 24 2021 19:37:56 GMT+0800 (China Standard Time)

#1505 was merged in, so there are now PE and ELF exporters that behave how you thought the Binary exporter should have behaved. No changes were made to the Binary exporter...that one still just dumps the memory image to a file as-is. The PE and ELF exporters can be used to save basic byte modifications back to a new runnable PE/ELF file.

Here is the help for the PE exporter: