avast / retdec

RetDec is a retargetable machine-code decompiler based on LLVM.

Home Page: https://retdec.com/

C++ x64 binary decompilation error (simple sum method)

ogre2007 opened this issue

Retdec version: v4.0-438-gaecb4d08
Platform: Ubuntu x64
Binary: x64 ELF
Compiler: GCC 9.4.0

Consider this simple C++ program:

#include <stdio.h>

class A {
    public:
        int a;
        int b;
        A(int aparam, int bparam) {a = aparam; b=bparam;}
        int sum(){return a + b;}
};

int main() {
    A a1(1, -1);
    printf("%d\n", a1.sum());
}

GCC compiles the method A::sum to:

; function: _ZN1A3sumEv at 0x1212 -- 0x122f
0x1212:   f3 0f 1e fa                   endbr64
0x1216:   55                            push rbp
0x1217:   48 89 e5                      mov rbp, rsp
0x121a:   48 89 7d f8                   mov qword ptr [rbp - 8], rdi
0x121e:   48 8b 45 f8                   mov rax, qword ptr [rbp - 8]
0x1222:   8b 10                         mov edx, dword ptr [rax] ; A::a
0x1224:   48 8b 45 f8                   mov rax, qword ptr [rbp - 8]
0x1228:   8b 40 04                      mov eax, dword ptr [rax + 4] ; A::b
0x122b:   01 d0                         add eax, edx ; a + b
0x122d:   5d                            pop rbp
0x122e:   c3                            ret

RetDec lifts it to the following LLVM IR:

define i64 @_ZN1A3sumEv(i64* %result) local_unnamed_addr {
dec_label_pc_1212:
  %0 = alloca i64
  %1 = load i64, i64* %0 ; int64_t local
  %2 = ptrtoint i64* %result to i64
  %3 = trunc i64 %1 to i32 ; (int32_t) local
  %4 = add i64 %2, 4, !insn.addr !38
  %5 = inttoptr i64 %4 to i32*, !insn.addr !38
  %6 = load i32, i32* %5, align 4, !insn.addr !38
  %7 = add i32 %6, %3, !insn.addr !39 ; A::b + (int32_t) local
  %8 = zext i32 %7 to i64, !insn.addr !39 
  ret i64 %8, !insn.addr !40
}

This function is then decompiled to:

// Address range: 0x1212 - 0x122f
// Demangled:     A::sum()
int64_t _ZN1A3sumEv(int64_t * result) {
    int32_t v1 = *(int32_t *)((int64_t)result + 4); // 0x1228
    int64_t v2; // 0x1212
    return v1 + (int32_t)v2;
}

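which is obviously wrong: v2 is read without ever being assigned.
As you can see, the error originates in the lifted IR, where the instructions that dereference the pointer at result+0 (the load of A::a at 0x1222) are simply not generated. The uninitialized alloca takes the place of that load.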
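
For comparison, here is a hand-written sketch of what the machine code at 0x1212 computes, following the disassembly instruction by instruction. It is not RetDec output. The struct layout (two 32-bit ints at offsets 0 and 4) and the name `sum_reference` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for class A from the report: two 32-bit members.
struct A {
    std::int32_t a;  // offset 0
    std::int32_t b;  // offset 4
};
static_assert(sizeof(A) == 8, "sketch assumes a packed two-int layout");

// Reference semantics of _ZN1A3sumEv, written by hand from the disassembly.
std::int64_t sum_reference(const void *self) {
    assert(self != nullptr);
    const unsigned char *base = static_cast<const unsigned char *>(self);
    std::int32_t a, b;
    std::memcpy(&a, base, sizeof a);      // 0x1222: mov edx, dword ptr [rax]
    std::memcpy(&b, base + 4, sizeof b);  // 0x1228: mov eax, dword ptr [rax + 4]
    // 0x122b: add eax, edx (32-bit wraparound); a write to eax zero-extends into rax
    std::uint32_t sum = static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    return static_cast<std::int64_t>(sum);
}
```

A correct lift would contain two loads, one for each `memcpy` above. The IR in the report contains only the second.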

It looks like the decompiler is mishandling the access to the first member of class A in the lifted IR. A has two member variables, a and b, and sum is supposed to return a + b. The decompiled code reads b correctly from offset 4, but instead of loading a from offset 0 it uses a local variable, v2, that is never initialized.

One possible reason is that the decompiler does not correctly identify the class layout and cannot determine the offsets of the member variables, although the offset of b (4) is resolved correctly here. It is also possible that some other problem in the lifted IR causes the decompiler to generate incorrect code.

To troubleshoot this, it may help to examine the lifted IR in more detail for clues about where the load at offset 0 is lost. You could also run the same binary through a different decompiler or disassembler and compare the results. Finally, try compiling and decompiling a reduced test case with a single class that has a single member variable, to see whether the decompiler handles that correctly. This can help narrow down the cause.
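
A reduced test case along those lines might look like the sketch below. The class `B`, its method `get`, and the helper `single_member_probe` are hypothetical names, not part of the original report. The only member sits at offset 0, which is the access missing from the lifted IR above.

```cpp
// Hypothetical reduced class: a single member, so the only field load
// in B::get should be from [this + 0].
class B {
public:
    int a;
    explicit B(int aparam) { a = aparam; }
    int get() { return a; }
};

// Small driver so the constructor and the getter are both emitted and called.
int single_member_probe(int value) {
    B b(value);
    return b.get();
}
```

Call `single_member_probe` from a `main` that prints the result, build it the same way as the original binary, and check whether the lifted IR for `B::get` contains a load through its pointer argument.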