yegord / snowman

Snowman decompiler

A bug when decompiling inline data

monkbai opened this issue

I got a suspicious result while decompiling a simple example with snowman, so I am reporting it here.
Example code:

1    #include <unistd.h>
2    #include <stdio.h>
3    #include <string.h>
4    int main()
5    {
6        char buf[100] = "hello and welcome~.\n";
7        write(1, buf, strlen(buf));
8        return 0;
9    }

Compiled with gcc 7.3.0:

gcc -fno-stack-protector -no-pie -m32 ./hello.c -o hello32

Line 6 was compiled into:

8048469:	call dword 0x8048390
804846e:	add ebx, 0x1b92
8048474:	mov dword [ebp-0x7c], 0x6c6c6568
804847b:	mov dword [ebp-0x78], 0x6e61206f
8048482:	mov dword [ebp-0x74], 0x65772064
8048489:	mov dword [ebp-0x70], 0x6d6f636c
8048490:	mov dword [ebp-0x6c], 0xa2e7e65
8048497:	mov dword [ebp-0x68], 0x0
804849e:	lea edx, [ebp-0x64]
80484a1:	mov eax, 0x0
80484a6:	mov ecx, 0x13
80484ab:	mov edi, edx
80484ad:	rep stosd 

It seems the hello string was compiled into inline data (the immediate operands of the stores from 0x8048474 to 0x8048497), and snowman failed to recover this string. This is the main function generated by snowman:

int32_t main() {
    void* ebp1;
    int32_t ecx2;
    int32_t eax3;

    ebp1 = reinterpret_cast<void*>((reinterpret_cast<uint32_t>(__zero_stack_offset()) & 0xfffffff0) - 4 - 4);
    __x86_get_pc_thunk_bx();
    ecx2 = 19;
    while (ecx2) {
        --ecx2;
    }
    eax3 = fun_8048300(reinterpret_cast<uint32_t>(ebp1) - 0x7c);
    fun_8048320(1, reinterpret_cast<uint32_t>(ebp1) - 0x7c, eax3, 0x804846e);
    return 0;
}

Yes, the current heuristic recovers only strings at constant addresses.
Unfortunately, there is no place from which the decompiler could get the (full) stack contents at the moment of the function call, so this is not trivial to fix in the current design.