Display More Symbols in Trace
Nokel81 opened this issue
Trace already has access to symbols, but it would be very useful to be able to see the label names when they are used (in the assembly) for jumps, loads, and addressing.
Example:
.s:
mov eax, [some_label + 4]
...
some_label:
dd 3
dd 6
usercorn trace:
L some_label: 0x804a680
mov eax, dword ptr [0x804a680 + 4]
0x8048621: push ebp | esp = 0xbffff7a8 | W bffff7a8
W 0xbffff7c8: f8f7ffbf [.... ] W
R 0xbffff7d0: 74000000 0cb00408 [t....... ] R
W 0xbffff7bf: 74 [t ] W
W 0xbffff7ac: f6860408 [.... ] W
0x8048622: mov ebp, esp | ebp = 0xbffff7a8
0x8048624: push edi | esp = 0xbffff7a4 | W bffff7a4
0x8048625: push esi | esp = 0xbffff7a0 | W bffff7a0
0x8048626: mov esi, ecx | esi = 0x00000001
0x8048628: push ebx | esp = 0xbffff79c | W bffff79c
0x8048629: mov ebx, eax | ebx = 0x0804b00c
+ ebx = _stdout
0x804862b: sub esp, 0xc | eflags = 0x00000084
+ esp = 0xbffff790
0x804862e: test ecx, ecx | eflags = 0x00000000
0x8048630: je 0x8048684 |
buf_write+0x11
0x8048632: mov eax, dword ptr [eax + 8] | eax = 0x00000000 | R 804b014 (_stdout+0x8)
W 0xbffff79c: 04f8ffbf [.... ] W
R 0x0804b014: 00000000 [.... ] R
0x8048635: mov edi, edx | edi = 0xbffff7bf
0x8048637: lea edx, [ecx + eax] | edx = 0x00000001
0x804863a: cmp edx, 0x1000 | eflags = 0x00000081
0x8048640: jle 0x8048659 |
buf_write+0x21
0x8048659: push edx | esp = 0xbffff78c | W bffff78c
0x804865a: push ecx | esp = 0xbffff788 | W bffff788
0x804865b: push edi | esp = 0xbffff784 | W bffff784
0x804865c: lea eax, [ebx + eax + 0x10] | eax = 0x0804b01c
+ eax = _stdout+0x10
0x8048660: push eax | esp = 0xbffff780 | W bffff780
0x8048661: call 0x804889d | esp = 0xbffff77c | W bffff77c
buf_write+0x45
0x804889d: push ebp | esp = 0xbffff778 | W bffff778
W 0xbffff77c: 66860408 [f... ] W
0x804889e: xor edx, edx | edx = 0x00000000
+ eflags = 0x00000044
0x80488a0: mov ebp, esp | ebp = 0xbffff778
0x80488a2: mov eax, dword ptr [ebp + 8] | | R bffff780
0x80488a5: push ebx | esp = 0xbffff774 | W bffff774
0x80488a6: mov ebx, dword ptr [ebp + 0xc] | ebx = 0xbffff7bf | R bffff784
0x80488a9: cmp edx, dword ptr [ebp + 0x10] | eflags = 0x00000095 | R bffff788
0x80488ac: je 0x80488b7 |
memcpy+0x11
0x80488ae: mov cl, byte ptr [ebx + edx] | ecx = 0x00000074 | R bffff7bf
W 0xbffff774: 0cb00408 [.... ] W
R 0xbffff780: 1cb00408 bff7ffbf 01000000 [............ ] R
R 0xbffff7bf: 74 [t ] R
0x80488b1: mov byte ptr [eax + edx], cl | | W 804b01c (_stdout+0x10)
0x80488b4: inc edx | edx = 0x00000001
+ eflags = 0x00000001
0x80488b5: jmp 0x80488a9 |
memcpy+0x1a
0x80488a9: cmp edx, dword ptr [ebp + 0x10] | eflags = 0x00000044 | R bffff788
W 0x0804b01c: 74 [t ] W
0x80488ac: je 0x80488b7 |
I hacked this together. Notice the two assignments to ebx (one for the symbol, one for the address) and the _stdout+0x10 annotation on the memory write:
0x8048629: mov ebx, eax | ebx = 0x0804b00c
+ ebx = _stdout
0x80488b1: mov byte ptr [eax + edx], cl | | W 804b01c (_stdout+0x10)
It would be harder to put symbols in the actual disassembly, because the disassembler backend (capstone) doesn't trivially support symbols.
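One way to approximate symbols in the disassembly without touching the disassembler is to post-process the operand text. The following is only a rough sketch of that idea, not how usercorn or capstone does it: the `label_operands` helper and the `SYMBOLS` table are made up for illustration, and the one entry is the `some_label` address from the example at the top of this issue.

```python
import re

# Hypothetical symbol table (address -> name); the entry comes from the
# "L some_label: 0x804a680" line in the example at the top of this issue.
SYMBOLS = {0x804A680: "some_label"}

HEX_LITERAL = re.compile(r"0x[0-9a-fA-F]+")


def label_operands(disasm, symbols):
    """Replace hex literals that exactly match a symbol address with its name."""

    def swap(match):
        # Leave the literal untouched when no symbol starts at that address.
        return symbols.get(int(match.group(0), 16), match.group(0))

    return HEX_LITERAL.sub(swap, disasm)


print(label_operands("mov eax, dword ptr [0x804a680 + 4]", SYMBOLS))
# prints: mov eax, dword ptr [some_label + 4]
```

This is purely textual, so an immediate that happens to equal a symbol address (for example a constant in a cmp) would also be relabelled. A real implementation would need operand type information to avoid that.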
That is definitely quite good and would be very helpful
https://github.com/lunixbochs/usercorn/tree/sym-ui
Notice that the commit was very simple and the stream UI itself is not very much code. Feel free to tweak it more and offer suggestions, or even skin the whole stream UI yourself to support some specific thing you're doing :)
Usercorn has a very nice execution trace format that the UI uses (the stream UI doesn't need access to the actual running CPU, it does everything using only the trace!). You can also write your own programs to analyze it.
The file format is here: https://github.com/lunixbochs/usercorn/blob/master/go/models/trace/PROTOCOL.md
You can generate trace files with usercorn run -trace -to filename ./binary
You can run the UI on trace files with usercorn trace -pretty filename
You can convert a trace to JSON if you want to easily parse it in Python or something with usercorn trace -json filename
You can convert a trace to drcov for importing into Lighthouse in IDA or something with usercorn trace -drcov coveragefilename filename
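As a starting point for the Python route, here is a small sketch that only inspects the shape of the JSON output. It assumes the output was saved to a file and contains one JSON object per line. That layout is an assumption, not something stated in this thread, and no field names are assumed at all: the script just counts which top-level keys appear. `key_histogram` is a made-up helper name.

```python
import json
import sys
from collections import Counter


def key_histogram(lines):
    """Count how often each top-level key appears across JSON records.

    Assumes one JSON object per line (an assumption about the output layout).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of a file holding the saved JSON output.
    with open(sys.argv[1]) as handle:
        for key, n in key_histogram(handle).most_common():
            print(f"{n:8d}  {key}")
```

Once the key names are known, the same loop can be narrowed to the records of interest.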
After I reinstalled, I get the following error when I try to run it:
Error: Could not identify file magic.
load.go:38 | loader.LoadArch()
load.go:19 | loader.Load()
usercorn.go:167 | go.NewUsercorn()
cmd.go:68 | cmd.NewUsercornCmd.func1()
cmd.go:352 | cmd.(*UsercornCmd).Run()
main.go:10 | run.Main()
launcher.go:45 | cmd.Main()
main.main()
What command are you running?
usercorn run -trace -mtrace -rtrace filename
You only need -trace. -mtrace and -rtrace are implied by trace. filename must not be an executable. Did you overwrite it or something?
It must not be an executable? I think my problem was that it wasn't one. But now I just don't get the new outputs. Is that only for display afterwards, or can run -trace work with them too?
I have run nm on the executable, and it does output symbols.
usercorn run -trace bins/x86.linux.elf should have the new symbol output.
Thanks. I will look later at why some of the mem writes are not showing up on the right side, but it does look like loads into registers are printed.
There are two outputs for memory writes - membatch, which is the hexdump, and mtrace, which is the R / W addrs. I would be very surprised if an assembly instruction accessed memory and there wasn't a corresponding mtrace output for it.
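When reading the membatch hexdump, keep in mind that the bytes appear in memory order, so on little-endian x86 a 32-bit value reads back to front. The sketch below decodes the hex column into words. `decode_words` is a made-up helper, and the sample strings are copied from the trace earlier in this thread.

```python
import struct


def decode_words(hex_bytes):
    """Decode a run of hex bytes into little-endian 32-bit words.

    Trailing bytes that do not fill a whole word are ignored.
    """
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    usable = len(raw) - len(raw) % 4
    return [word for (word,) in struct.iter_unpack("<I", raw[:usable])]


# "W 0xbffff77c: 66860408" from the trace above
print([hex(w) for w in decode_words("66860408")])
# prints: ['0x8048666']

# "R 0xbffff780: 1cb00408 bff7ffbf 01000000" from the trace above
print([hex(w) for w in decode_words("1cb00408 bff7ffbf 01000000")])
# prints: ['0x804b01c', '0xbffff7bf', '0x1']
```

The first value looks like the return address pushed by the call at 0x8048661, and the second line matches the eax and ebx values loaded from the stack a few instructions later.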
So I was talking about the mtrace output. I know that the address is a symbol, but the symbol wasn't printed on the right.
what usercorn outputs:
mov dword ptr [0x804c0fc], eax | | W 804c0fc
what is in the assembly file:
mov [STATIC_Boolean_MAX_VALUE], eax
is STATIC_Boolean_MAX_VALUE an actual symbol that ended up in the file?
usercorn doesn't have any way of seeing the assembly file, it just parses debug information from the executable.
Yes, running nm on the file I get:
0804c0fc D STATIC_Boolean_MAX_VALUE
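For cross-checking addresses from the trace against nm, a small parser is enough. This sketch assumes the default three-column nm layout (address, type letter, name) shown above; `parse_nm` is a made-up helper and the sample text is the single line quoted above.

```python
def parse_nm(text):
    """Map symbol name -> (address, type letter) for default nm output.

    Lines without exactly three columns (such as undefined symbols,
    which have no address) are skipped.
    """
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        addr, kind, name = parts
        table[name] = (int(addr, 16), kind)
    return table


NM_OUTPUT = "0804c0fc D STATIC_Boolean_MAX_VALUE\n"

symbols = parse_nm(NM_OUTPUT)
addr, kind = symbols["STATIC_Boolean_MAX_VALUE"]
print(hex(addr), kind)
# prints: 0x804c0fc D
```

The address matches the `W 804c0fc` in the mtrace column, so the symbol is present in the file and the lookup on the usercorn side is the place to investigate.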
What is D?
"D" / "d": The symbol is in the initialized data section.
You should look into how Go loads symbols from an ELF file and make sure that symbol is there?
You can import fmt and put some prints in the loops in this function https://github.com/lunixbochs/usercorn/blob/master/go/loader/elf.go#L164
In my example, these symbols are in the same area and work fine:
0804b008 D stderr
0804b004 D stdin
0804b000 D stdout
If you run usercorn run -mtrace -v filename it will show you the memory mapping. Make sure the address you're looking at is attributed to the binary.
In my case:
0x8048000-0x804a000 r-x [exe] bins/x86.linux.elf
0x804b000-0x804f000 rw- [exe] bins/x86.linux.elf(0x2000)
0x0804b00c is within the 0x804b000-0x804f000 mapping, so that is what is queried to look up stdout.
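The attribution check described here amounts to a range lookup. A minimal sketch, assuming each mapping is a half-open range [start, end) and using the two mappings listed above; `MAPS` and `find_map` are made-up names, not usercorn's own.

```python
# The two mappings from the listing above: (start, end, protection, description).
MAPS = [
    (0x8048000, 0x804A000, "r-x", "bins/x86.linux.elf"),
    (0x804B000, 0x804F000, "rw-", "bins/x86.linux.elf(0x2000)"),
]


def find_map(addr, maps):
    """Return the mapping that contains addr, or None if it is unmapped."""
    for entry in maps:
        start, end = entry[0], entry[1]
        # Assumption: the end address is exclusive.
        if start <= addr < end:
            return entry
    return None


addr = 0x0804B00C
hit = find_map(addr, MAPS)
print(f"{addr:#x}: {hit[0]:#x}+{addr - hit[0]:#x}")
# prints: 0x804b00c: 0x804b000+0xc
```

If `find_map` returned None, or a mapping that does not belong to the binary, that would explain a missing symbol for the address.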
If I run ./usercorn run -repl bins/x86.linux.elf, I can run maps 0x0804b00c to look up the map as well:
0x8048cb7> maps 0x0804b00c
Memory map:
0x804b000-0x804f000 rw- [exe] bins/x86.linux.elf(0x2000)
0x804b00c: 0x804b000+0xc
And:
0x8048cb7> us:Symbolicate(0x0804b00c, false)
userdata: 0xc420b519b0 "_stdout"
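To illustrate the kind of lookup that yields `_stdout` and `_stdout+0x10`, here is a nearest-preceding-symbol sketch. It is not usercorn's Symbolicate implementation, only an approximation of the idea: the table mixes the nm lines above with the `_stdout` address seen in the trace, symbol sizes are ignored, and `nearest_symbol` is a made-up name.

```python
import bisect

# Assumed symbol table: three entries from the nm listing above, plus
# _stdout at the address shown in the trace (ebx = 0x0804b00c).
SYMS = sorted([
    (0x0804B000, "stdout"),
    (0x0804B004, "stdin"),
    (0x0804B008, "stderr"),
    (0x0804B00C, "_stdout"),
])
ADDRS = [address for address, _ in SYMS]


def nearest_symbol(addr):
    """Return 'name' or 'name+0xoff' for the closest symbol at or below addr."""
    i = bisect.bisect_right(ADDRS, addr) - 1
    if i < 0:
        return ""  # addr is below every known symbol
    base, name = SYMS[i]
    offset = addr - base
    return name if offset == 0 else f"{name}+{offset:#x}"


print(nearest_symbol(0x0804B00C))  # prints: _stdout
print(nearest_symbol(0x0804B01C))  # prints: _stdout+0x10
print(nearest_symbol(0x0804B014))  # prints: _stdout+0x8
```

Because sizes are ignored, any address above the last symbol is attributed to it. A real lookup would presumably bound the search by symbol size or by the containing mapping.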