jonhoo / inferno

A Rust port of FlameGraph

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

flamegraph does not contain sys_enter_* calls with params

SergejIsbrecht opened this issue · comments

Version

Inferno 0.11.18

Issue

When using Inferno to process sampled perf script output, some stacktraces a missing in a generated flamegraph.svg

perf record -F499 -e 'syscalls:sys_enter_*' --call-graph dwarf -- command
perf script > out

The generated flamegraph only contains sys calls without any parameters. For example:

File: out

XXXXXX XXXXX [005] 44427.957866: syscalls:sys_enter_sched_yield: 
	      xxxxxxxxxx __sched_yield (/gpl/lib/libc)
	      ...
          xxxxxxxxxx [unknown] (/gpl/lib/libpthread)
	      xxxxxxxxxx [unknown] (/gpl/lib/libc)
XXXX XXXX [005] 44422.848908: syscalls:sys_enter_getuid: 
	      XXXXXXXXXX getuid (/gpl/lib/libc)
...
	      XXXXXXXXXX __libc_start_main (/gpl/lib/libc)
	      XXXXXXXXXX _start (/opt/hmi/native_image/command)

This calls are visible in the flamegraph. On the contrary following entries are missing in the flamegraph.

XXXXXXXXX XXXXX [001] 44422.868561: syscalls:sys_enter_munmap: addr: 0xXXXXX, len: 0xXXXXX
	      XXXXXXX __munmap (/gpl/lib/libc)
          ....
	      XXXXXXX [unknown] (/gpl/lib/libpthread)
	      XXXXXXX [unknown] (/gpl/lib/libc)

Output: inferno-collapse-perf

perf script

XXXXXXXXXXXX XXXXX [003] 44422.850672: syscalls:sys_enter_sched_getaffinity: pid: 0xXXXXXXXXXX, len: 0xXXXXXXXXXX, user_mask_ptr: 0xXXXXXXXXXX
	      XXXXXXXXXX pthread_getaffinity_np (/gpl/lib/libpthread)
	      XXXXXXXXXX pthread_getattr_np (/gpl/lib/libpthread)
          ....
	      XXXXXXXXXX [unknown] (/gpl/lib/libpthread)
	      XXXXXXXXXX [unknown] (/gpl/lib/libc)

inferno-collapse-perf

[WARN  inferno::collapse::perf::logging] Weird stack line: pid: 0xXXXXXXXXXX, len: 0xXXXXXXXXXX, user_mask_ptr: 0xXXXXXXXXXX
[WARN  inferno::collapse::perf::logging] Weird event line: 	      XXXXXXXXXX pthread_getaffinity_np (/gpl/lib/libpthread)
[WARN  inferno::collapse::perf::logging] Weird event line: 	      XXXXXXXXXX pthread_getattr_np (/gpl/lib/libpthread)
...
[WARN  inferno::collapse::perf::logging] Weird event line: 	      XXXXXXXXXX [unknown] (/gpl/lib/libpthread)
[WARN  inferno::collapse::perf::logging] Weird event line: 	      XXXXXXXXXX [unknown] (/gpl/lib/libc)

Perf

perf version 4.4

FlameGraph

Seems like FlameGraph by Brendan Gregg is able to pars perf script properly

  • ./stackcollapse-perf.pl
  • ./flamegraph.pl.

Example

perf 19953 [001] 33076.615333:                   syscalls:sys_enter_ioctl: fd: 0x0000040e, cmd: 0x00002400, arg: 0x00000000
	          112d8f [unknown] (/usr/lib/x86_64-linux-gnu/libc.so.6)
	          30bc56 __evlist__enable+0x96 (inlined)
	          26c177 __cmd_record+0x24b7 (inlined)
	ffffffffffffffff [unknown] ([unknown])

pipewire-pulse  2110 [000] 33076.615351:                syscalls:sys_enter_recvfrom: fd: 0x0000001a, ubuf: 0x56449a86f51c, size: 0x00000014, flags: 0x00000040, addr: 0x00000000, addr_len: 0x00000000
	          12063c __libc_recv+0x1ec (inlined)
	          12063c __libc_recv+0x1ec (inlined)

pipewire-pulse  2110 [000] 33076.615356:              syscalls:sys_enter_epoll_wait: epfd: 0x00000004, events: 0x7ffd9af1c170, maxevents: 0x00000020, timeout: 0xffffffff
	          11eb66 epoll_wait+0x106 (inlined)
	            8a22 [unknown] (/usr/lib/x86_64-linux-gnu/spa-0.2/support/libspa-support.so)
	    56449a7bb8df [unknown] ([heap])

Oh interesting! This suggests we have a corner-case to handle in

let mut by_colons = line[end..].splitn(3, ':').skip(1);
let event = by_colons
.next()
.and_then(|has_event| has_event.rsplit(' ').next());
if let Some(event) = event {
if let Some(ref event_filter) = self.event_filter {
if event != event_filter {
self.stack_filter = StackFilter::Skip;
return;
}
} else {
// By default only show events of the first encountered event type.
// Merging together different types, such as instructions and cycles,
// produces misleading results.
logging::filtering_for_events_of_type(event);
self.event_filter = Some(event.to_string());
}
}
// some event lines _include_ a stack line if the stack only has one frame.
// in that case, the event will be followed by the stack.
let single_stack = if let Some(post_event) = by_colons.next() {
// we need to deal with a couple of cases here:
//
// vote 913 72.176760: 257597 cycles:uppp:
// false 64414 20110.539270: 34467 cycles:u: ffffffff9aa3c8de [unknown] ([unknown])
// false 64414 20110.539270: 34467 cycles: ffffffff9aa3c8de [unknown] ([unknown])
//
// the first should not be handled as a stack, whereas the latter two both should
// the trick is going to be to trim until we encounter a space or a :, whichever
// comes first, and then evaluate from there.
let post_event_start = post_event
.find(|c| c == ':' || c == ' ')
.map(|i| i + 1)
.unwrap_or(0);
let post_event = post_event[post_event_start..].trim();
if !post_event.is_empty() {
// we have a stack!
Some(post_event)
} else {
None
}
} else {
None
};

I suspect it's actually not the arguments that are tripping us up here, but rather the : in the event name (syscalls:sys_enter_ioctl). event ends up getting set to syscalls here

let mut by_colons = line[end..].splitn(3, ':').skip(1);
let event = by_colons
.next()
.and_then(|has_event| has_event.rsplit(' ').next());

and then the remainder (sys_enter_epoll and the arguments) end up in post_event here

// some event lines _include_ a stack line if the stack only has one frame.
// in that case, the event will be followed by the stack.
let single_stack = if let Some(post_event) = by_colons.next() {
// we need to deal with a couple of cases here:
//
// vote 913 72.176760: 257597 cycles:uppp:
// false 64414 20110.539270: 34467 cycles:u: ffffffff9aa3c8de [unknown] ([unknown])
// false 64414 20110.539270: 34467 cycles: ffffffff9aa3c8de [unknown] ([unknown])
//
// the first should not be handled as a stack, whereas the latter two both should
// the trick is going to be to trim until we encounter a space or a :, whichever
// comes first, and then evaluate from there.
let post_event_start = post_event

I'm somewhat short on time these days, so waiting for me to take a look at fixing this would probably take some time, but if you want to take a stab at adding a test case, tracing through the code here to figure out where it gets confused, and then submit a PR with a fix, I'd be happy to try and help guide you through!

I will give it a go, but I write Java most of the time. I have already been looking into the code base, but reading Rust is quite hard to me.

Anyhow, I think it should be easy to fix it after one is familiar with the code base.