yqzhang / Nerve

An awesome server-level profiling infrastructure

Basic PMU profiling by PID

yqzhang opened this issue

Given a list of PIDs and a list of PMU events, report each event's count for each PID over the most recent sample interval.

Noting this here before we forget: libpfm does not cope well with initializing the file descriptors for zombie processes.
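One possible workaround is to skip zombies before setting up any counters for them. The sketch below is only an assumption about how that check could look; `parse_stat_state` and `pid_is_zombie` are hypothetical helpers, not part of Nerve or libpfm. It relies on the state field of `/proc/<pid>/stat`, which is `Z` for a zombie.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Extract the state field from one /proc/<pid>/stat line.
 * The comm field is parenthesised and may itself contain spaces or ')',
 * so the state is located relative to the last ')'.
 * Returns 0 if the line cannot be parsed. */
char parse_stat_state(const char *line)
{
    const char *close;

    assert(line != NULL);
    close = strrchr(line, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return 0;
    return close[2];
}

/* Returns 1 if pid is a zombie, 0 if it is not,
 * and -1 if its state cannot be read (e.g. the process is gone). */
int pid_is_zombie(pid_t pid)
{
    char path[64], line[1024];
    char state;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof(line), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    state = parse_stat_state(line);
    if (state == 0)
        return -1;
    return state == 'Z';
}
```

The sampling loop could then drop any PID for which `pid_is_zombie()` returns a non-zero value before asking libpfm to set up its events. There is still a race if the process exits between the check and the open, so the open itself should keep handling failure.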

I tested the code on the server, and get_pmu_sample() seems to always fail after 3 iterations, crashing inside pfm_terminate(). Here is the dump from gdb.

Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x15) at malloc.c:2970
2970    malloc.c: No such file or directory.
(gdb) bt
#0  __GI___libc_free (mem=0x15) at malloc.c:2970
#1  0x00007ffff7ad88aa in pfm_perf_terminate (this=0x7ffff7dd07e0) at pfmlib_perf_event_pmu.c:793
#2  0x00007ffff7ad46c6 in pfm_terminate () at pfmlib_common.c:795
#3  0x00000000004037b8 in get_pmu_sample (process_info_list=0x7fffffff0640, events=0x6061a0, sample_interval=1000000) at pmu_sample.c:186
#4  0x0000000000401395 in main (argc=1, argv=0x7fffffffe768) at main.c:101

There is an issue with multi-threaded and multi-process workloads: the inherit flag only covers threads or processes that are created while the counter is open. For example, if a fork() happens at time t_0, we cannot get the PMU profile of the child process at a later time t_1, because we create new fds every sample interval and the child already exists by then.
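One way around this might be to open each counter once, keep the fd for the lifetime of the profiled PID, and report the difference between consecutive reads as the value for the interval. The sketch below assumes raw `perf_event_open(2)` rather than the libpfm path used in pmu_sample.c; `struct pid_counter` and the `pid_counter_*` functions are hypothetical names. It still only helps for children forked after the first open.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Hypothetical per-PID state that survives across sample intervals. */
struct pid_counter {
    int fd;         /* perf event fd, opened once and kept open */
    uint64_t last;  /* cumulative count seen at the previous sample */
};

/* Open one counting event for pid on any CPU, with inherit set so that
 * children created from now on are counted too.
 * Returns 0 on success, -1 on failure (errno is set by the syscall). */
int pid_counter_open(struct pid_counter *c, pid_t pid,
                     uint32_t type, uint64_t config)
{
    struct perf_event_attr attr;

    assert(c != NULL);
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;
    attr.config = config;
    attr.inherit = 1;

    c->last = 0;
    c->fd = (int)syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0UL);
    return c->fd < 0 ? -1 : 0;
}

/* Turn a cumulative reading into a per-interval delta and remember it. */
uint64_t pid_counter_advance(struct pid_counter *c, uint64_t now)
{
    uint64_t delta;

    assert(c != NULL);
    delta = now - c->last;
    c->last = now;
    return delta;
}

/* Read the running total from the fd and store the interval delta.
 * With the default read_format, read() yields a single 64-bit count. */
int pid_counter_sample(struct pid_counter *c, uint64_t *delta)
{
    uint64_t now;

    assert(c != NULL && delta != NULL);
    if (read(c->fd, &now, sizeof(now)) != (ssize_t)sizeof(now))
        return -1;
    *delta = pid_counter_advance(c, now);
    return 0;
}
```

For example, `pid_counter_open(&c, pid, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS)` once at startup, then `pid_counter_sample(&c, &delta)` every sample interval, and `close(c.fd)` when the PID goes away.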

However, it seems perf stat works fine in this case, so maybe we can look into its implementation.
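As far as I can tell (not verified against the perf source here), when perf stat attaches to a PID it lists the existing threads under `/proc/<pid>/task` and opens a counter for each one. A rough sketch of that enumeration follows; `parse_tid` and `list_tids` are hypothetical helpers. This would pick up threads that already exist, but not child processes that were forked earlier.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Parse a /proc/<pid>/task entry name into a thread ID.
 * Returns -1 for ".", ".." or anything that is not purely numeric. */
long parse_tid(const char *name)
{
    char *end;
    long tid;

    assert(name != NULL);
    if (name[0] < '0' || name[0] > '9')
        return -1;
    tid = strtol(name, &end, 10);
    return (*end == '\0') ? tid : -1;
}

/* Fill tids[] with up to max thread IDs that currently belong to pid.
 * Returns the number stored, or -1 if the task directory cannot be opened. */
int list_tids(pid_t pid, pid_t *tids, int max)
{
    char path[64];
    struct dirent *ent;
    DIR *dir;
    int n = 0;

    assert(tids != NULL);
    snprintf(path, sizeof(path), "/proc/%ld/task", (long)pid);
    dir = opendir(path);
    if (dir == NULL)
        return -1;
    while (n < max && (ent = readdir(dir)) != NULL) {
        long tid = parse_tid(ent->d_name);
        if (tid > 0)
            tids[n++] = (pid_t)tid;
    }
    closedir(dir);
    return n;
}
```

Each returned TID could then get its own counter with inherit set, so that both the existing threads and anything they spawn later are covered.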