iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more

`biotop` and `biosnoop` do not work under 5.19 kernel due to missing `blk_account_io_start` kprobe

haozhangphd opened this issue

On the latest Arch Linux installation, and on Fedora 36 with the 5.19 kernel, the kprobe targets blk_account_io_start and __blk_account_io_start are both missing, as seen in the /proc/kallsyms file. As a result, both biotop and biosnoop fail to start with the following error message:
Exception: Failed to attach BPF program b'trace_pid_start' to kprobe b'blk_account_io_start', it's not traceable (either non-existing, inlined, or marked as "notrace")
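For anyone who wants to confirm this on their own machine, here is a minimal sketch that parses /proc/kallsyms-style text and reports which of the wanted symbols are absent. The helper names `kallsyms_symbols` and `missing_symbols` are made up for illustration and are not part of bcc. Note that a symbol being listed does not guarantee it is kprobe-able (it may be marked notrace), and compilers sometimes emit suffixed variants such as `.isra.0` that this exact-match check does not account for.

```python
def kallsyms_symbols(text):
    """Return the set of symbol names found in /proc/kallsyms-style text."""
    names = set()
    for line in text.splitlines():
        parts = line.split()
        # Each line looks like "<address> <type> <name> [module]".
        if len(parts) >= 3:
            names.add(parts[2])
    return names


def missing_symbols(wanted, text):
    """Return the wanted symbols that do not appear in the kallsyms text."""
    present = kallsyms_symbols(text)
    return [name for name in wanted if name not in present]


# Usage on a live system (hypothetical invocation):
#   with open("/proc/kallsyms") as f:
#       print(missing_symbols(
#           ["blk_account_io_start", "__blk_account_io_start"], f.read()))
```

On an affected kernel, both names would be expected to show up in the returned list.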

This is correct, but it is caused by Linux kernel commit
450b7879e34517c3ebc3a35a53806fe40e60fac2, which is present from 5.17 onwards.

Kernel developers do not guarantee the stability of traceable symbols the way they do for the Linux ABI. This is not an issue with bcc but with the Linux kernel.

A possible workaround is to revert the commit mentioned above and compile the kernel manually.
Alternatively, changing static inline void to void would resolve this.

As this is not related to bcc, please close this issue.
If you need more clarification, please ask here.
I will try my best to answer.

I think bcc needs to be kept up-to-date with respect to the kernel, and not the other way around. Manually reverting kernel commits in order to make bcc work is not practical.

In the past, bcc has always kept pace with changes to the kernel's tracing symbols, as evidenced by, for example, 95c9229 and 97c2076. Thus incompatibility with the latest kernel is indeed a bcc issue.

Hi.

> the kprobes blk_account_io_start as well as __blk_account_io_start are missing

I do not know about biosnoop, but the CO-RE version of biotop normally handles the case you point out.

This is indeed not a solution to the problem you point out, but if you need to track block I/O you can, for the moment, use the CO-RE version instead of the standard one.

Best regards.

> I do not know about biosnoop, but the CO-RE version of biotop normally handles the case you point out.

From the source code, it seems the CO-RE version depends on the same two kprobes, __blk_account_io_start or blk_account_io_start, as the Python version? On newer kernels without either of these two kprobes, it seems I cannot use the CO-RE version either...

> > I do not know about biosnoop, but the CO-RE version of biotop normally handles the case you point out.
>
> From the source code, it seems the CO-RE version depends on the same two kprobes, __blk_account_io_start or blk_account_io_start, as the Python version? On newer kernels without either of these two kprobes, it seems I cannot use the CO-RE version either...

Sorry, I read too quickly.
After taking a look at the commit quoted above, it indeed seems we cannot really do anything about this problem...
Maybe another workaround would be to mark the __ functions as noinline so we can probe them.

I tried manually patching the kernel to make __blk_account_io_start noinline, and biotop indeed works. However, I don't think this is a practical solution for most users, as these bio tools will be unusable for anyone running a kernel newer than 5.17. Is there any other kprobe that can be used besides __blk_account_io_start for a similar purpose?

> Is there any other kprobe that can be used besides __blk_account_io_start for a similar purpose?

Offhand, I do not think so, but maybe someone here has a better idea than me.
Nonetheless, I think sending your patches that add noinline to the __ functions to the upstream kernel mailing list could be a good contribution.
What do you think?

Maybe we can add a tracepoint for it. I am working on it.

> I am working on it.

If you need a review, you can cc me either here or at "flaniel at linux dot microsoft dot com" on the upstream kernel mailing list.

Is there a temporary workaround for this issue? Also for the newer 6.0 and 6.1-rc kernels?

> Is there a temporary workaround for this issue? Also for the newer 6.0 and 6.1-rc kernels?

To my knowledge, there is no temporary workaround.
But you can help review this kernel patch so it can be merged soon.

Indeed, a tracepoint is the best option here. There might be an alternative to blk_account_io_start(), but it might produce a different result, or that function might be inlined as well. So ultimately, as @chenhengqi suggested, a tracepoint might be the best long-term solution. I see one tracepoint, block/block_bio_complete, but I didn't dig into whether it is an appropriate replacement or not.
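To make the "prefer a kprobe, fall back to a tracepoint" idea concrete, here is a rough sketch of the selection logic only. The function `pick_attach_point` is a hypothetical helper, not bcc code, and the tracepoint name in the usage comment is a placeholder, since the tracepoint being discussed had not been merged at the time of this thread.

```python
def pick_attach_point(available, kprobe_candidates, tracepoint=None):
    """Choose where to attach.

    available         -- set of function names that can be kprobed
    kprobe_candidates -- names to try, in order of preference
    tracepoint        -- optional "category:name" fallback (assumed to exist)

    Returns ("kprobe", name), ("tracepoint", tracepoint), or None.
    """
    for name in kprobe_candidates:
        if name in available:
            return ("kprobe", name)
    if tracepoint is not None:
        return ("tracepoint", tracepoint)
    return None


# Hypothetical usage; "block:some_io_start" is a placeholder name:
#   pick_attach_point(available,
#                     ["__blk_account_io_start", "blk_account_io_start"],
#                     tracepoint="block:some_io_start")
```

In a real bcc tool, the `available` set could probably be built with `BPF.get_kprobe_functions()`, which the existing Python tools already use to choose between the two kprobe names. A tracepoint and a kprobe do not expose the same arguments, so the BPF program behind each attach point would still need its own handler.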

Unfortunately, it seems that folio_account_dirtied also has this problem (in libbpf-tools/cachestat) on v6.1:

$ objdump -d page-writeback.o | grep folio_account_dirtied
# got nothing
$ sudo bpftrace -l | grep folio_account_dirtied
# got nothing

This causes:

$ sudo ./cachestat 
libbpf: prog 'kprobe_account_page_dirtied': failed to create kprobe 'account_page_dirtied+0x0' perf event: No such file or directory
libbpf: prog 'kprobe_account_page_dirtied': failed to auto-attach: -2
failed to attach BPF programs

Sorry, my mistake: there is a tracepoint:writeback:writeback_dirty_folio. I closed the PR (#4482), and I'll submit a new one.
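As a quick way to check for a tracepoint like this without bpftrace, the sketch below looks for its directory under tracefs. The helper name `tracepoint_exists` is made up for illustration. The two roots listed are the usual tracefs mount points, but the mount location can differ per system, and reading it typically requires root.

```python
import os

# Common tracefs mount points; a given system may use only one of them.
TRACEFS_ROOTS = ("/sys/kernel/tracing", "/sys/kernel/debug/tracing")


def tracepoint_exists(category, name, roots=TRACEFS_ROOTS):
    """Return True if events/<category>/<name> exists under any tracefs root."""
    for root in roots:
        if os.path.isdir(os.path.join(root, "events", category, name)):
            return True
    return False


# Hypothetical usage (run as root):
#   tracepoint_exists("writeback", "writeback_dirty_folio")
```

A tool could use a check like this to decide at startup whether to attach to the tracepoint or to fall back to the older kprobe.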