iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more

bpf_probe_read_user returns error (-14) on Android 11, Kernel 4.14, ARM64

vadimkotov opened this issue · comments

Hi folks,

I'm running BCC on Android 11, Kernel 4.14, ARM64 and bpf_probe_read_user returns an error (-14). As a result opensnoop returns an empty filename string.

I'm aware of #2253 and implemented the solution that supposedly should fix the issue (https://lkml.org/lkml/2019/5/2/1100) to no avail.

#3094 is somewhat related and the solution there looks to be similar to the above (i.e. adding bpf_probe_read_user from the later version).

Has anyone encountered something like that, or does anyone know how I can get more information about the issue?

Cheers,
Vadim

Small update, I've found out that despite having bpf_probe_read_user in the program, underneath it still calls bpf_probe_read, which in turn calls probe_kernel_read which results in -EFAULT because it is trying to access a __user pointer.

@vadimkotov For a user-provided bpf_probe_read_user, bcc only converts it to bpf_probe_read if the underlying kernel does not support bpf_probe_read_user. bcc decides whether the kernel supports it by looking for bpf_probe_read_kernel in /proc/kallsyms. Could you check whether this is the case on your system?

Hi @yonghong-song, thank you for the reply!

Originally kernel 4.14 didn't have it, but I've added it using this patch, and I can see it in kallsyms:

root@localhost:/# cat /proc/kallsyms | grep bpf_probe_read_user
00000000b88f1f50 T bpf_probe_read_user
000000005b789394 r bpf_probe_read_user_proto
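
For reference, the symbol lookup described above can be reproduced in user space. The following is only a minimal sketch, not bcc's actual code: the helper names `parse_kallsyms` and `has_helper` are hypothetical, and the sample text is the output quoted in this comment. Note that, per the maintainer's reply, bcc keys its decision on bpf_probe_read_kernel, which this sample does not contain.

```python
def parse_kallsyms(text):
    """Return {symbol_name: type_letter} for 'address type name [module]' lines."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            symbols[fields[2]] = fields[1]
    return symbols


def has_helper(text, name):
    """True if `name` is listed as a text (code) symbol, i.e. type 'T' or 't'."""
    return parse_kallsyms(text).get(name, "").lower() == "t"


# Sample copied from the /proc/kallsyms output quoted in this thread.
SAMPLE = """\
00000000b88f1f50 T bpf_probe_read_user
000000005b789394 r bpf_probe_read_user_proto
"""

user_present = has_helper(SAMPLE, "bpf_probe_read_user")
kernel_present = has_helper(SAMPLE, "bpf_probe_read_kernel")
```

On a live system the same functions could be fed the contents of /proc/kallsyms instead of `SAMPLE`.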

Hi @vadimkotov
Regarding the issue you mentioned above, I also encountered this before while running BCC on an Android 4.14/armhf platform.

I found that check_bpf_probe_read_user() in b_frontend_action.cc calls bcc_symcache_resolve_name() to determine whether the kernel provides the bpf_probe_read_user helper. bcc_symcache_resolve_name() returns 0 if the symbol is found and -1 if it does not exist, but here a bool found takes the return value, which causes the issue. I hope I didn't misunderstand what's expected here. Does that make sense to you?
check_bpf_probe_read_user()

Sorry for my carelessness, I missed the ternary operator at the end.

Hi @ismhong, thank you for pointing to the code!

check_bpf_probe_read_user looks fine to me: the boolean found is assigned true when the return value from bcc_symcache_resolve_name is greater than or equal to zero (see b_frontend_action.cc#L123 ).
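
To illustrate the point being discussed, here is a minimal sketch of the return-value convention described above (0 if found, -1 if not). It is a stand-in, not bcc's code: `resolve_name`, `found_naive` and `found_checked` are hypothetical names. It shows why converting the return value directly to a boolean would invert the result, and why comparing against zero does not.

```python
def resolve_name(symbols, name):
    """Stand-in using the convention described above: 0 if found, -1 otherwise."""
    return 0 if name in symbols else -1


def found_naive(ret):
    """Direct conversion to bool: 0 (found) becomes False, -1 becomes True."""
    return bool(ret)


def found_checked(ret):
    """Comparison against zero, as in the ternary mentioned above."""
    return ret >= 0


SYMBOLS = {"bpf_probe_read", "bpf_probe_read_user"}
ret_hit = resolve_name(SYMBOLS, "bpf_probe_read_user")
ret_miss = resolve_name(SYMBOLS, "bpf_probe_read_kernel")
```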

Unless I'm missing a subtle error here, this means that bpf_probe_read_user should be successfully resolved, as I can see it in /proc/kallsyms. One hypothesis is that somehow bpf_probe_read_user is never even passed to check_bpf_probe_read_user. The place to start the search seems to be b_frontend_action.cc#L1097; maybe (and it's speculation at this point) the correct name never makes it to Decl.

It looks like in my case bcc_symcache_resolve_name correctly resolves the symbol. But I don't see the return value of check_bpf_probe_read_user being used further in the code at b_frontend_action.cc#L1097. At this point I'm not sure if check_bpf_probe_read_user has any effect on the final eBPF program other than detecting potential overlapping address space.

Does anyone happen to know a place in BCC's pipeline where the bpf_probe_read_user call could be overwritten and swapped back to bpf_probe_read?

Found this bit: https://github.com/iovisor/bcc/blob/master/src/cc/frontends/clang/b_frontend_action.cc#L1678

If check_bpf_probe_read_kernel returns the fallback function name bpf_probe_read, then probefunc is assigned a set of #define statements, which along with re-defining bpf_probe_read_kernel and bpf_probe_read_kernel_str also re-define bpf_probe_read_user[_str].

Since I've only ported over bpf_probe_read_user from Kernel 5.5, it can't resolve bpf_probe_read_kernel and bpf_probe_read_user gets re-defined to bpf_probe_read along the way.
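
The behaviour described above can be modelled roughly as follows. This is a sketch of the decision as described in this comment, not a copy of b_frontend_action.cc: the function name `fallback_defines` is hypothetical, and the targets of the two `_str` defines are an assumption (only the `bpf_probe_read_user` define is quoted verbatim in this thread).

```python
def fallback_defines(kernel_symbols):
    """Model of the fallback: no bpf_probe_read_kernel means every variant is remapped."""
    if "bpf_probe_read_kernel" in kernel_symbols:
        return []  # helpers assumed usable, nothing is re-defined
    return [
        "#define bpf_probe_read_kernel bpf_probe_read",
        # Assumption: the _str variants map to bpf_probe_read_str.
        "#define bpf_probe_read_kernel_str bpf_probe_read_str",
        "#define bpf_probe_read_user bpf_probe_read",
        "#define bpf_probe_read_user_str bpf_probe_read_str",
    ]


# A kernel where only bpf_probe_read_user was backported, as in this comment.
partial_backport = {"bpf_probe_read", "bpf_probe_read_str", "bpf_probe_read_user"}
defines = fallback_defines(partial_backport)
```

With only bpf_probe_read_user backported, the model still emits the define that maps it back to bpf_probe_read, which matches the symptom reported here.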

If I comment out "#define bpf_probe_read_user bpf_probe_read\n" I get:

52: (85) call unknown#112
invalid func unknown#112

HINT: bpf_probe_read_user missing (added in Linux 5.5)

I'll try porting bpf_probe_read_kernel to my kernel, see if it works, and update this ticket. But I wonder why commenting out the #define didn't work.

Nope, simply porting over bpf_probe_read_kernel didn't work.

55: (85) call unknown#113
invalid func unknown#113

HINT: bpf_probe_read_kernel missing (added in Linux 5.5).

Yet it is visible on /proc/kallsyms, hm...

root@localhost:/# cat /proc/kallsyms | grep bpf_probe_read
00000000b999e088 T bpf_probe_read
00000000335ef2c3 T bpf_probe_read_user
00000000603879c2 T bpf_probe_read_kernel
00000000265ef128 T bpf_probe_read_str
000000008b6e651c r bpf_probe_read_proto
00000000cd72fd0d r bpf_probe_read_user_proto
00000000416586c9 r bpf_probe_read_kernel_proto
0000000019c1a15e r bpf_probe_read_str_proto

At some point there was a kernel version check in b_frontend_action.cc (link). I wonder, has it by any chance been moved elsewhere in the code?

Hey folks,

I kind of solved it for my specific case with a simple hack which seems to work. Namely, instead of porting over bpf_probe_read_user, I simply expanded the bpf_probe_read function (from kernel/trace/bpf_trace.c) so that it looks like this:

BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr)
{
        int ret;

        ret = probe_kernel_read(dst, unsafe_ptr, size);
        if (unlikely(ret < 0)) {
                // Beginning of added code
                ret = probe_user_read(dst, unsafe_ptr, size);
                if (unlikely(ret < 0))
                // end of added code
                        memset(dst, 0, size);
        }

        return ret;
}

Explanation of the code above

Since BCC can't see bpf_probe_read_[user|kernel] when I port them manually, I tried a different idea. I know that in the absence of bpf_probe_read_[user|kernel] BCC falls back to bpf_probe_read. So I changed bpf_probe_read so that if probe_kernel_read fails, it tries reading unsafe_ptr as a user-space pointer (i.e. it calls probe_user_read).

This fixed my issue and now tools like opensnoop display the path correctly.
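
The control flow of the hack can be modelled in user space as a sanity check. This is only a sketch under simplifying assumptions: the two address spaces are plain dictionaries, `_read` stands in for probe_kernel_read/probe_user_read, and the names `patched_probe_read`, `KERNEL_MEM` and `USER_MEM` are hypothetical.

```python
import errno

EFAULT = -errno.EFAULT  # the -14 from the issue title


def _read(memory, addr, size):
    """Stand-in for a probe_*_read call: bytes on success, None on a fault."""
    data = memory.get(addr)
    if data is None or len(data) < size:
        return None
    return data[:size]


def patched_probe_read(size, addr, kernel_mem, user_mem):
    """Return (ret, buffer): try the kernel read first, then the user read."""
    data = _read(kernel_mem, addr, size)
    if data is None:
        data = _read(user_mem, addr, size)  # the added fallback
    if data is None:
        return EFAULT, bytes(size)          # zeroed buffer, like memset(dst, 0, size)
    return 0, data


# Toy address spaces: one kernel buffer and one user-space filename.
KERNEL_MEM = {0xFFFF0000: b"kern"}
USER_MEM = {0x7000: b"/etc/passwd"}
```

In this model a user pointer such as 0x7000 is read successfully through the fallback, and an address present in neither space still yields -14 with a zeroed buffer.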

But the fact that this hack worked tells me that previously BCC couldn't "see" bpf_probe_read_[user|kernel] despite them being available in the kernel. I don't know if this is something that BCC developers would consider worth fixing, so I'm leaving this ticket open for now. Otherwise please feel free to close this ticket. If you have any more questions about the issue or the "fix", I'll be happy to help.

Thanks all who replied to the ticket, it helped me narrow down the issue.

@vadimkotov Thanks for your explanation. Your case is a little unusual since you did manual backports. I think your bcc should work if you remove the kernel version check in b_frontend_action.cc, but your kernel change above should work too. The reason is that the newer implementation of probe_kernel_read checks the address space and returns failure if the address belongs to user space. That is why probe_kernel_read should work in earlier kernel versions, but returns failure in later kernels if the address belongs to user space.

same problem on Debian9 kernel 4.19 amd64

Changing to bpf_probe_read_str gives the same result.

# cat /proc/kallsyms | grep bpf_probe_read
ffffffff81177440 T bpf_probe_read
ffffffff811774d0 T bpf_probe_read_str
ffffffff81c39ac0 r bpf_probe_read_str_proto
ffffffff81c39cc0 r bpf_probe_read_proto

same problem on Debian9 kernel 4.19 amd64

Could you share what exactly the problem is? A reproducible code is the best so people here can really help you.

I recorded my question here. The problem is not resolved.
cilium/ebpf#419

@davemarchevsky could you help take a look?

Thank you for your patience. I have submitted the demo code; the problem shows up if I run it directly.

https://github.com/hz-kelpie/bpf-demo/tree/main/tracepoint-execve

@hz-kelpie Could you try explicitly using bpf_probe_read_user and bpf_probe_read_user_str in your arg loop? If that doesn't work, could you try running the execsnoop.py tool from this repo on your system to see if it fails similarly?

Aside from that, I can help more directly if you modify the BPF program in your example such that I can load/attach it with bpftool instead of using the go wrapper stuff: remove perf buffer, change perf buffer writes to printk, minimize extraneous stuff. A more fleshed out version of the example you gave in cilium/ebpf#419 would be ideal.

Thank you, you gave me hope again.
Using bpf_probe_read_user and bpf_probe_read_user_str does not work.
I have thinned down my demo here: https://github.com/hz-kelpie/bpf-demo/tree/main/testdata
I suspect that switching namespaces triggers this problem.

execsnoop.py works fine, but I can't use bcc because of production environment limits.

The linked demo is using a Go linker/loader and userspace component which I don't have any experience with. Could you make some small changes to your program so that I can load it using bpftool?

There is an execsnoop in the libbpf-tools directory which uses plain libbpf. Could you try comparing its functionality to yours in your environment?

Thank you for your patience, I will make changes to my program later.
For now I have mostly ruled out a problem in my own project, because bpftrace hits the same problem.

This is what my program prints:

[INFO] Pid: 9444 <Cmdline> grep -E -v /sty
[ERROR] bpf_probe_read_str occur error, 9446, cmdline: ping -c 3 -w 3 10.0.0.1 > /dev/null
[INFO] Pid: 9446 <Cmdline> ping -c 3 -w 3 10.0.0.1 > /dev/null
[INFO] Pid: 9447 <Cmdline> ping -c 3 -w 3 10.0.0.1
[ERROR] bpf_probe_read_str occur error, 9448, cmdline: nobody /usr/local/nagent/libexec/get_nic_stats.sh
[INFO] Pid: 9448 <Cmdline> nobody /usr/local/nagent/libexec/get_nic_stats.sh
[ERROR] bpf_probe_read_str occur error, 9449, cmdline: bash /usr/local/nagent/libexec/get_nic_stats.sh
[INFO] Pid: 9449 <Cmdline> bash /usr/local/nagent/libexec/get_nic_stats.sh
[INFO] Pid: 9449 <Cmdline> /usr/local/nagent/libexec/get_nic_stats.sh
[INFO] Pid: 9451 <Cmdline> grep -l 1 /sys/class/net/eth0/carrier /sys/class/net/en*/carrier
[INFO] Pid: 9452 <Cmdline> awk -F / {print $5}
[INFO] Pid: 9455 <Cmdline> xargs -n 1
[INFO] Pid: 9456 <Cmdline> xargs -i ls /sys/class/net/{}/statistics/rx_dropped /sys/class/net/{}/statistics/rx_missed_errors /sys/class/net/{}/statistics/rx_crc_errors /sys/class/net/{}/statistics/tx_dropped /sys/class/net/{}/statistics/tx_errors

When I run bpftrace this way at the same time:

# bpftrace -e 'tracepoint:syscalls:sys_enter_execve {
printf("ENTER>>>> %d ", pid); join(args->argv);
}'

bpftrace prints:

ENTER>>>> 9444 grep -E -v /sty
ENTER>>>> 9446
ENTER>>>> 9447 ping -c 3 -w 3 10.0.0.1
ENTER>>>> 9448
ENTER>>>> 9449 bash
ENTER>>>> 9449 /usr/local/nagent/libexec/get_nic_stats.sh
ENTER>>>> 9451 grep -l 1 /sys/class/net/eth0/carrier /sys/class/net/en*/carrier
ENTER>>>> 9452 awk -F / {print $5}
ENTER>>>> 9455 xargs -n 1
ENTER>>>> 9456 xargs -i ls /sys/class/net/{}/statistics/rx_dropped /sys/class/net/{}/statistics/rx_missed_errors /sys/class/net/{}/statistics/rx_crc_errors /sys/class/net/{}/statistics/tx_dropped /sys/class/net/{}/statistics/tx_errors

Clearly, whenever my program logs an error, bpftrace also prints a blank command line.
I don't know if this is a known problem.

This may be caused by MTE (Memory Tagging Extension).

e.g.

addr = 0xb400007c81c93f60

After masking the address (addr & 0xffffffffff), bpf_probe_read succeeds. Tested on Android 12, kernel 5.10.
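
The arithmetic in this comment can be checked with a short sketch. The address and the 40-bit mask are the ones quoted above; the function names are hypothetical. `clear_top_byte` is an alternative added on the assumption that the tag sits in the top byte of the pointer (bits 63:56), as with arm64 Top Byte Ignore.

```python
TAGGED_ADDR = 0xB400007C81C93F60  # address quoted in the comment above


def mask_40_bits(addr):
    """The mask used in the comment: keep only the low 40 bits."""
    return addr & 0xFFFFFFFFFF


def clear_top_byte(addr):
    """Assumed alternative: drop only bits 63:56, where the tag is taken to live."""
    return addr & ((1 << 56) - 1)


tag = TAGGED_ADDR >> 56
untagged = mask_40_bits(TAGGED_ADDR)
print(hex(tag), hex(untagged))  # prints: 0xb4 0x7c81c93f60
```

For this particular address both masks give the same result, because bits 40 to 55 are already zero. Clearing only the top byte is the more conservative choice if the address might use those bits.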