btf: CO-RE: types from (other) kmod BTF are not available
tpapagian opened this issue · comments
cilium/ebpf
can detect if a symbol is part of a kmod and load it automatically. This is done here by checking the function name that the user wants to attach in /proc/kallsyms
.
This does not work as expected when we want to relocate a symbol that is part of a kmod but the attach functions is not.
As an example, in my system, overlay
is build as a module, and its BTF resides in /sys/kernel/btf/overlay
. The following program attaches to ovl_inode_lower
, a function from overlay
kmod. cilium/ebpf
will check that and load overlay
's BTF. This will also be used to evaluate bpf_core_type_exists(struct ovl_entry)
, the verifier will remove the infinite loop, and the program will load as expected.
SEC("fentry/ovl_inode_lower")
int BPF_PROG(ovl_inode_lower, struct inode *inode)
{
if (!bpf_core_type_exists(struct ovl_entry))
while(1) {}
return 0;
}
On the other hand, the following program attaches to a function that is not part of any kmod. Thus cilium/ebpf
will not load overlay
's BTF. bpf_core_type_exists(struct ovl_entry)
will evaluate to 0, the infinite loop will not be removed by the verifier and the program fails to load (even the kmod BTF exists).
SEC("fentry/security_file_open")
int BPF_PROG(security_file_open, struct file *file)
{
if (!bpf_core_type_exists(struct ovl_entry))
while(1) {}
return 0;
}
These are tested with these programs with cilium/ebpf v0.15.0 and the output is:
$ sudo ./main
Program ovl_inode_lower loaded successfully
Failed to load program security_file_open with error:
ebpf.NewProgram: load program: invalid argument:
0: R1=ctx() R10=fp0
; int BPF_PROG(security_file_open, struct file *file)
0: (b7) r1 = 0 ; R1_w=0
; if (!bpf_core_type_exists(struct ovl_entry))
1: (55) if r1 != 0x0 goto pc+1 ; R1_w=0
; while(1) {}
2: (05) goto pc-1
2: (05) goto pc-1
2: (05) goto pc-1
2: (05) goto pc-1
2: (05) goto pc-1
2: (05) goto pc-1
infinite loop detected at insn 2
cur state: R1=0 R10=fp0
old state: R1_w=0 R10=fp0
processed 9 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0
A way to overcome this limitation is to manually load the overlay BTF with a function similar to this one and provide its output to ProgramOptions.KernelTypes. This makes the second program to load as well.
I wouldn't consider this a bug. If you are using types from a kmod, but not attaching to a function within that kmod, how should it know where the types come from? There are too many kmods to go searching.
IMHO specifying KernelTypes
is the desired way to handle this.
I am not saying that this is strictly a bug, I would say that this is more of an observation.
But I believe that this could be handled better in cilium/ebpf
. I am using KernelTypes to handle that now and I am using a function similar to https://gist.github.com/tpapagian/ca1f7bf281efe23f284aed6c7910f9be which traverses all BTFs.
I am thinking of adding a new field in ProgramOptions
, such as LoadKernelModules []string
where the user provides the possible kmod BTFs and internally load all the required modules if they are available. Or change KernelModuleTypes
to load those modules at any time and not only when the symbol that the user attach is part of a kmod. I am not fully familiar of the way that cilium/ebpf
handles BTFs but this can possibly done more efficient compared to using this function.
I will check those further and I will create a PR with proposed any changes.
I asked @tpapagian to create an issue because this is real-world proof that people want access to other kmod BTF in CO-RE, something we went around the block a few times when doing the initial implementation. Agree that this isn't strictly a bug, but the lib should make this easier somehow.