Potential fd leak in xdp_program__is_attached
AdrianJab opened this issue · comments
Hello.
We have a watchdog class (using libxdp) that periodically calls xdp_program__is_attached to check whether our XDP programs are still attached on sockets. The watchdog fires every second, and every second it creates two new files, which we can track with the lsof command.
Example:
data_engi 336030 root 8u a_inode 0,14 0 12569 [eventpoll]
data_engi 336030 root 9u sock 0,8 0t0 166896037 protocol: XDP
data_engi 336030 root 10u sock 0,8 0t0 166896052 protocol: XDP
data_engi 336030 root 11u a_inode 0,14 0 12569 [eventfd]
data_engi 336030 root 12r CHR 246,3 0t0 115 /dev/ptp3
data_engi 336030 root 13u a_inode 0,14 0 12569 [eventfd]
data_engi 336030 root 14u a_inode 0,14 0 12569 [eventfd]
data_engi 336030 root 15r a_inode 0,14 0 12569 btf
data_engi 336030 root 16u a_inode 0,14 0 12569 bpf-map
data_engi 336030 root 17u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 18u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 19u a_inode 0,14 0 12569 [eventfd]
data_engi 336030 root 20r a_inode 0,14 0 12569 btf
data_engi 336030 root 21u a_inode 0,14 0 12569 bpf-map
data_engi 336030 root 22u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 23u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 24u a_inode 0,14 0 12569 [eventfd]
data_engi 336030 root 25u a_inode 0,14 0 12569 [eventfd]
data_engi 336030 root 26u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 27u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 28u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 29u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 30u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 31u a_inode 0,14 0 12569 bpf-prog
data_engi 336030 root 32u a_inode 0,14 0 12569 bpf-prog
Every call to xdp_program__is_attached creates two new nodes named bpf-prog.
The number of file descriptors grows at the same pace, by two per second (watched with ls /proc/$pid/fd/ | wc -l).
In the long run it leads to "Too many open files" errors in our system.
If I remove the watchdog loop, the creation of new files stops, which shows that there is no problem with loading/unloading XDP programs, only with the periodic calls to xdp_program__is_attached.
After running valgrind (valgrind -q --tool=none --track-fds=yes) on a minimal program that only loads an XDP program and calls is_attached, we can see that there are exactly four unclosed files after the program finishes: two left by the detach method and two by the is_attached method.
==114843== FILE DESCRIPTORS: 7 open (3 std) at exit.
==114843== Open file descriptor 12:
==114843== at 0x4CB9A3D: syscall (syscall.S:38)
==114843== by 0x1590E0: bpf_obj_get_opts (bpf.c:75)
==114843== by 0x152054: xdp_program__from_pin (libxdp.c:1449)
==114843== by 0x1543AC: xdp_multiprog__link_pinned_progs (libxdp.c:2307)
==114843== by 0x15487E: xdp_multiprog__fill_from_fd (libxdp.c:2409)
==114843== by 0x154A78: xdp_multiprog__from_fd (libxdp.c:2454)
==114843== by 0x154B79: xdp_multiprog__from_id (libxdp.c:2491)
==114843== by 0x154E6B: xdp_multiprog__get_from_ifindex (libxdp.c:2585)
==114843== by 0x15322D: xdp_program__detach_multi (libxdp.c:1904)
==114843== by 0x1537A8: xdp_program__detach (libxdp.c:2041)
...
==114843==
==114843== Open file descriptor 4:
==114843== at 0x4CB9A3D: syscall (syscall.S:38)
==114843== by 0x159EDF: bpf_prog_get_fd_by_id_opts (bpf.c:75)
==114843== by 0x154AD6: xdp_multiprog__from_id (libxdp.c:2474)
==114843== by 0x154E6B: xdp_multiprog__get_from_ifindex (libxdp.c:2585)
==114843== by 0x15322D: xdp_program__detach_multi (libxdp.c:1904)
==114843== by 0x1537A8: xdp_program__detach (libxdp.c:2041)
...
==114843== by 0x11C67F: main (main.cpp:60)
==114843==
==114843== Open file descriptor 6:
==114843== at 0x4CB9A3D: syscall (syscall.S:38)
==114843== by 0x1590E0: bpf_obj_get_opts (bpf.c:75)
==114843== by 0x152054: xdp_program__from_pin (libxdp.c:1449)
==114843== by 0x1543AC: xdp_multiprog__link_pinned_progs (libxdp.c:2307)
==114843== by 0x15487E: xdp_multiprog__fill_from_fd (libxdp.c:2409)
==114843== by 0x154A78: xdp_multiprog__from_fd (libxdp.c:2454)
==114843== by 0x154B79: xdp_multiprog__from_id (libxdp.c:2491)
==114843== by 0x154E6B: xdp_multiprog__get_from_ifindex (libxdp.c:2585)
==114843== by 0x1503C8: xdp_program__is_attached (libxdp.c:643)
...
==114843== by 0x11C639: main (main.cpp:58)
==114843==
==114843== Open file descriptor 3:
==114843== at 0x4CB9A3D: syscall (syscall.S:38)
==114843== by 0x159EDF: bpf_prog_get_fd_by_id_opts (bpf.c:75)
==114843== by 0x154AD6: xdp_multiprog__from_id (libxdp.c:2474)
==114843== by 0x154E6B: xdp_multiprog__get_from_ifindex (libxdp.c:2585)
==114843== by 0x1503C8: xdp_program__is_attached (libxdp.c:643)
...
==114843== by 0x11C639: main (main.cpp:58)
==114843==
==114843==
Hi all,
I worked together with @AdrianJab.
After some investigation it turns out that there is a problem with duplicating the fd:
static int xdp_program__fill_from_fd(struct xdp_program *xdp_prog, int fd)
{
	struct bpf_prog_info info = {};
	__u32 len = sizeof(info);
	struct btf *btf = NULL;
	int err = 0, prog_fd;

	if (!xdp_prog)
		return -EINVAL;

	/* Duplicate the descriptor, as we take ownership of the fd below */
	prog_fd = fcntl(fd, F_DUPFD_CLOEXEC, MIN_FD);
In this function we duplicate the descriptor, but we never close the descriptor that was duplicated, and we lose the handle to it once we start using the duplicate. I don't know yet how to fix it, because in other cases, such as the function xdp_program__clone, that's the expected behavior.
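To illustrate the ownership pattern, here is a standalone sketch that does not use libxdp at all. Every name in it (prog_handle, handle_fill_from_fd, open_handle_leaky, open_handle_fixed, SKETCH_MIN_FD) is hypothetical, and /dev/null stands in for a BPF program fd. It only shows one way a caller could avoid the leak when a helper keeps its own duplicate; it is not necessarily what the library or the pull request does.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

#define SKETCH_MIN_FD 3 /* assumption: keep duplicates clear of stdio fds */

/* Hypothetical object that owns exactly one descriptor. */
struct prog_handle {
	int fd;
};

/* Stores a duplicate of fd; the caller still owns the original. */
static int handle_fill_from_fd(struct prog_handle *h, int fd)
{
	int dup_fd = fcntl(fd, F_DUPFD_CLOEXEC, SKETCH_MIN_FD);

	if (dup_fd < 0)
		return -1;
	h->fd = dup_fd;
	return 0;
}

static void handle_close(struct prog_handle *h)
{
	if (h->fd >= 0) {
		close(h->fd);
		h->fd = -1;
	}
}

/* Leaky caller: the original fd is never closed after being duplicated. */
static int open_handle_leaky(struct prog_handle *h)
{
	int fd = open("/dev/null", O_RDONLY | O_CLOEXEC);

	if (fd < 0)
		return -1;
	return handle_fill_from_fd(h, fd); /* fd is lost here */
}

/* Fixed caller: closes its own fd once the helper holds a duplicate. */
static int open_handle_fixed(struct prog_handle *h)
{
	int fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
	int err;

	if (fd < 0)
		return -1;
	err = handle_fill_from_fd(h, fd);
	close(fd); /* safe on success and on failure: we own this one */
	return err;
}
```

With the leaky variant, one descriptor stays open even after handle_close(), which matches the pattern of one leftover fd per lookup in the valgrind traces. A caller that intentionally keeps using the original fd would still need the duplicate to be independent, so closing inside the helper itself would not suit every call site.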
I proposed a pull request here -> #345