google / gvisor

Application Kernel for Containers

Home Page: https://gvisor.dev

tcpdump broken for libpcap 1.10+

crappycrypto opened this issue

Description

The gVisor site mentions that tcpdump works in non-promiscuous mode. However, since libpcap 1.10.0, tcpdump seems to fail inside gVisor. My guess is that this is because of the following entry in the changelog:

Linux: Require PF_PACKET support, and kernel 2.6.27 or later

A related issue is #1409

Is this feature related to a specific bug?

No response

Do you have a specific solution in mind?

No response

Thanks for the report. Could you provide an strace log of runsc? Also, could you provide your daemon.conf for the runsc runtime?

In a Debian bullseye container, tcpdump stops after just a few syscalls.

Tested with:

  • tcpdump 4.99.0
  • libpcap 1.11.0
  • runsc release-20210921.0

socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC) = -1 EPROTONOSUPPORT (Protocol not supported)
socket(AF_UNIX, SOCK_RAW, 0)            = 3
ioctl(3, SIOCETHTOOL, 0x7f7d790a8700)   = -1 EOPNOTSUPP (Operation not supported)
close(3)                                = 0
eventfd2(0, EFD_NONBLOCK)               = 3
socket(AF_PACKET, SOCK_RAW, htons(0 /* ETH_P_??? */)) = -1 EPERM (Operation not permitted)
close(3)                                = 0

The first problem seems to be that creating an AF_PACKET socket without specifying a protocol is not supported in gVisor. See:

func packetSocket(t *kernel.Task, epStack *Stack, stype linux.SockType, protocol int) (*fs.File, *syserr.Error) {

However, the Linux kernel supports 0 as the protocol, as documented in
https://github.com/torvalds/linux/blob/f40ddce8/Documentation/networking/packet_mmap.rst#L83

int fd = socket(PF_PACKET, mode, 0);

The protocol can optionally be 0 in case we only want to transmit via this socket, which avoids an expensive call to packet_rcv(). In this case, you also need to bind(2) the TX_RING with sll_protocol = 0 set. Otherwise, htons(ETH_P_ALL) or any other protocol, for example.
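
As a rough sketch of the two modes described in the quoted kernel documentation, the protocol argument could be chosen as below. The helper names `packet_protocol_arg` and `open_packet_socket` are hypothetical and come from neither libpcap nor gVisor.

```c
#include <arpa/inet.h>        /* htons */
#include <errno.h>
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <sys/socket.h>

/* Hypothetical helper: pick the third argument to socket(AF_PACKET, ...).
 * 0 is the value the kernel documentation describes for transmit-only
 * sockets; htons(ETH_P_ALL) asks to receive every protocol. */
int packet_protocol_arg(int tx_only)
{
    return tx_only ? 0 : htons(ETH_P_ALL);
}

/* Hypothetical helper: returns the new fd, or -errno on failure
 * (for example -EPERM when the caller is not allowed to create
 * packet sockets). */
int open_packet_socket(int tx_only)
{
    int fd = socket(AF_PACKET, SOCK_RAW, packet_protocol_arg(tx_only));
    return fd >= 0 ? fd : -errno;
}
```

The failing trace above passes 0, while the older libpcap trace further down passes htons(ETH_P_ALL).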

The libpcap code can be found at https://github.com/the-tcpdump-group/libpcap/blob/fa91341ab7647521c90b3e34c93026725bfb71dd/pcap-linux.c#L2312

I don't see any file named daemon.conf on my machine. Does that mean I use the defaults, or am I just looking in the wrong location? gVisor was installed using apt on Debian, if that helps.

Using libpcap 1.8.1 (Debian buster) works fine, as it creates the socket with ETH_P_ALL:

socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)) = 3
ioctl(3, SIOCBONDINFOQUERY, 0x7f4079fda900) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(3, SIOCGIWMODE, 0x7f4079fda940)   = -1 ENOTTY (Inappropriate ioctl for device)
close(3)                                = 0
socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)) = 3
ioctl(3, SIOCGIFINDEX, {ifr_name="lo", }) = 0
ioctl(3, SIOCGIFHWADDR, {ifr_name="eth0", ifr_hwaddr={sa_family=ARPHRD_ETHER, sa_data=02:42:ac:11:00:02}}) = 0
stat("/sys/class/net/eth0/wireless", 0x7f4079fda650) = -1 ENOENT (No such file or directory)
ioctl(3, SIOCBONDINFOQUERY, 0x7f4079fda5b0) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(3, SIOCGIWNAME, 0x7f4079fda5f0)   = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(3, SIOCGIFINDEX, {ifr_name="eth0", }) = 0
bind(3, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), sll_ifindex=if_nametoindex("eth0"), sll_hatype=ARPHRD_NETROM, sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0

Thanks for the detailed report, let me take a look at it.

If you're seeing EPERM, make sure that both:

  • tcpdump is being run as root
  • runsc is being run with the --net-raw argument. Raw and AF_PACKET sockets are, for security reasons, off by default and need to be explicitly enabled.
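
A caller that wants to surface these two checks could map the errno from socket(AF_PACKET, ...) to a hint, roughly as follows. This is only an illustrative sketch; `packet_socket_hint` is a hypothetical name, and the EPERM text simply restates the list above.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turn the errno left by socket(AF_PACKET, ...)
 * into a human-readable hint. Only EPERM gets special treatment;
 * everything else falls back to the standard strerror() text. */
const char *packet_socket_hint(int err)
{
    switch (err) {
    case 0:
        return "socket created";
    case EPERM:
        return "run tcpdump as root and start runsc with --net-raw";
    default:
        return strerror(err);
    }
}
```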

Passing 0 to create a write-only socket is not supported, but I believe netstack will let you create a packet socket with protocol 0.

You're right, I made a mistake when setting up a minimal repro environment for the bug. tcpdump is indeed broken with the versions described above, but it crashes with a segfault (Debian bullseye container with tcpdump installed via apt).

Here's the strace showing libpcap failing to set up a ring buffer:

socket(AF_PACKET, SOCK_RAW, htons(0 /* ETH_P_??? */)) = 4
ioctl(4, SIOCGIFINDEX, {ifr_name="lo", }) = 0
ioctl(4, SIOCGIFHWADDR, {ifr_name="eth0", ifr_hwaddr={sa_family=ARPHRD_ETHER, sa_data=02:42:ac:11:00:02}}) = 0
stat("/sys/class/net/eth0/wireless", 0x7fa32e8455b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/sys/class/net/eth0/dsa/tagging", O_RDONLY) = -1 ENOENT (No such file or directory)
ioctl(4, SIOCGIFINDEX, {ifr_name="eth0", }) = 0
bind(4, {sa_family=AF_PACKET, sll_protocol=htons(0 /* ETH_P_??? */), sll_ifindex=if_nametoindex("eth0"), sll_hatype=ARPHRD_NETROM, sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0
getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getsockopt(4, SOL_SOCKET, SO_BPF_EXTENSIONS, 0x7fa32e8456c0, [4]) = -1 ENOPROTOOPT (Protocol not available)
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9b17f83000
getsockopt(4, SOL_PACKET, PACKET_HDRLEN, 0x7fa32e845620, [4]) = -1 ENOPROTOOPT (Protocol not available)
munmap(0x7f9b17f83000, 266240)          = 0
setsockopt(4, SOL_PACKET, PACKET_RX_RING, {tp_block_size=0, tp_block_nr=0, tp_frame_size=0, tp_frame_nr=0}, 16) = -1 ENOPROTOOPT (Protocol not available)

The code then crashes while trying to free oneshot_buffer in pcap_cleanup_linux. This is a bug in libpcap: the buffer is also freed in the error path of setup_mmapped. The crash occurs at https://github.com/the-tcpdump-group/libpcap/blob/fa91341ab7647521c90b3e34c93026725bfb71dd/pcap-linux.c#L835
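
The general shape of such a double free, and one common way to guard against it, can be sketched as follows. This is not libpcap's code or its actual fix; `struct capture`, `setup_failed`, and `cleanup` are hypothetical stand-ins used only to illustrate the pattern.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a capture handle; not libpcap's real struct. */
struct capture {
    unsigned char *oneshot_buffer;
};

/* Error path of a failed ring setup: release the buffer and clear the
 * pointer so that a later cleanup cannot free it a second time. */
void setup_failed(struct capture *c)
{
    free(c->oneshot_buffer);
    c->oneshot_buffer = NULL;
}

/* General cleanup: free(NULL) is a no-op, so this is safe to call
 * after setup_failed() as well as after a successful run. */
void cleanup(struct capture *c)
{
    free(c->oneshot_buffer);
    c->oneshot_buffer = NULL;
}
```

Without the assignment to NULL in the error path, the second free() would operate on an already released pointer, which is undefined behavior and can show up as a segfault.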

The real issue is that libpcap now requires a memory-mapped ring buffer for receiving packets. The code that checks for support is in init_tpacket, and it assumes that ENOPROTOOPT means the kernel was compiled without packet ring buffer support. See https://github.com/the-tcpdump-group/libpcap/blob/fa91341ab7647521c90b3e34c93026725bfb71dd/pcap-linux.c#L2752

Thus, supporting the newer libpcap versions requires support for TPACKET_V2 or TPACKET_V3.
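
For reference, a probe of this kind can be approximated with the sketch below, using the same getsockopt(PACKET_HDRLEN) call seen in the trace. It is not libpcap's implementation: `ring_probe`, `classify_probe_errno`, and `probe_tpacket` are hypothetical names, and treating EINVAL as "this version is unknown" is an assumption based on how Linux reports unknown TPACKET versions.

```c
#include <errno.h>
#include <linux/if_packet.h>  /* PACKET_HDRLEN, PACKET_VERSION, TPACKET_V2 */
#include <sys/socket.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263        /* Linux value, in case libc does not define it */
#endif

enum ring_probe {
    RING_OK,                  /* version accepted */
    RING_VERSION_UNSUPPORTED, /* EINVAL: assumed to mean "try another version" */
    RING_NOT_AVAILABLE,       /* ENOPROTOOPT: option not implemented at all */
    RING_ERROR                /* any other failure */
};

/* Hypothetical helper: classify the errno from the probe calls. */
enum ring_probe classify_probe_errno(int err)
{
    switch (err) {
    case 0:           return RING_OK;
    case EINVAL:      return RING_VERSION_UNSUPPORTED;
    case ENOPROTOOPT: return RING_NOT_AVAILABLE;
    default:          return RING_ERROR;
    }
}

/* Hypothetical helper: ask whether `fd` accepts the given TPACKET version,
 * then select it. */
enum ring_probe probe_tpacket(int fd, int version)
{
    int val = version;            /* in: version, out: header length */
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_PACKET, PACKET_HDRLEN, &val, &len) < 0)
        return classify_probe_errno(errno);

    val = version;
    if (setsockopt(fd, SOL_PACKET, PACKET_VERSION, &val, sizeof(val)) < 0)
        return classify_probe_errno(errno);

    return RING_OK;
}
```

Given the ENOPROTOOPT results in the trace above, a call such as `probe_tpacket(fd, TPACKET_V2)` on a packet socket would be expected to land in the RING_NOT_AVAILABLE case under the runsc release tested here.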

Sigh. Looks like pcap removed all support for non-mmapped ring buffer in commit the-tcpdump-group/libpcap@7c78bcb :-\

Let me open a bug to support TPACKET_V2. I think v2 is simpler to implement than v3.

I will update our documentation to indicate that we do not support libpcap 1.10+ and that users should stick with libpcap 1.9 or lower for now.

A friendly reminder that this issue had no activity for 120 days.