schweikert / fping

High performance ping tool

Home Page: https://fping.org

fping -n -g x.x.x.x/24 hangs with systemd-resolved

cranderson opened this issue

This is a longstanding issue, but I re-tested with git HEAD (a3f4c57) as of 2023-11-02 with the same results.

When using a Linux distro with systemd-resolved as the DNS resolver:

$ ls -l /etc/resolv.conf
lrwxrwxrwx. 1 root root 37 Nov 10 13:29 /etc/resolv.conf -> /run/systemd/resolve/stub-resolv.conf

$ cat /etc/resolv.conf
nameserver 127.0.0.53
options edns0 trust-ad
search lan

$ systemctl status systemd-resolved
● systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/usr/lib/systemd/system/systemd-resolved.service; enabled; preset: enabled)
     Active: active (running) since Fri 2023-11-10 13:28:45 EST; 51s ago

$ resolvectl 
Global
       Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 3 (enp4s0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: fdd1:19ae:f1ae:1::1
       DNS Servers: fdd1:19ae:f1ae:1::1 192.168.1.1
        DNS Domain: lan

fping -a -g 192.168.1.0/24 (without -n) takes about 10 seconds:

$ /usr/bin/time fping -a -g 192.168.1.0/24 > time-fping-a-g-systemd-resolved.txt 2>&1
0.00user 0.04system 0:09.59elapsed 0%CPU (0avgtext+0avgdata 1536maxresident)k
0inputs+0outputs (0major+124minor)pagefaults 0swaps

fping -n -g 192.168.1.0/24 is so slow that it appears to be hung. I timed it and it took longer than 5 minutes to return any output at all. All the output happened in the last 10 seconds or so of this run:

$ /usr/bin/time fping -n -a -g 192.168.1.0/24 > time-fping-n-a-g-systemd-resolved.txt 2>&1
0.03user 0.14system 5:44.40elapsed 0%CPU (0avgtext+0avgdata 2304maxresident)k
0inputs+0outputs (0major+220minor)pagefaults 0swaps
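
One way to check whether the extra time comes from name resolution rather than from the pings is to time reverse lookups outside fping. The following is a minimal Python sketch, assuming that -n amounts to one blocking reverse lookup per target (nothing in this report states how fping does it). The helper names reverse_lookup and time_lookups are made up for illustration.

```python
import socket
import time


def reverse_lookup(address):
    """Return the reverse-DNS name for an IPv4 address, or None if it has none."""
    try:
        # NI_NAMEREQD makes the call fail instead of echoing the numeric address.
        name, _service = socket.getnameinfo((address, 0), socket.NI_NAMEREQD)
        return name
    except socket.gaierror:
        return None


def time_lookups(addresses, lookup=reverse_lookup):
    """Look up each address in turn and record how long each lookup blocked."""
    results = []
    for address in addresses:
        start = time.monotonic()
        name = lookup(address)
        results.append((address, name, time.monotonic() - start))
    return results


# Example use (performs real lookups through the system resolver):
#   for address, name, elapsed in time_lookups(f"192.168.1.{i}" for i in range(1, 11)):
#       print(f"{address:15} {name or '-':30} {elapsed:.3f} s")
```

If lookups for addresses without a DNS record block for a second or more each under systemd-resolved but return quickly with the resolvers listed directly in /etc/resolv.conf, that points at the resolver rather than at fping.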

This doesn't happen if you change /etc/resolv.conf to point to an external (but still local) DNS resolver. These are the same resolvers that systemd-resolved forwards to:

$ cat /etc/resolv.conf
nameserver fdd1:19ae:f1ae:1::1
nameserver 192.168.1.1
options edns0 trust-ad
search lan

$ systemctl status systemd-resolved
○ systemd-resolved.service
     Loaded: masked (Reason: Unit systemd-resolved.service is masked.)
     Active: inactive (dead) since Fri 2023-11-10 13:20:49 EST; 6min ago

$ /usr/bin/time fping -n -a -g 192.168.1.0/24 > time-fping-n-a-g-no-resolved.txt 2>&1
0.02user 0.06system 0:09.31elapsed 0%CPU (0avgtext+0avgdata 2560maxresident)k
0inputs+24outputs (0major+182minor)pagefaults 0swaps

I ran strace -tt on it and found 226 lines like these. There were 229 unreachable hosts in my test, so it seems that the unreachable hosts (which have no DNS records) are triggering these errors.

13:58:47.667631 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 135152, MSG_DONTWAIT, NULL, NULL) = 66
13:58:49.167756 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:58:50.667649 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:58:52.167538 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:58:53.667540 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:58:55.167634 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:58:56.667612 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:58:57.917612 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:58:59.417626 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:59:00.917641 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:59:02.417576 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:59:03.917606 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:59:05.417712 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
13:59:06.917672 recvfrom(5, "{\"error\":\"io.systemd.Resolve.Max"..., 131080, MSG_DONTWAIT, NULL, NULL) = 66
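
As a rough cross-check, the errors in this excerpt arrive about 1.5 seconds apart, so 226 of them at that spacing would account for most of the 5:44 measured earlier. Here is a minimal Python sketch of that arithmetic using only the timestamps shown here. The helper names are illustrative, and the strace run and the timed run were separate runs, so this is an estimate only.

```python
from datetime import datetime
from statistics import median

# Timestamps copied from the strace -tt excerpt above.
STAMPS = [
    "13:58:47.667631", "13:58:49.167756", "13:58:50.667649", "13:58:52.167538",
    "13:58:53.667540", "13:58:55.167634", "13:58:56.667612", "13:58:57.917612",
    "13:58:59.417626", "13:59:00.917641", "13:59:02.417576", "13:59:03.917606",
    "13:59:05.417712", "13:59:06.917672",
]

ERROR_LINES = 226  # number of matching lines reported for the full strace log


def to_seconds(stamp):
    """Convert an HH:MM:SS.ffffff timestamp to seconds since midnight."""
    t = datetime.strptime(stamp, "%H:%M:%S.%f")
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6


def gaps(stamps):
    """Return the time differences between consecutive timestamps."""
    secs = [to_seconds(s) for s in stamps]
    return [later - earlier for earlier, later in zip(secs, secs[1:])]


typical_gap = median(gaps(STAMPS))
estimated_total = ERROR_LINES * typical_gap

print(f"typical gap between errors: {typical_gap:.2f} s")
print(f"estimated time spent on failed lookups: {estimated_total:.0f} s")
```

With these 14 timestamps the typical gap is 1.50 seconds, which puts the estimate at roughly 339 seconds, close to the 5:44 (344 seconds) elapsed time of the slow run.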

I already suspected that the problem was with systemd-resolved, and this pretty much confirms it:

systemd/systemd#28166

I don't think there is anything fping can do about this.

@cranderson: Since this is not caused by fping, can this issue be closed?