BerlinVagrant / vagrant-dns

A plugin to manage DNS records for vagrant environments

Answer NOTIMPL on AAAA instead of dropping

mattiasb opened this issue

TL;DR

Signalling that we don't support AAAA queries by answering them with NOTIMP (rather than silently dropping them) will make interoperability with other software (notably systemd-resolved) work better. :)

The longer story

When looking for AAAA records from the DNS server in vagrant-dns I get this:

$ dig AAAA mgmt.test @127.0.0.153 -p 5300
;; communications error to 127.0.0.153#5300: timed out
;; communications error to 127.0.0.153#5300: timed out
;; communications error to 127.0.0.153#5300: timed out

; <<>> DiG 9.18.13 <<>> AAAA mgmt.test @127.0.0.153 -p 5300
;; global options: +cmd
;; no servers could be reached

NOTE: I've set VagrantDNS::Config.listen = [[:udp, '127.0.0.153', 5300]] in my Vagrantfile.

Since systemd-resolved (and hence also nss-resolve¹) queries both A and AAAA when resolving a name, every name resolution against the vagrant-dns server ends in a timeout on Linux. It usually takes around 10s to resolve a domain name from vagrant-dns for me. I've been ignoring this for a while since I've been so happy to just have something working (and we had the same issues with landrush as well).

Example session with resolvectl query and ping:

$ resolvectl query mgmt.test
mgmt.test: 192.168.122.46

-- Information acquired via protocol DNS in 10.0308s.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network

$ time ping -c 1 mgmt.test
PING mgmt.test (192.168.122.46) 56(84) bytes of data.
64 bytes from 192.168.122.46 (192.168.122.46): icmp_seq=1 ttl=64 time=0.302 ms

--- mgmt.test ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.302/0.302/0.302/0.000 ms

real	0m10,106s
user	0m0,000s
sys	0m0,002s

From reading issue #22575 of systemd, and specifically this comment, it seems that a nicer way to handle not supporting AAAA records would be to answer with the NOTIMP response code. That would in turn make the name resolution take milliseconds and just return the A answer.
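To make the idea concrete, here is a minimal sketch of what such a reply looks like on the wire, using only Ruby's standard `resolv` library. It is not vagrant-dns or rubydns code; the helper name `not_implemented_reply` is made up for illustration.

```ruby
require "resolv"

# Hypothetical helper: take a raw DNS query and build a raw reply that
# repeats the question but carries RCODE NOTIMP instead of an answer.
def not_implemented_reply(raw_query)
  query = Resolv::DNS::Message.decode(raw_query)

  reply = Resolv::DNS::Message.new(query.id)  # same ID so the client can match it
  reply.qr = 1                                # mark the message as a response
  reply.rd = query.rd                         # echo the "recursion desired" flag
  reply.rcode = Resolv::DNS::RCode::NotImp    # "not implemented"
  query.each_question do |name, typeclass|
    reply.add_question(name, typeclass)
  end
  reply.encode
end

# Simulate the AAAA query from the dig example above.
query = Resolv::DNS::Message.new(1234)
query.rd = 1
query.add_question("mgmt.test", Resolv::DNS::Resource::IN::AAAA)

reply = Resolv::DNS::Message.decode(not_implemented_reply(query.encode))
puts "rcode=#{reply.rcode} answers=#{reply.answer.length}"
```

The point is only that the client gets *some* reply immediately, so it has no reason to wait for a timeout.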

My guess is that one might need to do work upstream in rubydns to get this working.

NOTE: I'm not suggesting adding support for AAAA (and hence IPv6) btw. :) Just to respond a little bit nicer! :)

1: nss-resolve is the resolved backend for nss that is in turn used by glibc for getaddrinfo etc.

I'll try to make some time to look into this..

should be as simple as:

diff --git a/lib/vagrant-dns/service.rb b/lib/vagrant-dns/service.rb
index 47f5b82..b721d2c 100644
--- a/lib/vagrant-dns/service.rb
+++ b/lib/vagrant-dns/service.rb
@@ -35,6 +35,9 @@ module VagrantDNS
           match(pattern, Resolv::DNS::Resource::IN::A) do |transaction, match_data|
             transaction.respond!(ip, ttl: ttl)
           end
+          match(proc { |name, resource_class| resource_class != Resolv::DNS::Resource::IN::A }) do |transaction, match_data|
+            transaction.fail!(:NotImp)
+          end
         end
 
         otherwise do |transaction|

EDIT: No, that's not quite it; we still need to match the pattern.

So I got to looking at this immediately.

One thing I noticed is that there's a risk of a DNS loop on Linux here.

On Ubuntu and Fedora /etc/resolv.conf looks something like this:

nameserver 127.0.0.53
options edns0 trust-ad
search <DOMAIN>.<TLD>

127.0.0.53 is in turn systemd-resolved. I believe (from looking at the code of Async DNS) that Async::DNS::System.nameservers ends up being 127.0.0.53 on Ubuntu and Fedora.

Given this code:

        otherwise do |transaction|
          transaction.passthrough!(std_resolver) do |reply, reply_name|
            puts reply
            puts reply_name
          end
        end

... if a query matches no rule (for example an AAAA query for mgmt.test), vagrant-dns will forward it to systemd-resolved, which will forward it straight back to vagrant-dns.
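A rough way to spot this situation is to check whether any of the upstream nameservers is a loopback address, i.e. a local stub resolver that may hand the query right back. The sketch below uses only Ruby's standard `ipaddr` library; the helper `loopback_upstreams` is hypothetical, and the `[protocol, host, port]` triples are an assumption modelled on the `VagrantDNS::Config.listen` format shown earlier.

```ruby
require "ipaddr"

# Hypothetical helper: return the upstream entries that point at a loopback
# address (for example the systemd-resolved stub on 127.0.0.53).
def loopback_upstreams(nameservers)
  nameservers.select { |_proto, host, _port| IPAddr.new(host).loopback? }
end

# Assumed shape of the upstream list: [protocol, host, port] triples.
upstreams = [
  [:udp, "127.0.0.53", 53],
  [:tcp, "127.0.0.53", 53],
  [:udp, "172.31.32.101", 53],
]

risky = loopback_upstreams(upstreams)
unless risky.empty?
  hosts = risky.map { |_proto, host, _port| host }.uniq
  warn "passthrough may loop back via #{hosts.join(', ')}"
end
```

A loopback upstream is not necessarily a loop (it could be a local caching resolver that never asks vagrant-dns), so this is a hint rather than proof.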

My thinking is that we don't need to forward any requests at all. This will work on Linux since the other DNS servers will be bound to the respective interfaces they are on. Like this:

$ resolvectl 
Global
         Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
  resolv.conf mode: stub
Current DNS Server: 127.0.0.153:5300  ← vagrant-dns
        DNS Servers 127.0.0.153:5300  ← vagrant-dns
         DNS Domain ~test

Link 2 (eno1)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 3 (enp11s0u1u2)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 172.31.32.101                   ← Link specific DNS server
       DNS Servers: 172.31.32.100 172.31.32.101     ← Link specific DNS server
        DNS Domain: example.com

Link 4 (wlp61s0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 172.31.32.101                   ← Link specific DNS server
       DNS Servers: 172.31.32.100 172.31.32.101     ← Link specific DNS server
        DNS Domain: example.com

Link 6 (virbr0)
Current Scopes: LLMNR/IPv4
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 7 (tap0)
Current Scopes: LLMNR/IPv6
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

And I believe that on macOS, since you just map a TLD to a specific DNS server, you should be fine there as well. Well, unless you're using a TLD like .com and expect to be able to reach my-local-machine.com via vagrant-dns and then all other .com addresses via passthrough. Is that a setup you want to support?

The reason I'm asking is that this little diff solves the issue with the 10s timeout for me:

diff --git a/lib/vagrant-dns/service.rb b/lib/vagrant-dns/service.rb
index 47f5b82..26c9faf 100644
--- a/lib/vagrant-dns/service.rb
+++ b/lib/vagrant-dns/service.rb
@@ -27,7 +27,6 @@ module VagrantDNS
       end
 
       registry = Registry.new(tmp_path).to_hash
-      std_resolver = RubyDNS::Resolver.new(Async::DNS::System.nameservers)
       ttl = VagrantDNS::Config.ttl
 
       RubyDNS::run_server(VagrantDNS::Config.listen) do
@@ -37,11 +36,12 @@ module VagrantDNS
           end
         end
 
+        match(//, Resolv::DNS::Resource::IN::A) do |transaction|
+          transaction.fail!(:NXDomain)
+        end
+
         otherwise do |transaction|
-          transaction.passthrough!(std_resolver) do |reply, reply_name|
-            puts reply
-            puts reply_name
-          end
+          transaction.fail!(:NotImp)
         end
       end
     end

To add to my already long comment:

My initial thought was that systemd-resolved would continuously try to ask for an AAAA record from vagrant-dns and eventually time out because we didn't send a response.

I now believe that the issue actually was a DNS loop. That makes sense: if the passthrough had actually forwarded to the system DNS server instead of back to systemd-resolved, the query would have received an NXDOMAIN from there.

vagrant-dns allows hooking into public TLDs. And while I have no clue if that is still in use, I wouldn't feel comfortable removing that feature¹.

So here's my proposal:

  1. we make resolver configurable:
  • false: Disable passthrough
  • nil, :system: Use system nameservers (default)
  • [ [proto, ip, port ], ["udp", "1.1.1.1", 53] ]: list of servers to use
  2. we match all queries two times, first against A for our positive match, then again without class restriction for NOTIMP

Providing a custom upstream DNS server should be helpful in any case.
Non-A-queries for configured patterns will return NOTIMP, while you can still passthrough everything else.
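The proposal can be sketched in plain Ruby without any server running. This is not the code that shipped; both helper names (`upstream_servers`, `decide`) and the returned symbols are made up for illustration. What happens to unmatched queries when passthrough is disabled is not spelled out in the proposal, so the sketch assumes the behaviour of the earlier diff (NXDOMAIN for A, NOTIMP for everything else).

```ruby
require "resolv"

# Hypothetical helper: turn the proposed `resolver` setting into an upstream
# list, or nil when passthrough is disabled.
def upstream_servers(setting, system_servers)
  case setting
  when false then nil                     # passthrough disabled
  when nil, :system then system_servers   # default: system nameservers
  when Array then setting                 # explicit [proto, ip, port] list
  else raise ArgumentError, "unsupported resolver setting: #{setting.inspect}"
  end
end

# Hypothetical helper: decide how to treat a single query.
def decide(name, resource_class, patterns, upstreams)
  a_record = resource_class == Resolv::DNS::Resource::IN::A

  # Configured names: answer A, reject every other type with NOTIMP.
  if patterns.any? { |pattern| pattern.match?(name) }
    return a_record ? :respond : :not_imp
  end

  # Everything else: pass through if an upstream is configured.
  return :passthrough if upstreams

  # Assumption taken from the earlier diff, not from the proposal itself.
  a_record ? :nx_domain : :not_imp
end

patterns  = [/\.test\z/]
upstreams = upstream_servers([["udp", "1.1.1.1", 53]], [])

p decide("mgmt.test", Resolv::DNS::Resource::IN::A, patterns, upstreams)
p decide("mgmt.test", Resolv::DNS::Resource::IN::AAAA, patterns, upstreams)
p decide("example.com", Resolv::DNS::Resource::IN::AAAA, patterns, upstreams)
```

Keeping the pattern check ahead of the passthrough check is what prevents an AAAA query for a configured name from being forwarded upstream and looping back.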

Footnotes

  1. Mind that .DEV was used for quite some time, until it became a public TLD, and I've seen people still use that.

released in v2.4.0