Answer NOTIMPL on AAAA instead of dropping
mattiasb opened this issue
TL;DR
Saying that we don't support AAAA queries by returning NOTIMP for them will make interoperability with other software (notably systemd-resolved) work better. :)
The longer story
When looking for AAAA records from the DNS server in vagrant-dns I get this:
$ dig AAAA mgmt.test @127.0.0.153 -p 5300
;; communications error to 127.0.0.153#5300: timed out
;; communications error to 127.0.0.153#5300: timed out
;; communications error to 127.0.0.153#5300: timed out
; <<>> DiG 9.18.13 <<>> AAAA mgmt.test @127.0.0.153 -p 5300
;; global options: +cmd
;; no servers could be reached
NOTE: I've set VagrantDNS::Config.listen = [[:udp, '127.0.0.153', 5300]] in my Vagrantfile.
Since systemd-resolved (and hence also nss-resolve¹) queries both A and AAAA when resolving a name, each name resolution against the vagrant-dns server ends in a timeout on Linux. It usually takes around 10s to resolve a domain name from vagrant-dns for me. I've been ignoring this for a while since I've been so happy to just have something working (and we had the same issue with landrush as well).
Example session with resolvectl query and ping:
$ resolvectl query mgmt.test
mgmt.test: 192.168.122.46
-- Information acquired via protocol DNS in 10.0308s.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network
$ time ping -c 1 mgmt.test
PING mgmt.test (192.168.122.46) 56(84) bytes of data.
64 bytes from 192.168.122.46 (192.168.122.46): icmp_seq=1 ttl=64 time=0.302 ms
--- mgmt.test ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.302/0.302/0.302/0.000 ms
real 0m10,106s
user 0m0,000s
sys 0m0,002s
From reading issue #22575 of systemd, and specifically this comment, it seems that a nicer way to handle not supporting AAAA records would be to return a NOTIMP return code. That would in turn make the name resolution take milliseconds and just return the A answer.
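To make the idea concrete, here is a minimal sketch of what such a reply looks like on the wire, using only Ruby's standard `resolv` library. It is not how vagrant-dns or rubydns builds responses; `empty_reply` and the query ID are hypothetical names chosen for illustration.

```ruby
require "resolv"

# Hypothetical helper (not part of vagrant-dns or rubydns): build a reply
# that echoes the question and carries only a response code, no records.
def empty_reply(query, rcode)
  reply = Resolv::DNS::Message.new(query.id) # same ID so the client can pair it
  reply.qr = 1                               # 1 marks the message as a response
  reply.opcode = query.opcode
  reply.rd = query.rd                        # echo the "recursion desired" flag
  reply.rcode = rcode
  query.each_question { |name, typeclass| reply.add_question(name, typeclass) }
  reply
end

# An AAAA query like the one dig sends for mgmt.test (the ID is arbitrary).
query = Resolv::DNS::Message.new(4242)
query.rd = 1
query.add_question("mgmt.test", Resolv::DNS::Resource::IN::AAAA)

reply = empty_reply(query, Resolv::DNS::RCode::NotImp)

# Round-trip through the wire format to confirm the reply encodes cleanly.
decoded = Resolv::DNS::Message.decode(reply.encode)
puts "id=#{decoded.id} qr=#{decoded.qr} rcode=#{decoded.rcode}"
```

The point is that the client gets an immediate, well-formed answer for the AAAA question instead of waiting for a timeout.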
My guess is that one might need to do work upstream in rubydns to get this working.
NOTE: I'm not suggesting adding support for AAAA (and hence IPv6), btw. :) Just to respond a little bit nicer! :)
1: nss-resolve is the systemd-resolved backend for NSS, which glibc in turn uses for getaddrinfo etc.
I'll try to make some time to look into this..
should be as simple as:
diff --git a/lib/vagrant-dns/service.rb b/lib/vagrant-dns/service.rb
index 47f5b82..b721d2c 100644
--- a/lib/vagrant-dns/service.rb
+++ b/lib/vagrant-dns/service.rb
@@ -35,6 +35,9 @@ module VagrantDNS
           match(pattern, Resolv::DNS::Resource::IN::A) do |transaction, match_data|
             transaction.respond!(ip, ttl: ttl)
           end
+          match(proc { |name, resource_class| resource_class != Resolv::DNS::Resource::IN::A }) do |transaction, match_data|
+            transaction.fail!(:NotImp)
+          end
         end
 
         otherwise do |transaction|
EDIT: No, that's not quite it; we still need to match the pattern.
So I got to looking at this immediately.
One thing I noticed is that there's a risk of a DNS loop on Linux here.
On Ubuntu and Fedora, /etc/resolv.conf looks something like this:
nameserver 127.0.0.53
options edns0 trust-ad
search <DOMAIN>.<TLD>
127.0.0.53 is in turn systemd-resolved. I believe (from looking at the code of Async DNS) that Async::DNS::System.nameservers ends up being 127.0.0.53 on Ubuntu and Fedora.
Given this code:
otherwise do |transaction|
  transaction.passthrough!(std_resolver) do |reply, reply_name|
    puts reply
    puts reply_name
  end
end
... if a query fails (for example for AAAA to mgmt.test) then vagrant-dns will forward the query to systemd-resolved, which will forward it back to vagrant-dns.
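One way to spot this situation is to check whether the upstream nameservers are loopback addresses. The sketch below does that on resolv.conf-style text with plain Ruby. Both helper names are hypothetical and the check is only a heuristic: a loopback upstream is probably a local stub resolver such as systemd-resolved, but it could also be a forwarder that never routes back to vagrant-dns.

```ruby
require "ipaddr"

# Extract the `nameserver` addresses from resolv.conf-style text.
# Hypothetical helper for illustration only.
def nameservers_from(resolv_conf_text)
  resolv_conf_text.each_line.filter_map do |line|
    keyword, address = line.split
    address if keyword == "nameserver"
  end
end

# Heuristic: an upstream on a loopback address is probably a local stub
# resolver (systemd-resolved listens on 127.0.0.53), so forwarding
# unanswered queries to it may send them straight back to us.
def loop_prone_upstreams(addresses)
  addresses.select { |address| IPAddr.new(address).loopback? }
end

# Sample contents modelled on the resolv.conf shown above.
sample = <<~CONF
  nameserver 127.0.0.53
  options edns0 trust-ad
  search example.com
CONF

servers = nameservers_from(sample)
risky = loop_prone_upstreams(servers)
puts "upstreams: #{servers.inspect}, loop-prone: #{risky.inspect}"
```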
My thinking is that we don't need to forward any requests at all. This will work on Linux since the other DNS servers will be bound to the respective interfaces they are on. Like this:
$ resolvectl
Global
Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Current DNS Server: 127.0.0.153:5300 ← vagrant-dns
DNS Servers 127.0.0.153:5300 ← vagrant-dns
DNS Domain ~test
Link 2 (eno1)
Current Scopes: none
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Link 3 (enp11s0u1u2)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 172.31.32.101 ← Link specific DNS server
DNS Servers: 172.31.32.100 172.31.32.101 ← Link specific DNS server
DNS Domain: example.com
Link 4 (wlp61s0)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 172.31.32.101 ← Link specific DNS server
DNS Servers: 172.31.32.100 172.31.32.101 ← Link specific DNS server
DNS Domain: example.com
Link 6 (virbr0)
Current Scopes: LLMNR/IPv4
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Link 7 (tap0)
Current Scopes: LLMNR/IPv6
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
And I believe that on macOS, since you just map a TLD to a specific DNS server, you should be fine there as well. Well, unless you're using a TLD like .com and expect to be able to reach my-local-machine.com via vagrant-dns and then all other .com addresses via passthrough. Is that a setup you want to support?
The reason I'm asking is that this little diff solves the issue with the 10s timeout for me:
diff --git a/lib/vagrant-dns/service.rb b/lib/vagrant-dns/service.rb
index 47f5b82..26c9faf 100644
--- a/lib/vagrant-dns/service.rb
+++ b/lib/vagrant-dns/service.rb
@@ -27,7 +27,6 @@ module VagrantDNS
       end
 
       registry = Registry.new(tmp_path).to_hash
-      std_resolver = RubyDNS::Resolver.new(Async::DNS::System.nameservers)
       ttl = VagrantDNS::Config.ttl
 
       RubyDNS::run_server(VagrantDNS::Config.listen) do
@@ -37,11 +36,12 @@ module VagrantDNS
           end
         end
 
+        match(//, Resolv::DNS::Resource::IN::A) do |transaction|
+          transaction.fail!(:NXDomain)
+        end
+
         otherwise do |transaction|
-          transaction.passthrough!(std_resolver) do |reply, reply_name|
-            puts reply
-            puts reply_name
-          end
+          transaction.fail!(:NotImp)
         end
       end
     end
To add to my already long comment:
My initial thought was that systemd-resolved would continuously ask vagrant-dns for an AAAA record and eventually time out because we didn't send a response.
I now believe that the issue actually was a DNS loop. That makes sense: if the passthrough had forwarded to the system DNS server instead of back to systemd-resolved, the query would have gotten an NXDOMAIN from there instead.
vagrant-dns allows hooking into public TLDs. And while I have no clue whether that is still in use, I wouldn't feel comfortable removing that feature.¹
So here's my proposal:
- we make resolver configurable:
  - false: Disable passthrough
  - nil, :system: Use system nameservers (default)
  - [[proto, ip, port], ["udp", "1.1.1.1", 53]]: list of servers to use
- we match all queries two times, first against A for our positive match, then again without class restriction for NOTIMP
Providing a custom upstream DNS server should be helpful in any case.
Non-A queries for configured patterns will return NOTIMP, while you can still pass through everything else.
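The proposal can be sketched in plain Ruby as two small functions: one that maps the resolver setting onto a list of upstream servers, and one that decides how to answer a query. This is a rough illustration, not the code released in vagrant-dns. The names `upstream_servers` and `action_for`, the symbolic query types, and the sample addresses are hypothetical, and the behaviour for unmatched names when passthrough is disabled is an assumption borrowed from the earlier diff.

```ruby
# Map the proposed `resolver` setting onto a list of upstream servers.
# Returns nil when passthrough is disabled. Hypothetical helper.
def upstream_servers(setting, system_servers)
  case setting
  when false then nil                       # passthrough disabled
  when nil, :system then system_servers     # default: system nameservers
  when Array then setting.map { |proto, ip, port| [proto.to_sym, ip, port] }
  else raise ArgumentError, "unsupported resolver setting: #{setting.inspect}"
  end
end

# Decide how to answer a query, mirroring the two-pass matching:
# configured pattern + A first, then configured pattern with any other type.
def action_for(name, type, patterns, upstreams)
  if patterns.any? { |pattern| pattern.match?(name) }
    type == :A ? :respond : :notimp
  elsif upstreams
    :passthrough
  else
    # Assumption: with passthrough disabled, follow the earlier diff
    # (NXDOMAIN for A queries, NOTIMP for everything else).
    type == :A ? :nxdomain : :notimp
  end
end

patterns = [/\.test\z/]                     # sample configured pattern
system_servers = [[:udp, "192.0.2.53", 53]] # placeholder address
upstreams = upstream_servers([["udp", "1.1.1.1", 53]], system_servers)

puts action_for("mgmt.test", :A, patterns, upstreams)
puts action_for("mgmt.test", :AAAA, patterns, upstreams)
puts action_for("example.com", :AAAA, patterns, upstreams)
```

Keeping the type check inside the pattern branch means a non-A query for a configured name is never forwarded, which is what avoids the loop described earlier.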
Footnotes
1: Mind that .DEV was used for quite some time, until it became a public TLD, and I've seen people still use that.
released in v2.4.0