NSD 4.3.7 replicable SIGSEGV 11 on ip6.arpa. lookup
Jamie-Landeg-Jones opened this issue
I've tested this on FreeBSD 11, 12, and 13, and the issue occurs on all three.
NSD 4.3.6 and below are fine - this only happens with 4.3.7.
I've narrowed it down to the following steps to reproduce it:
A stock install of nsd 4.3.7 (I've tried FreeBSD ports, FreeBSD packages, AND a manual source compile from the tar file - all are affected)
Add the following to the default nsd.conf:
zone:
name: "ip6.arpa"
request-xfr: AXFR 192.0.32.132 NOKEY # xfr.lax.dns.icann.org.
Restart nsd, and then:
dig -t ns f.9.1.1.0.0.2.ip6.arpa. @127.1
This returns no results, but hangs retrying for a while whilst nsd SIGSEGVs and relaunches.
The expected result is returned with earlier NSD versions.
Cheers, Jamie
Is this perhaps a duplicate of #189? I cannot reproduce it with the current code repository.
Hi. Sorry, I missed #189. It looks similar (a lookup under ip6.arpa).
I just downloaded the latest from git (sorry, I should have done that first), and it works as it should.
I can try and narrow down the actual commit that fixed it, and/or provide debug core traces, if it would be helpful, but yes, the latest version works. Do you know when a new release will be made?
Thanks, Jamie
So the latest repo works. No, you do not need to help find which commit fixed it; it was likely #174, because that touched similar code. It is already fixed. There is no plan for a new release in the near future.
712296f only hides the problem, it doesn't fix anything. The real fix is ba0002e
f.9.1.1.0.0.2.ip6.arpa. is an ENT (empty non-terminal) in ip6.arpa., and so is 2.ip6.arpa.

At line 1420 in query.c we have `q->delegation_domain = domain_find_ns_rrsets(...)`, and the unfixed `domain_find_ns_rrsets` would find the NS RRset for 9.1.1.0.0.2.ip6.arpa. But it would then continue searching upwards, overwriting `*ns` (which is `&q->delegation_rrset`), until it hits 2.ip6.arpa., which has no NS records. So `q->delegation_rrset == NULL`, but at the same time the result is non-NULL, because we did find a delegation RRset along the way; we just ignored it (at least the one for 9.1.1.0.0.2.ip6.arpa.; I didn't check whether there was one further up). In other words, `domain_find_ns_rrsets` returns non-NULL, which means we found a delegation, but at the same time it doesn't give us the delegation NS RRset.
It is probably best to revert 712296f, since on its own it produces wrong results. E.g., adding it to 4.3.7 gives this:
$ dig @192.168.178.219 +norec f.9.1.1.0.0.2.ip6.arpa NS
; <<>> dig 9.10.8-P1 <<>> @192.168.178.219 +norec f.9.1.1.0.0.2.ip6.arpa NS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 10923
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;f.9.1.1.0.0.2.ip6.arpa. IN NS
;; AUTHORITY SECTION:
ip6.arpa. 3600 IN SOA b.ip6-servers.arpa. nstld.iana.org. 2021100154 1800 900 604800 3600
;; Query time: 0 msec
;; SERVER: 192.168.178.219#53(192.168.178.219)
;; WHEN: Wed Oct 20 10:24:56 CEST 2021
;; MSG SIZE rcvd: 115
But the correct answer is this:
$ dig @::1 +norec f.9.1.1.0.0.2.ip6.arpa NS
; <<>> dig 9.10.8-P1 <<>> @::1 +norec f.9.1.1.0.0.2.ip6.arpa NS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48090
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;f.9.1.1.0.0.2.ip6.arpa. IN NS
;; AUTHORITY SECTION:
9.1.1.0.0.2.ip6.arpa. 86400 IN NS r.arin.net.
9.1.1.0.0.2.ip6.arpa. 86400 IN NS u.arin.net.
9.1.1.0.0.2.ip6.arpa. 86400 IN NS x.arin.net.
9.1.1.0.0.2.ip6.arpa. 86400 IN NS y.arin.net.
9.1.1.0.0.2.ip6.arpa. 86400 IN NS z.arin.net.
9.1.1.0.0.2.ip6.arpa. 86400 IN NS arin.authdns.ripe.net.
;; Query time: 0 msec
;; SERVER: ::1#53(::1)
;; WHEN: Wed Oct 20 10:24:16 CEST 2021
;; MSG SIZE rcvd: 171
Side note: my OS packaging recently updated NSD for some reason and reverted the fixed version I had deployed (my issue). It may be worthwhile to do a point release just to get this fix into the distribution downstreams, only because it's a crash, and at my scale of zones a crash takes a long time to recover from.
This issue should be fixed in 4.3.8.