Clarification regarding empty non-terminals (RFC 8020)

Question

jonasbb opened this issue 4 years ago

Hi,

I was reading your paper "GRooT: Proactive Verification of DNS Configurations" and have a question about how the presented RRLookup would react to encountering empty non-terminals (ENTs) (see RFC 7719 and RFC 8020).
If this is not a suitable place for this discussion, I am happy to continue over e-mail. I searched the paper and the repository and couldn't find an answer so far.

Here is, in short, why I think the paper does not properly model ENTs. I will try to expand on each of the points a bit. An ENT is a domain name which exists but has no resource records attached to it. An example might be _tcp.example.com., assuming that _jabber._tcp.example.com. exists and has an SRV record.
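To make the definition concrete, here is a minimal sketch that lists the ENTs of a zone from its owner names. The helper names (`labels`, `empty_non_terminals`) are made up for this illustration and are not part of GRooT; the sketch assumes every owner name is absolute and lies at or below the apex.

```python
def labels(name):
    """Split an absolute domain name into labels, root-most label first."""
    return tuple(reversed(name.rstrip(".").lower().split(".")))


def empty_non_terminals(owner_names, apex):
    """Return names below the apex that own no records but have a descendant that does."""
    apex_len = len(labels(apex))
    owners = {labels(n) for n in owner_names}
    ents = set()
    for owner in owners:
        # Walk every proper ancestor of the owner that lies strictly below the apex.
        for depth in range(apex_len + 1, len(owner)):
            ancestor = owner[:depth]
            if ancestor not in owners:
                ents.add(".".join(reversed(ancestor)) + ".")
    return ents


zone_owners = {"example.com.", "_jabber._tcp.example.com."}
print(empty_non_terminals(zone_owners, "example.com."))  # {'_tcp.example.com.'}
```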

The RRLookup as given in Figure 3 does not seem to account for ENTs as the ExactMatch rule does not apply, since there is no exact match. The other rules also do not apply, as a zone can have ENTs without having wildcards, DNAMEs or delegations.
A resource record cannot model an ENT, as a record always needs to have a type assigned, so there is no way to model an empty record. From the paper:

We model a resource record r ∈ record = ⟨d, t, c, τ, a, b⟩ as a tuple with six components: [...] (2) a record type t ∈ type = {A, AAAA, MX, NS, DNAME, CNAME, SOA, . . .} representing the kind of data the record holds (e.g., AAAA for an IPv6 address)
A well-formed zone (Appendix A) is allowed to have ENTs. No rule prohibits it.

Longer Explanation

Let's assume we have a simple zone like this. I omitted the TTL as it is not important. The parts abbreviated with ... do not matter, except that they should be a valid record of course.

example.com. IN SOA ...
example.com. IN NS ns1.dns.net.
example.com. IN NS ns2.dns.net.

_jabber._tcp.example.com. IN SRV ...

A query like _tcp.example.com. IN SRV should return NOERROR and no data (i.e., ⟨Ans, ∅⟩).
(By the way, you can test this with google.com, which has a _jabber._tcp name, so _tcp is also an ENT there.)

The zone is well-formed according to Appendix A. Rules (1)-(3) hold for the example. Rules (4)-(9) and (11) are not applicable since there is neither a CNAME nor a DNAME record. Rule (10) also holds, as the only NS records also have an SOA record.

A zone is a set of records. It does not cover the fact that some domain names exist but have no records.

It is also not possible to create a record which only exists for the domain name, but without associated types or answer, as would be necessary for an ENT. This is a direct result of the definition of a record = ⟨d, t, c, τ, a, b⟩.

The ZoneLookup function will call RRLookup with a set of records. The records for example.com. have a higher rank than the record for _jabber._tcp.example.com., since 𝓘 (Match(r, q)) is true for the former, but not the latter. Thus, taking the maximum under the lexicographical ordering will select the records for example.com..

Now only the fallback rule for RRLookup applies. There is no exact match since _tcp.example.com. != example.com., and there are no wildcards which could match, no DNAME rewrite, and no delegations. Thus, RRLookup will return NXDOMAIN.
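The walk-through above can be replayed with a toy lookup. This is a rough sketch, not the paper's ZoneLookup / RRLookup: it only models the exact-match rule and the fallback, since the example zone has no wildcards, DNAMEs, or delegations. The function name `toy_lookup`, the (owner, type) record shape, and the REFUSED result for out-of-zone queries are assumptions made for this illustration.

```python
def labels(name):
    """Split an absolute domain name into labels, root-most label first."""
    return tuple(reversed(name.rstrip(".").lower().split(".")))


def toy_lookup(zone, qname, qtype):
    """Toy lookup over a list of (owner, type) pairs: exact match or fallback only."""
    q = labels(qname)
    # Records whose owner name is the query name or one of its ancestors.
    candidates = [(o, t) for o, t in zone if q[: len(labels(o))] == labels(o)]
    if not candidates:
        return ("REFUSED", [])  # query is outside the zone (this toy's choice)
    # Keep the records with the longest matching owner name.
    longest = max(len(labels(o)) for o, _ in candidates)
    best = [(o, t) for o, t in candidates if len(labels(o)) == longest]
    if longest == len(q):
        # Exact match: answer with the records of the requested type.
        return ("NOERROR", [r for r in best if r[1] == qtype])
    # Fallback: no rule applied, so the name is reported as nonexistent.
    return ("NXDOMAIN", [])


zone = [
    ("example.com.", "SOA"),
    ("example.com.", "NS"),
    ("example.com.", "NS"),
    ("_jabber._tcp.example.com.", "SRV"),
]
print(toy_lookup(zone, "_tcp.example.com.", "SRV"))  # ('NXDOMAIN', [])
print(toy_lookup(zone, "_jabber._tcp.example.com.", "SRV"))
# ('NOERROR', [('_jabber._tcp.example.com.', 'SRV')])
```

The first query hits the fallback even though a name underneath _tcp.example.com. exists in the zone.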

The NXDOMAIN is a violation of RFC 8020 "NXDOMAIN: There Really Is Nothing Underneath". NXDOMAIN means that the name and everything underneath does not exist. However, _jabber._tcp.example.com. exists and is underneath _tcp.example.com..

ZoneLookup / RRLookup will find the correct answer when presented with a query for _jabber._tcp.example.com.. That is not how all resolvers work though. RFC 7816: QNAME Minimization describes how resolvers can instead query label by label. Such a QNAME minimization resolver would first ask about _tcp.example.com., receive NXDOMAIN, and therefore conclude that the answer to _jabber._tcp.example.com. is NXDOMAIN as well.
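A small simulation shows how the label-by-label walk is cut short. This is a simplified sketch of the idea only, not an implementation of RFC 7816: the function `minimized_walk`, the response tables, and the assumption that the resolver already knows the zone apex are all invented for illustration.

```python
def labels(name):
    """Split an absolute domain name into labels, root-most label first."""
    return tuple(reversed(name.rstrip(".").lower().split(".")))


def minimized_walk(qname, apex, rcode_for):
    """Ask about one more label at a time; stop as soon as NXDOMAIN is seen."""
    q, apex_len = labels(qname), len(labels(apex))
    asked = []
    rcode = "NOERROR"  # the apex itself is assumed to exist
    for depth in range(apex_len + 1, len(q) + 1):
        name = ".".join(reversed(q[:depth])) + "."
        asked.append(name)
        rcode = rcode_for(name)
        if rcode == "NXDOMAIN":
            # RFC 8020: nothing exists at or below this name, so give up.
            break
    return rcode, asked


# Responses as produced by the model described above (ENT answered with NXDOMAIN).
model_answers = {"_tcp.example.com.": "NXDOMAIN", "_jabber._tcp.example.com.": "NOERROR"}
# Responses in which the ENT is answered with NOERROR and no data.
ent_aware_answers = {"_tcp.example.com.": "NOERROR", "_jabber._tcp.example.com.": "NOERROR"}

print(minimized_walk("_jabber._tcp.example.com.", "example.com.", model_answers.get))
# ('NXDOMAIN', ['_tcp.example.com.'])
print(minimized_walk("_jabber._tcp.example.com.", "example.com.", ent_aware_answers.get))
# ('NOERROR', ['_tcp.example.com.', '_jabber._tcp.example.com.'])
```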

I hope my reasoning is clear from the description above and I didn't make a mistake with the formalism.

Siva Kesava R Kakarla · Answer 1 · Thu Aug 20 2020 04:09:46 GMT+0800 (China Standard Time)

Hi, @jonasbb ,

Thanks for bringing this case to our attention. We are glad that a DNS researcher took an interest in the paper and read through it, including the appendix.

As you said, the formal model currently returns NXDOMAIN instead of NOERROR and no data (i.e., ⟨Ans, ∅⟩). A way to fix it is to change the Rank definition and add another case for RRLookup.
Rank would have to be a five-tuple where the first entry 𝓘 (Match(r, q)) has to be replaced by two entries, 𝓘 (ExactMatch(r, q) ) and 𝓘 (WildcardMatch(r, q) ∨ ProperPrefix(r, q)).

ExactMatch(r, q) would be the case when the query and the resource record have exactly the same domain name (dn(r) = dn(q)).
WildcardMatch(r, q) would be the case when the query matches the wildcard record (dn(q) ∈∗ dn(r)).
ProperPrefix(r, q) would be the case when the domain name of the query is a proper prefix of the domain name of the resource record, or the other way around (dn(r) < dn(q) ∨ dn(q) < dn(r)).

RRLookup will have another case with ⟨Ans, ∅⟩ if dn(q) < dn(R).
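The extra case can be sketched as a check that runs before falling back to NXDOMAIN. This only illustrates the condition dn(q) < dn(r), read as "the query name is a proper ancestor of some record's owner name"; it is not the revised Rank definition, and the names `is_proper_ancestor` and `fallback_rcode` are hypothetical.

```python
def labels(name):
    """Split an absolute domain name into labels, root-most label first."""
    return tuple(reversed(name.rstrip(".").lower().split(".")))


def is_proper_ancestor(qname, owner):
    """True if qname is strictly above owner in the DNS tree (dn(q) < dn(r))."""
    q, o = labels(qname), labels(owner)
    return len(q) < len(o) and o[: len(q)] == q


def fallback_rcode(zone_owners, qname):
    """Decide the result when no exact match, wildcard, DNAME, or delegation applies."""
    if any(is_proper_ancestor(qname, owner) for owner in zone_owners):
        return "NOERROR"  # ENT: the name exists but holds no data, i.e. <Ans, {}>
    return "NXDOMAIN"


owners = {"example.com.", "_jabber._tcp.example.com."}
print(fallback_rcode(owners, "_tcp.example.com."))  # NOERROR
print(fallback_rcode(owners, "nope.example.com."))  # NXDOMAIN
```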

I believe this would fix the issue and does not alter the answer to other combinations of resource records. Please, let us know if you find any other such scenarios. We will be happy to chat if you are looking to expand the model to cover more types or even DNSSEC. We will also appreciate any feedback if you tried the tool.

Jonas Bushart · Answer 2 · Thu Aug 20 2020 06:39:08 GMT+0800 (China Standard Time)

Hi @SivaKesava1,
that seems to work as far as I checked.

Another option could be to forbid ENTs in the well-formedness check for zones. This could also help with a DNSSEC extension, as NSEC records have trouble with ENTs since there is nothing they can sign.

I know that some authoritative DNS operators went this route. They create a synthetic TXT record with a message like "ENT" or "RFC 8020". This avoids the issue of ENTs completely and makes DNSSEC easier as now there is a record which can be signed and which can have the appropriate negative proofs too.

If you want to ensure that the answers of the model do not change due to this, you could maybe use a type which can only occur in records but cannot be used in a query. This should be enough for the case without DNSSEC.
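A rough sketch of this idea: fill every ENT with a placeholder record whose type may appear in a zone but never in a query, so that a plain exact-match lookup already yields NOERROR with an empty answer. The type name `ENT-MARKER` and both function names are hypothetical and chosen only for this illustration.

```python
ENT_MARKER = "ENT-MARKER"  # hypothetical record-only type, not a real RR type


def labels(name):
    """Split an absolute domain name into labels, root-most label first."""
    return tuple(reversed(name.rstrip(".").lower().split(".")))


def add_ent_markers(zone, apex):
    """Return the zone plus one marker record for every empty non-terminal."""
    owners = {labels(o) for o, _ in zone}
    apex_len = len(labels(apex))
    markers = set()
    for owner in owners:
        for depth in range(apex_len + 1, len(owner)):
            if owner[:depth] not in owners:
                markers.add((".".join(reversed(owner[:depth])) + ".", ENT_MARKER))
    return list(zone) + sorted(markers)


def exact_lookup(zone, qname, qtype):
    """Exact-match lookup only; the marker type is rejected in queries."""
    if qtype == ENT_MARKER:
        raise ValueError("the marker type must never appear in a query")
    at_name = [r for r in zone if labels(r[0]) == labels(qname)]
    if not at_name:
        return ("NXDOMAIN", [])
    return ("NOERROR", [r for r in at_name if r[1] == qtype])


zone = [("example.com.", "SOA"), ("example.com.", "NS"),
        ("_jabber._tcp.example.com.", "SRV")]
filled = add_ent_markers(zone, "example.com.")
print(exact_lookup(filled, "_tcp.example.com.", "SRV"))  # ('NOERROR', [])
print(exact_lookup(filled, "nope.example.com.", "SRV"))  # ('NXDOMAIN', [])
```

Because no query can carry the marker type, the marker record never shows up in an answer; it only makes the name exist.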

Siva Kesava R Kakarla · Answer 3 · Thu Aug 20 2020 06:57:04 GMT+0800 (China Standard Time)

Thanks for checking and for the insights too about DNSSEC.
Using a type that can occur only in records is also a good idea to fix the ENTs case.

Siva Kesava R Kakarla · Answer 4 · Mon Oct 12 2020 11:54:15 GMT+0800 (China Standard Time)

Hi @jonasbb,

We have updated the paper to handle empty non-terminals as per the RFCs, using a type that can occur only in records. The fix that I suggested earlier would create issues in the glue records case, so we decided to use an extra type. The updated paper is available here.