Making the DNS more private with QNAME minimisation
Minimising the query name
DNS queries can contain a lot of sensitive information about internet users. Query name minimisation (qmin) limits the information revealed to that which is actually necessary for a DNS name server to answer the query. We have done an initial study of qmin deployment and the associated challenges, and published our research at the Passive and Active Measurement (PAM) conference. In this post, we summarise the highlights.
Adoption of qmin is increasing slowly but steadily
Implementation of qmin differs, depending on which resolver software is used
Qmin can slightly increase the query load and lookups can fail in a few cases
So how are DNS queries privacy sensitive?
When an application (e.g. a browser or an e-mail client) uses the DNS to look up the IP address of a domain name, it uses the domain’s full name. As a result, if you point your browser to www.netflix.com, that reveals to the DNS that you are likely interested in watching films. Similarly, a query to www.nos.nl reveals that you are probably interested in the latest news from the Dutch public broadcasting company. While those queries are probably not very privacy sensitive, a query for some-bad-disease.example.nl has the potential to reveal sensitive medical information.
As the DNS operator for the .nl TLD, SIDN receives more than a billion DNS queries a day for all sorts of different domain names. Our name servers don’t know the IP address of either example.nl or some-bad-disease.example.nl; they merely refer the querying resolver to the name servers for example.nl, disregarding the some-bad-disease label. A resolver could therefore simply ask us for the name servers for example.nl, without revealing the full query name (some-bad-disease.example.nl). What we see in practice, however, is that more than 50 per cent of queries to the .nl name servers contain more information than necessary. The reasons for that are largely historical.
Minimising the query name
Query name minimisation (qmin), an IETF standard introduced in 2016 (RFC 7816), describes how recursive resolvers should limit the information revealed to the minimum. So, in the example above, a resolver should query us only for example.nl and only for its name servers, not for the IP address.
While that sounds simple enough, the concept of qmin is a little more complicated in practice. The main reason is that delegations can exist at any level. For example, a domain a.b.c.d.e.example.com might have name server delegations at any (or none) of the subdomains. The cut-off point where authority over one part of the domain ends, and another name server’s authority begins, is known as a zone cut. Consequently, qmin implementations theoretically have to query for the domain iteratively, increasing the length of the domain by one step each time, until a delegation is encountered.
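The iterative walk described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `minimised_walk` and the `zone_cuts` set are hypothetical, and a real resolver learns about delegations from referrals rather than from a set handed to it in advance.

```python
def minimised_walk(qname, start_zone, zone_cuts):
    """Return (zone_asked, name_revealed) pairs for a minimising resolver.

    zone_cuts is a toy stand-in for the delegations a real resolver would
    discover through referrals. qname is assumed to lie below start_zone.
    """
    labels = qname.rstrip(".").split(".")
    current_zone = start_zone.rstrip(".")
    steps = []
    # Start with one label more than the zone we already know about.
    first = len(current_zone.split(".")) + 1
    for n in range(first, len(labels) + 1):
        candidate = ".".join(labels[-n:])
        steps.append((current_zone, candidate))
        # A delegation at this name is a zone cut: the next, longer name
        # is sent to the name servers of the newly found zone.
        if candidate in zone_cuts:
            current_zone = candidate
    return steps


for zone, name in minimised_walk(
    "a.b.c.example.com", "com", {"example.com", "c.example.com"}
):
    print(f"ask servers of {zone!r} about {name!r}")
```

In this toy run, the servers for com only ever see example.com, and the full name is revealed only to the servers for c.example.com.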
Because the standard is more than two years old, and because some resolver software already has a qmin implementation, we were curious to know how far qmin deployment has progressed on the internet.
We detected qmin support by exploiting the fact that a non-qmin resolver misses any delegation that occurs at one of the labels before the terminal label. So, if we delegate one of those intermediate labels to a different name server, and that name server holds a different record for the terminal label, qmin resolvers will obtain a different answer from that obtained by non-qmin resolvers.
We therefore scheduled a series of longitudinal measurements using the RIPE Atlas measurement network. We performed a lookup with every resolver used by the RIPE Atlas probes, repeating a query for a.b.qnamemin-test.domain.example every hour, where the domain name in question was under our control. A non-qmin resolver sends a query for the full qname to the authoritative name server for qnamemin-test.domain.example, and ends up with a TXT reply containing the text “qmin NOT enabled”. A qmin resolver first sends a query for only the shorter name b.qnamemin-test.domain.example, omitting the leftmost label, to the authoritative name server for qnamemin-test.domain.example. For that minimised query, it receives a delegation to a different name server, which returns a TXT record containing the text “qmin enabled”. In total, we tested over 18,000 resolvers.
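As a rough sketch of how such replies could be turned into an adoption figure, the Python below classifies the TXT strings and computes the share of qmin resolvers. The function names and result labels are our own illustration, the matching relies on the two texts quoted above, and a particular test domain may return different wording.

```python
def classify_txt(txt_strings):
    """Map the TXT strings of one reply to 'qmin', 'no-qmin' or 'unknown'."""
    text = " ".join(txt_strings).lower()
    # Check the negative text first so the two cases cannot be confused.
    if "qmin not enabled" in text:
        return "no-qmin"
    if "qmin enabled" in text:
        return "qmin"
    return "unknown"


def adoption_share(replies):
    """Return the fraction of classifiable replies that indicate qmin.

    replies is an iterable of TXT string lists, one per tested resolver.
    Replies that cannot be classified (timeouts, errors) are ignored.
    """
    verdicts = [classify_txt(r) for r in replies]
    known = [v for v in verdicts if v != "unknown"]
    if not known:
        return 0.0
    return known.count("qmin") / len(known)


sample = [["qmin enabled"], ["qmin NOT enabled"], ["qmin NOT enabled"], []]
print(f"{adoption_share(sample):.1%} of classifiable resolvers use qmin")
```

Ignoring unclassifiable replies is a design choice of this sketch, not necessarily how the published numbers were computed.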
The figure below shows the adoption of qmin since April 2017. In our study period, adoption grew from 0.7 per cent to 16.8 per cent. The huge increase in April 2018 was mostly due to RIPE Atlas probes that rely on Cloudflare’s open resolver, which supports qmin by default. Updated numbers are published daily here.
As well as using RIPE Atlas, we measured the adoption of qmin by checking open resolvers and the .nl name servers. First, we queried a list of 1.2 million open resolvers which map to 110,000 unique IPs, observed by our authoritative name servers. Out of those, only 1.6 per cent were found to support qmin.
Within .nl, the situation looks slightly better. Since June 2017, the share of queries that ask only for a second-level domain name has increased from around 33 per cent to 44 per cent today. We assume that a resolver supports qmin if most of its queries contain only the two rightmost labels of a domain name (example.nl in some-bad-disease.example.nl). You can find daily updated statistics on our statistics website stats.sidnlabs.nl.
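The heuristic can be sketched as follows in Python. The article only says "mostly", so the `threshold` of 0.9 is an assumption made for illustration, as are the function names and the simple `(resolver_ip, qname)` input format.

```python
from collections import defaultdict


def label_count(name):
    """Count the labels in a domain name, ignoring a trailing dot."""
    return len([label for label in name.rstrip(".").split(".") if label])


def likely_qmin_resolvers(queries, zone="nl", threshold=0.9):
    """Return resolver IPs whose queries are mostly minimised for `zone`.

    queries is an iterable of (resolver_ip, qname) pairs as seen at the
    zone's authoritative name servers. threshold is a hypothetical cut-off.
    """
    max_labels = label_count(zone) + 1  # e.g. example.nl for the nl zone
    totals = defaultdict(int)
    minimised = defaultdict(int)
    for resolver_ip, qname in queries:
        totals[resolver_ip] += 1
        if label_count(qname) <= max_labels:
            minimised[resolver_ip] += 1
    return {
        ip for ip, total in totals.items()
        if minimised[ip] / total >= threshold
    }


observed = [
    ("192.0.2.1", "example.nl"),
    ("192.0.2.1", "sidn.nl"),
    ("192.0.2.2", "www.example.nl"),
    ("192.0.2.2", "example.nl"),
]
print(likely_qmin_resolvers(observed))
```

A resolver that happens to look up only second-level names would be counted as well, which is one reason the heuristic is an assumption rather than proof of qmin support.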
According to the standard, a resolver that implements qmin should iteratively increase the name length by one label, querying for the NS type, until it reaches the full name. Then it should switch to the original query type. In practice, however, we see that qmin is implemented in many different ways, but never quite as described in the standard.
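To make the behaviour described by the standard concrete, here is a small Python sketch that lists the queries a resolver would send for a name below a known zone. The function `query_plan` and its parameters are hypothetical. The optional `max_minimised` cap and the `probe_type` argument are our own knobs for imitating implementations that deviate from the standard, and do not reflect the settings of any particular resolver.

```python
def query_plan(qname, qtype, zone, max_minimised=None, probe_type="NS"):
    """Return the (name, type) queries sent when resolving qname.

    With max_minimised=None, every intermediate name is probed with
    probe_type before the full name is asked with the original qtype.
    A number limits how many minimised queries are sent before the
    resolver skips straight to the full name.
    """
    labels = qname.rstrip(".").split(".")
    zone_len = len(zone.rstrip(".").split("."))
    plan = []
    # Add one label per step, stopping short of the full name.
    for n in range(zone_len + 1, len(labels)):
        if max_minimised is not None and len(plan) >= max_minimised:
            break
        plan.append((".".join(labels[-n:]), probe_type))
    # The final query carries the full name and the original type.
    plan.append((".".join(labels), qtype))
    return plan


for name, rrtype in query_plan("a.b.c.example.com", "A", "com"):
    print(rrtype, name)
```

Calling `query_plan("a.b.c.example.com", "A", "com", max_minimised=1)` would instead ask for example.com once and then jump to the full name.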
We used RIPE Atlas again and let each probe’s resolver query a test domain with 24 labels under our control. By collecting the queries directly from the authoritative name server, we see that resolvers that implement qmin do not send one query for each single label, but usually send a query for the first few labels and then skip to the last label of the query name. Knot and BIND start off by sending NS queries, but Unbound queries for A records only. Resolvers adopt such strategies to reduce the query load and avoid getting stuck in misconfigured zones. The following table shows how many resolvers we observed exhibiting the different behaviours.
Finally, we were curious to know whether qmin has any impact on performance or query success rate. We set up a testbed, with resolvers running Unbound (version 1.8.0), Knot (3.0.0) and BIND (9.13.3), and queried the million most popular domain names, as listed in the Cisco Umbrella top list. Unbound and BIND both support strict and relaxed versions of qmin, so we tested both versions in each case. In strict mode, resolvers do not fall back to sending the full QNAME to potentially broken name servers. We started our measurements with an empty cache.
Depending on the implementation, we measured an increase in queries of between 19 per cent in Unbound and 26 per cent in BIND. The actual increase with production resolvers is probably lower, because caching will reduce the overhead.
We also noticed that we were unable to resolve some of the domain names when running the resolvers with the strictest implementations of qmin: up to 3 per cent with Unbound and up to 5 per cent with BIND. Fortunately, however, we did not measure any increase in failed lookups when Unbound was run in relaxed mode, and only a 0.5 per cent increase with BIND in relaxed mode. The table below summarises our measurement results. Note that the relatively high number of errors, even with qmin disabled, is a property of the Umbrella top list.
Measure qmin yourself
You can test whether your resolver supports qmin by querying the domain below, using the command line tool dig, which relies on the same technique:
dig a.b.qnamemin-test.internet.nl TXT
Conclusion: the DNS is becoming more private
Our results show that qmin is on the rise, which we believe is a good thing. The slight decrease in performance is an acceptable trade-off for an increase in privacy for internet users, and the performance penalty in production is probably lower than observed in our tests.
You can find more details about the measurements in our paper, which was produced in collaboration with Quirin Scheitle (Technical University of Munich), as well as Willem Toorop and Ralph Dolmans (both NLnet Labs) and Roland van Rijswijk-Deij (NLnet Labs and University of Twente). We have also made our award-winning data sets public.