|
One of the most prominent denial of service attacks in recent months was one that occurred in March 2013 between Cloudflare and Spamhaus. One writeup of this attack can be found here. I’m not sure about the claim that this attack “almost broke the Internet,” but with a peak volume of attack traffic of some 120Gbps, it was a very significant attack nevertheless.
How did the attackers generate such massive volumes of attack traffic? The answer lies in the Domain Name System (DNS). The attackers asked about domain names, and the DNS system answered: something we all do all the time on the Internet. So how can the conventional activity of translating a domain name into an IP address be turned into a massive attack?
There are a few aspects of the DNS that make it a readily coercible means of generating an attack:

- The DNS commonly uses UDP as its transport, so a query can carry a spoofed source address, and the response will be sent to the spoofed address rather than back to the real querier.
- A small query can generate a far larger response, providing the attacker with traffic amplification.
- There is a very large population of open recursive resolvers that will answer queries from anyone.
- The attack traffic consists of entirely conventional DNS responses, so it cannot readily be distinguished from legitimate traffic and filtered.
So the combination of these four factors produces a comprehensive vulnerability for the Internet. In performing experiments on the behaviour of the DNS, we see a background level of DNS “probes” that contain a query for “ANY”, often querying the domain ISC.ORG. In this case the original UDP request of 64 bytes generates a UDP response of 3,475 bytes, an amplification factor of 54. If an attacker sends this UDP query to 100 of these open resolvers each second, using a source address of the intended victim, then the attacker generates a query traffic volume of some 51Kbps, while the victim receives an incoming traffic load of 2.78Mbps. If you then enlist a bot army to replicate this simple attack one thousand-fold, the attack traffic volume exceeds a gigabit per second. With DNSSEC the response sizes get larger, and the unwitting accomplices in such attacks can expand to include the authoritative name servers of DNSSEC-signed domains.
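The back-of-envelope arithmetic in the paragraph above can be sketched directly (the sizes and rates are those quoted in the text; the function names are illustrative, not from any real tool):

```python
# Figures from the text: a 64 byte "ANY" query for ISC.ORG draws
# a 3,475 byte UDP response from an open resolver.
QUERY_BYTES = 64
RESPONSE_BYTES = 3475

def amplification_factor(query_bytes, response_bytes):
    """Ratio of reflected response size to query size."""
    return response_bytes / query_bytes

def attack_rates(queries_per_sec, bots=1):
    """Return (query_bps, attack_bps) for the given query rate."""
    query_bps = queries_per_sec * bots * QUERY_BYTES * 8
    attack_bps = queries_per_sec * bots * RESPONSE_BYTES * 8
    return query_bps, attack_bps

q_bps, a_bps = attack_rates(100)          # one attacker, 100 queries/sec
print(round(amplification_factor(QUERY_BYTES, RESPONSE_BYTES)))  # 54
print(q_bps / 1000)                       # 51.2 Kbps of query traffic
print(a_bps / 1e6)                        # 2.78 Mbps at the victim

_, bot_army = attack_rates(100, bots=1000)  # replicated a thousand-fold
print(bot_army / 1e9)                       # 2.78 Gbps of attack traffic
```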
The problem is that, in flight, the traffic looks like all other traffic. These are simple DNS responses, and the network carries DNS responses as one of the more common packet types. So simple filtering is simply not an option, and we need to look deeper to see how we could mitigate this rather worrisome vulnerability. This has been a long-standing issue. (For example, this was a presentation from May 2006 on this topic at an IETF meeting. The only difference is that the number of open resolvers has subsequently jumped from 500,000 to in excess of 25 million!)
There have been a number of threads of discussion on this topic.
One thread of conversation goes along the lines that if everyone implemented the measures described in BCP38, endpoints would be prevented from emitting DNS query packets with a false source address, thereby preventing these reflection attacks from being mounted using the DNS. Of course, this is a long-standing conversation that is older than BCP38 itself. BCP38 is now 13 years old as a document, and the somewhat worrisome observation is that nothing much appears to have happened in terms of improving the situation with source address spoofing over those 13 years, so there is not a lot of optimism that anything will change in the coming months, or even years.
There is a conversation thread that says that resolvers should implement response rate limiting (RRL), and silently discard repetitive queries that exceed some locally configured threshold. While this is relatively effective in the case of authoritative name servers, it is less effective in the face of a massive pool of open recursive resolvers, as in this latter case the query load can be spread across the entire pool, so that each resolver may not experience a detectable level of repeated queries. It is also possible to use a very large pool of authoritative name servers in the same manner. However, this consideration does not weaken the advice that authoritative name servers should implement RRL in any case. (There is an informative presentation by Randy Bush at the 2013 APRICOT conference, illustrating the before and after state of the application of RRL on an authoritative name server.)
Another thread of conversation is to shut down the open recursive resolvers. The Open Resolver Project is working on a “name and shame” approach to the problem, and is hopeful that, by allowing individual resolver operators to check their own status, these resolvers will be closed down. Like the universal adoption of BCP38, it’s hard to be overly optimistic about this approach. Part of the issue is that there is a large volume of unmanaged or semi-managed systems connected to the Internet, and the vulnerabilities they create are not necessarily known to the parties who deployed these systems in the first place. For example, one way to create more robust “plug and play” systems is to reduce the amount of environmental configuration that needs to be loaded into such systems. Equipping such standalone units with their own DNS resolver appears to be a way to remove an additional configuration element from the unit. The downside is that if these units are configured as open recursive resolvers, they become part of the overall problem of the massive population of open resolvers on today’s Internet.
The behaviour of the DNS where a small query generates a large response appears to be intrinsic to the DNS, and particularly to DNSSEC. If we want some form of security in the DNS, so that the client can be assured that the response it receives is authentic and current, and has not been tampered with in any way, then the overhead of cryptographic signature blocks, and the space these signature blocks take up, appears to be an intrinsic part of the DNS’ security architecture. We want a lightweight, fast DNS, so we like using UDP, coupled with a ubiquitous distribution of recursive resolvers and their associated caches; but we also want DNS responses that can be verified, so we like DNSSEC. So the responses get larger, and the outcome is that the DNS is a highly effective platform for massive traffic attacks, with very limited space to mitigate the associated risks.
If we want to close the door on using the DNS to mount large scale attacks, there does not appear to be much space left to manoeuver here.
However, there is a conversation that has not quite petered out yet. That conversation is about the use of UDP in the DNS.
The original specification of the name system (RFC1123) allowed for the use of both UDP and TCP as the transport service for DNS queries and responses. The relevant part of this specification reads:
”... it is also clear that some new DNS record types defined in the
future will contain information exceeding the 512 byte limit that
applies to UDP, and hence will require TCP. Thus, resolvers and
name servers should implement TCP services as a backup to UDP
today, with the knowledge that they will require the TCP service
in the future.”

Why 512 bytes? The 512 byte limit was based on the IPv4 Host Requirements Specification (RFC1122), which requires that all IPv4 systems accept an IP packet that is at least 576 octets in size. Allowing 64 bytes for the IP and UDP packet headers, including room for IP options, this implies that the maximum UDP payload that will be accepted by all IPv4 hosts is restricted to 512 bytes.
It should be noted that it is theoretically possible that there are links that do not support IP packets of 576 bytes in size, so even these 576 byte IP packets may possibly be fragmented in flight. The limit being referred to here is the largest (possibly reassembled) packet that a host will assuredly accept.
Now of course it is possible to generate larger packets in IPv4, up to a theoretical maximum of 65,535 bytes, which accommodates a UDP payload of 65,507 bytes, but such larger packets will probably be fragmented in flight. When this is combined with typical firewall behaviour, the trailing packet fragments may not be passed onward to a client at all, as many firewall configurations use acceptance rules based on UDP and TCP port addresses. As the trailing fragments of a fragmented IP datagram have no UDP or TCP packet header, this leaves the firewall with a quandary. Should the firewall accept all IP fragments? In this case the security role of the firewall may be compromised through the transmission of fragments that form part of some form of hostile attack. Or should the firewall discard all IP fragments? In this case a sender of a large packet should be aware that if there is fragmentation, then the trailing packet fragments may not be passed to the receiver. So, with the two considerations that hosts are not required to accept IP datagrams larger than 576 bytes and that firewalls may discard trailing fragments of a fragmented IP datagram, the original DNS response to this situation was to limit all of its UDP responses to 512 bytes, and to always use TCP as the backup plan if the DNS response was larger than 512 bytes.
However, clients may not know in advance that a DNS response is larger than 512 bytes in size. To signal to a client that it should use TCP to retrieve the complete DNS response, the DNS resolver will respond in UDP with a partial response, and set the “truncated” bit in the response.
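A resolver discovers that it must retry over TCP by inspecting the truncated (TC) bit in the header of the UDP response. As a minimal sketch, the check on a raw DNS message looks like this (the fabricated header bytes are illustrative, not captured traffic):

```python
import struct

def is_truncated(dns_message: bytes) -> bool:
    """Check the TC (truncated) bit in a raw DNS message.

    The DNS header is 12 bytes; the TC bit is mask 0x0200 of the
    16-bit flags field that follows the 2-byte message ID.
    """
    if len(dns_message) < 12:
        raise ValueError("not a full DNS header")
    (flags,) = struct.unpack_from("!H", dns_message, 2)
    return bool(flags & 0x0200)

# A minimal fabricated header: ID=0x1234, flags with QR (0x8000)
# and TC (0x0200) set, one question, no answer records.
hdr = struct.pack("!HHHHHH", 0x1234, 0x8200, 1, 0, 0, 0)
print(is_truncated(hdr))  # True: the client should retry over TCP
```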
We continued for some years with this approach. The DNS used UDP for the bulk of its transactions, which all fitted within the 512 byte limit, and for the relatively infrequent case where larger DNS responses were being generated, the UDP response was truncated, and the client was expected to repeat the question over a TCP connection.
This was not an altogether comfortable situation when we then considered adding security credentials to the DNS. The inclusion of digital signatures in DNS responses implied that very few DNSSEC responses would fit within this 512 byte limit. But if the process of switching to TCP was to respond to the UDP query with a UDP response that essentially said “Please use TCP”, then this adds a considerable delay to the DNS function. Each query would now involve a minimum of 3 round-trip intervals with the server, rather than the single round-trip interval for UDP (one round trip for the UDP query and truncated UDP response, one for the TCP connection establishment handshake, and one for the TCP query and response). The next refinement of the DNS was to include a way for a client to signal that it was able to handle larger DNS responses in UDP.
As pointed out in RFC5966:
“Since the original core specifications for DNS were written, the
Extension Mechanisms for DNS (EDNS0 [RFC2671]) have been introduced.
These extensions can be used to indicate that the client is prepared
to receive UDP responses larger than 512 bytes. An EDNS0-compatible
server receiving a request from an EDNS0-compatible client may send
UDP packets up to that client’s announced buffer size without
truncation.”
EDNS0 allows the UDP-based DNS response to grow to far larger sizes. The result of this extension is that the DNS now largely uses UDP, with EDNS0 used to allow for larger responses. TCP is now often regarded as a somewhat esoteric option, used only for zone transfer operations, and if a zone does not support zone transfers as a matter of local policy, it is commonly thought that the role of TCP is no longer essential.
Section 6.1.3.2 of the Host Requirements Specification that addresses requirements for applications and support (RFC1123) states:
DNS resolvers and recursive servers MUST support UDP, and SHOULD
support TCP, for sending (non-zone-transfer) queries.
In the world of standards specifications and so-called normative language, that “SHOULD” in the above text is different from a “MUST.” It’s a little stronger than saying “well, you can do that if you want to”, but it’s a little weaker than saying “You really have to do this. There is no option not to.” Little wonder that some implementors of DNS resolvers, and some folk who configure firewalls, came to the conclusion that DNS over TCP was an optional part of the DNS specification that need not necessarily be supported.
But DNS over TCP is not only a tool to allow for large DNS responses. If we review the preconditions for the use of the DNS in large-scale reflector attacks, the widespread support of large UDP responses, together with the relatively sparse deployment of BCP38, allows an attacker to mount a reflection attack by co-opting a large set of open resolvers into sending their responses to the target system, using UDP queries whose IP source address is the IP address of the intended victim.
TCP does not have the same vulnerability. If an attacker were to attempt to open a TCP session using the IP source address of the intended victim, the victim would receive a short IP packet (IP and TCP headers only, a 40 byte packet) with only the SYN and ACK flags set. As the victim system has no pre-existing state for this TCP connection, it will discard the packet. Depending on the local configuration, it may send a TCP RESET to the other end to indicate that it has no state, or the discard may be completely silent. This removes one of the essential preconditions for a reflector amplification attack: if the attack traffic with the spoofed source address of the intended victim uses a 40 byte TCP SYN packet, then the victim will receive only a 40 byte TCP SYN/ACK packet. The DNS attack amplification factor would be effectively removed.
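Using the packet sizes quoted earlier, the contrast between the two transports can be put in numbers (a simple illustrative calculation, not a model of a real attack):

```python
def amplification(sent_bytes, reflected_bytes):
    """Bytes delivered to the victim per byte sent by the attacker."""
    return reflected_bytes / sent_bytes

# UDP: a 64 byte "ANY" query can reflect a 3,475 byte response.
udp_gain = amplification(64, 3475)

# TCP: a spoofed 40 byte SYN reflects at most a 40 byte SYN/ACK
# (or nothing at all, if the discard is silent).
tcp_gain = amplification(40, 40)

print(round(udp_gain))  # 54-fold amplification
print(tcp_gain)         # 1.0: no amplification
```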
If the DNS represents such a significant vulnerability for the Internet through these UDP-based reflection attacks, then does TCP represent a potential mitigation? Could we realistically contemplate moving away from the ubiquitous use of EDNS0 to support large DNS responses in UDP, and instead use DNS name servers that limit the maximum size of their UDP responses and turn to TCP for larger responses?
Again, let’s turn to RFC5966:
“The majority of DNS server operators already support TCP and the
default configuration for most software implementations is to support
TCP. The primary audience for this document is those implementors
whose failure to support TCP restricts interoperability and limits
deployment of new DNS features.”
The question we are looking at here is, can we quantify the extent to which DNS resolvers are capable of using TCP for DNS queries and responses? How big is this “majority of DNS server operators” being referred to above?
The Experiment
We conducted an experiment using a modified DNS name server, where the maximum UDP packet size was configured to 512 bytes, and then set up a test where a simple query to resolve a DNS name would generate a response that could not fit within 512 bytes.
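The experiment used a modified name server; on an unmodified BIND server a broadly similar effect can be approximated with the `max-udp-size` option, as in this illustrative configuration fragment (note that BIND clamps this option to a minimum of 512 bytes, which is one reason the smaller limit used later in this experiment required a change to the server code):

```
options {
    // Limit the EDNS0 UDP response size the server will honour,
    // so responses larger than 512 bytes are truncated and the
    // client is pushed to retry over TCP. Illustrative only.
    max-udp-size 512;
    edns-udp-size 512;
};
```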
While this is relatively easy if the query includes DNSSEC, if we want to set up a condition where all DNS responses are larger than 512 bytes for a domain then we need to use a slightly different approach. The approach used in this iteration of the experiment is to use the DNS name alias function, the CNAME record.
Here is a sample zone:
$TTL 1h
@       IN SOA  nsx.dotnxdomain.net. research.apnic.net. (
                2013011406  ; Serial
                3600        ; Refresh
                900         ; Retry
                1           ; Expire
                1 )         ; Minimum
        IN NS   nsz1.z.dotnxdomain.net.
z1      IN A    199.102.79.186
*       IN A    199.102.79.186

4a9c317f.4f1e706a.6567c55c.0be33b7b.2b51341.a35a853f.59c4df1d.3b069e4e.87ea53bc.2b4cfb4f.987d5318.fc0f8f61.3cbe5065.8d9a9ec4.1ddfa1c2.4fee4676.1ffb7fcc.ace02a11.a3277bf4.2252b9ed.9b15950d.db03a738.dde1f863.3b0bf729
        IN CNAME 33d23a33.3b7acf35.9bd5b553.3ad4aa35.09207c36.a095a7ae.1dc33700.103ad556.3a564678.16395067.a12ec545.6183d935.c68cebfb.41a4008e.4f291b87.479c6f9e.5ea48f86.7d1187f1.7572d59a.9d7d4ac3.06b70413.1706f018.0754fa29.9d24b07c

33d23a33.3b7acf35.9bd5b553.3ad4aa35.09207c36.a095a7ae.1dc33700.103ad556.3a564678.16395067.a12ec545.6183d935.c68cebfb.41a4008e.4f291b87.479c6f9e.5ea48f86.7d1187f1.7572d59a.9d7d4ac3.06b70413.1706f018.0754fa29.9d24b07c
        IN A    199.102.79.187
The use of the combination of long label strings and a CNAME construct forces a large response, which, in turn, causes the server to send a truncated UDP response to a conventional query for the address record of the original domain name. The truncated UDP response should force the client resolver to open a TCP session with the name server and ask the same query again.
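To see why this construct guarantees truncation under a 512 byte UDP limit, we can estimate the uncompressed wire-format size of the answer section. This is a rough sketch: it ignores the question section, the appended zone origin, and DNS name compression, and the stand-in labels approximate the zone's 8-character hex labels:

```python
def wire_length(name: str) -> int:
    """Uncompressed wire-format length of a DNS name: one length
    byte per label plus the label bytes, plus the terminating root."""
    labels = [l for l in name.rstrip(".").split(".") if l]
    return sum(1 + len(l) for l in labels) + 1

# The experiment's owner and CNAME target names each use 24 labels
# of (mostly) 8 hex characters; use uniform stand-in labels here.
long_name = ".".join(["aabbccdd"] * 24)
owner = wire_length(long_name)  # 24 * 9 + 1 = 217 bytes

# Uncompressed CNAME RR: owner name + type/class/TTL/RDLENGTH
# (10 bytes) + target name as RDATA; then an A RR for the target
# (RDATA = 4 bytes for an IPv4 address).
cname_rr = owner + 10 + owner
a_rr = owner + 10 + 4
print(cname_rr + a_rr)        # 675 bytes of answer data alone
print(cname_rr + a_rr > 512)  # True: truncation is guaranteed
```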
To ensure that the authoritative name server directly processed every name query, we used unique labels for each presented experiment, and ensured that the DNSSEC-signed zones were also unique. This ensures that local DNS resolver caches cannot respond to these name queries.
The first question we are interested in is: how many clients were able to successfully switch to TCP following the receipt of a truncated response in UDP?
We conducted an experiment by embedding a number of test cases inside an online ad, and over 8 days at the end of July 2013 we presented these tests to 2,045,287 end clients. We used four experiments. Two experiments used a name where the query and response would fit within a 512 byte payload, as long as the query did not include a request for DNSSEC. One of these domain names was unsigned, while the other was signed. The other two experiments used the CNAME approach to ensure that the response would be larger than 512 bytes. Again, one zone was signed, while the other was unsigned.
Experiment | UDP queries | Truncated UDP responses | TCP responses | Truncated UDP to TCP fail |
Short, unsigned | 2,029,725 | 2 | 6 | 0 |
Short, signed | 2,037,563 | 1,699,935 (83.4%) | 1,660,754 (81.5%) | 39,101 (1.9%) |
Long, unsigned | 2,023,205 | 2,021,212 (99.9%) | 1,968,927 (97.3%) | 52,285 (2.6%) |
Long, signed | 2,033,535 | 2,032,176 (99.9%) | 1,978,396 (97.3%) | 53,780 (2.6%) |
This data appears to point to a failure to follow up a truncated UDP response with a TCP connection for some 2.6% of clients.
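As a quick check of the arithmetic, the percentages in the table can be reproduced from its raw counts (figures copied from the table above):

```python
# Raw counts for the two "long" experiments, copied from the table.
rows = {
    "Long, unsigned": dict(udp=2023205, tcp=1968927, fail=52285),
    "Long, signed":   dict(udp=2033535, tcp=1978396, fail=53780),
}

for name, r in rows.items():
    tcp_rate = 100 * r["tcp"] / r["udp"]    # clients that switched to TCP
    fail_rate = 100 * r["fail"] / r["udp"]  # clients that failed to switch
    print(name, round(tcp_rate, 1), round(fail_rate, 1))
    # Both experiments: ~97.3% switched to TCP, ~2.6% failed to.
```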
That level of failure to switch from a truncated UDP response to re-phrasing the query over TCP is large enough to be significant. The first question is whether this failure is due to some failure of the DNS authoritative name server or a failure of the client resolver. If the name server experiences a high TCP session load it will reject new TCP sessions, responding to incoming TCP session establishment packets with a RESET. We saw no evidence of this session overload behaviour in the packet traces gathered at the authoritative name server, so the TCP failure looks to be an issue closer to the client resolver than to the authoritative name server.
We can also look at this from the perspective of the set of visible resolvers. How many resolvers will switch to use TCP when they receive a UDP response that is truncated? Before looking at the results, it needs to be noted that the only resolvers that are exposed in this experiment are those resolvers that query our authoritative name server (visible resolvers). If a resolver is configured to use a recursive resolver, then its behaviour will not be directly exposed in this experiment. It should also be noted that even when the visible recursive resolver is forced to use TCP to query the authoritative name server, the recursive resolver may still relay the response back to its client via UDP, using EDNS0 to ensure that the larger UDP response is acceptable to the client.
The 2 million clients used a total of 80,505 visible resolvers. Some 13,483 resolvers, or 17% of these visible resolvers did not generate any TCP transactions with the authoritative name server. These 13,483 UDP-only resolvers made a total of 4,446,670 queries, and of these some 4,269,495 responses were truncated, yet none of these resolvers switched to TCP.
There is a second class of filtering middleware, which operates on incoming rather than outgoing traffic. In such cases the authoritative server will see an incoming TCP SYN packet to establish the DNS connection, and the server will respond with a SYN+ACK packet. Because this packet is blocked by the filtering middleware, it will never be passed through to the resolver client, and the TCP connection will not be established. This SYN-only behaviour was observed in just 337 resolvers, some 0.4% of the set of visible resolvers. These resolvers generated a total of 1,719,945 queries, and received 1,575,328 truncated UDP responses.
Why is the client-level TCP failure rate at some 2% - 3% of clients, while at the resolver level the TCP failure rate is some 17% of visible resolvers? There are at least three possible reasons for this.
Firstly, in some cases we observe service providers using DNS forwarder farms, where queries are spread across a number of query engines. When a DNS query is re-phrased using TCP, it may not use the same forwarder to make the query.
Secondly, we should factor in end client failover to another DNS resolver that can support DNS transactions over TCP. Most clients are configured with multiple resolvers, and when one resolver fails to provide a response the client will ask the same query of the second and subsequent resolvers in its resolver set. If any of the visible resolvers associated with the resolvers listed in the client’s resolver set are capable of using TCP, then at some stage we will see a TCP transaction at the authoritative name server. In the more prevalent case of resolver-level TCP failure, either the resolver itself is not capable of generating a DNS query using TCP (presumably due to limitations in the resolver software or its local configuration), or some network middleware is preventing the resolver from making outgoing TCP connections to port 53.
Thirdly, the distribution of end clients across the set of visible resolvers is not even: some resolvers, such as the set used by Google’s Public DNS service, serve some 7% of all end clients, while others serve a single end client. We observed some 53,000 experiments, out of a total of some 2 million, failing to complete a TCP-based DNS resolution, so the 13,483 visible resolvers that do not support TCP queries are entirely consistent in volume with this level of end client failure to resolve the experiment’s DNS label.
There is a slightly different way to look at this question. While we saw some 53,000 experiments that failed to complete the DNS resolution at all, how many experiments were affected by this deliberate effort to force resolvers to use TCP? How many clients were impacted in terms of longer DNS resolution time through the use of DNS resolvers that fail to switch to use TCP?
Experiment | UDP queries | Truncated UDP responses | TCP responses | Truncated UDP to TCP fail | Used TCP-Fail Resolvers |
Long, unsigned | 2,023,205 | 2,021,212 (99.9%) | 1,968,927 (97.3%) | 52,285 (2.6%) | 124,881 (6.1%) |
Long, signed | 2,033,535 | 2,032,176 (99.9%) | 1,978,396 (97.3%) | 53,780 (2.6%) | 129,555 (6.4%) |
This table shows that slightly more than 6% of all clients used a resolver that was unable to repeat the DNS query over TCP. Some 75,000 clients used an alternate resolver that was capable of performing the TCP query, while the remainder were unable to resolve the DNS name at all.
After running this initial experiment, we considered the use of the CNAME construct to inflate the DNS response to more than 512 bytes, and wondered if this additional DNS indirection created problems for some resolver clients. Another approach to coerce client resolvers to use TCP is to modify the name server code to drop its maximum UDP response size to 275 octets, so that the name server will truncate the UDP response for any response of 276 bytes or larger. In this way a DNS query for the short unsigned name would fit within the new UDP limit, but in all other cases the UDP response would be truncated.
The results we saw for this second experiment, which removed the CNAME entry, and used an authoritative name server with a 275 byte UDP payload limit, with three days of collected data, are summarized in the following table.
Experiment | UDP queries | Truncated UDP responses | TCP responses | Truncated UDP to TCP fail |
Short, unsigned | 704,458 | 0 | 3 | 0 |
Short, signed | 706,975 | 581,081 (82.2%) | 568,704 (80.4%) | 12,377 (1.8%) |
Long, unsigned | 703,384 | 571,182 (81.3%) | 556,835 (79.2%) | 14,977 (2.1%) |
Long, signed | 706,553 | 579,639 (82.0%) | 564,716 (79.9%) | 14,923 (2.1%) |
These results are largely consistent with the results of the original experiment. The failure rate is slightly lower, and this may point to one further possible reason why a DNS resolver may fail to re-phrase a query in TCP after making it in UDP. DNS resolvers may be configured with a maximum elapsed time to resolve a DNS name, and once this time has elapsed they may simply abort their efforts to resolve the name. With the CNAME construct we have seen some resolvers perform an additional query: when they query the original DNS name, the name server returns the CNAME record together with the A record for the CNAME target, but a small proportion of resolvers appear not to use the supplied A record, instead taking the CNAME target name and launching a second query for its A record. The increased elapsed time of this additional query may have some bearing on triggering the resolver client’s maximum resolution timer. It appears that the number of clients who would be impacted by a forced switch to TCP is of the order of 2%.
Conclusion
The original specification of the DNS called for resolvers to use UDP when the response was 512 bytes or smaller, and TCP was to be used for larger DNS transactions. DNS clients would interpret the truncated flag in a DNS UDP response to trigger a re-query using TCP.
With the introduction of EDNS0, clients can now signal their capability to accept larger UDP datagrams, with the result that the fallback to TCP for large DNS responses is used less frequently, to the extent that there is now a concern that a significant set of clients cannot resolve a DNS name if that resolution operation is required to occur using TCP.
However, DNS UDP is being used in various forms of malicious attack, using DNS queries where the response is far larger than the query. The combination of source address spoofing and DNS over UDP is presenting us with some significant issues. For that reason there is a renewed consideration of the viability of reverting to TCP for various forms of larger DNS responses, which effectively prevents source address spoofing in the DNS query / response interaction.
In this experiment we’ve looked at the impact a forced switch to DNS over TCP would have on clients. In particular, what proportion of clients would no longer be able to complete a DNS name resolution process if the process necessarily involved the use of TCP? Our measurements of a sample of 2 million clients in early August 2013 point to a DNS resolution failure rate of some 2% of clients.
The picture for individual DNS resolvers appears to be somewhat worse, in that some 17% of visible resolvers do not successfully follow up with a TCP connection after receiving a truncated UDP response.
While that 17% number is surprisingly high, there are two mitigating factors here.
Firstly, it appears that clients make use of multiple DNS resolvers in their local DNS configuration, so the failure of an initially selected resolver to respond to a query due to lack of support for TCP may be resolved by the client selecting the next resolver from its local resolver set. For this set of clients, which appears to encompass some 4% of the total client population, the penalty is increased DNS resolution time, as the resolution of a name requires the client to fail over to the other resolvers listed in its local DNS resolver set.
Secondly, the more intensively used visible DNS resolvers appear to be perfectly capable of supporting TCP-based queries, so the issues with TCP support in the DNS appear to be predominately concerned with resolvers that are used by a relatively small pool of end clients.
One issue with TCP: overhead. There’s significantly more overhead in handling TCP connections than UDP packets, especially on the nameserver side. Due to the number of clients it’s probably infeasible to have persistent connections between clients and authoritative nameservers; the server would run out of sockets. Without persistent connections, TCP connection setup is going to be a significant load on the server if everyone’s using TCP. Even locally (eg. between home computers and ISP nameservers) there are probably too many clients to keep connections open. Connection overhead is something that doesn’t show up much in a small-scale test such as what you did; there’s a major difference in the number of queries between a single ad placement and 90+% of all Web content including 90+% of all ads.
What I’d be interested in seeing is a test using a single day’s worth of actual queries to one nameserver for a popular domain, replicating that exact same set of queries (timing and all) using TCP instead of UDP, to see what the impact on the nameserver would be. I’m guessing it’ll be… non-negligible.
(Meant as a reply to comment #1 on TCP overhead, not to the article itself) What matters most of all is the change to the end client's experience, not the impact on the internal server's workload. Yes, an increased workload on the internal servers will likely be felt by the end clients, but optimizing for the internal servers' efficiency isn't the point. As it is, internal servers are efficiently (in the sense of not dealing with TCP overhead) dishing off abusive traffic to victims at the behest of sources falsifying the return address. If TCP means that the internal server now has to work harder on the requests but we stop the flood of abusive traffic, I think that's a trade the industry would like to make. But the problem is not so simple. Switching to TCP isn't just requiring more work on the internal server. And the above trade is probably not so simply stated either. It's not clear that switching over to TCP (in any sense) will cause the abusive traffic to cease - you can bet the demand to launch the abusive traffic remains - but the means of the abuse will move somewhere else. IMHO, I think that having TCP play a more significant role in DNS traffic has potential to improve the state of DNS operations. But this isn't a certainty - "potential" is the operative word in there. Studies like Geoff's help to let us know where we are today. Measuring "will it disrupt" is more important than studying the extra workload - because we already know there will be more workload; we just don't know if the extra matters, we don't know that it is the bottleneck.
My thought was that to curb abusive traffic DNS would have to switch completely to TCP. If UDP's still supported at all, the amplification attack still works. TCP does, however, successfully curb the amplification attack. It's much harder to blindly falsify a TCP connection stream in a way that gets the server to send a large amount of traffic to an arbitrary victim, and it's easier to filter bogus TCP connection traffic further upstream where there's less of a bandwidth bottleneck. I don't think you can divorce the end-client experience from the authoritative nameserver's performance. If a switch to TCP causes serious degradation on the authoritative nameservers for domains, that translates directly into end-client problems resolving names. Too much degradation and TCP becomes infeasible. I suspect the limiting factor would be network sockets rather than system performance.
If one were to enforce a rule that DNS traffic had to be over TCP, what would become harder is the reflection nature, not the amplification factor. But amplification isn't all that useful if it isn't directed at another element (the victim). Switching to TCP also does not achieve "curbing abusive traffic" but it goes a long way to making reflection harder. Making reflection harder is very useful because it enables other strategies to be used. E.g., the inherent amplification factor in negative answers (pre-DNSSEC), DNSSEC, and other places is again much less of a problem once reflection is removed from the game. You are correct in saying "you cannot divorce the end-client experience from the authoritative name server's performance." That's very true. However, this isn't about comparing a fast running server using UDP to a slower running server using TCP. The comparison is between an abused server using UDP versus a slower server using TCP. If the abuse is bad enough (and one can argue it is), the former is a worse experience than the latter. That is why I'm saying that optimizing for the name server efficiency, UDP vs. TCP, is incorrect; optimizing for the end-client is the goal.
Right. Currently, however, the reflection attack causes problems for the target, while everyone who is not the target (or on the target's network) is relatively unaffected until the attack gets so bad that the load starts to degrade the name server itself. If TCP causes degradation that affects everyone sooner, that's arguably as bad a problem.
True, but the popularity of reflection attacks means that the operators who are turned into unwilling accomplices by reflection are seeing enough volume (in these attacks) to cause problems for others, the unintended victims. This is what launched me into calling for studies of the use of TCP, knowing full well that TCP wouldn't be easy medicine to swallow. A presentation given in May 2013 at DNS-OARC: https://indico.dns-oarc.net/indico/contributionDisplay.py?contribId=18&confId=0
Yes. That's why I'd like to see what performance is like using TCP for the actual volume of queries a nameserver receives. If it can keep up, then all the other questions come into play. If it can't, though, then we have to start looking at whether additional nameservers will just have to be deployed or whether TCP just isn't feasible. I'd prefer to find the answer early on rather than after we've done all the work and are ready to go to production.
Very thorough article. I like how you covered the history of the RFCs. What if we were to run a large-scale DNS TCP day and get some large ISPs to endorse it? Similar to the IPv6 Day initiatives.
TCP has the disadvantage that, unlike UDP, it requires state maintenance at the server side. Given the rate at which DNS servers can dish out results, this burden could be quite substantial.
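The scale of that state burden can be sketched with Little's law (concurrent connections = arrival rate × average connection lifetime). All of the numbers below are illustrative assumptions, not measurements of any real name server:

```python
# Rough sketch of server-side TCP state for a DNS server, via Little's law.
# Every figure here is an assumption chosen for illustration only.
QUERIES_PER_SEC = 50_000    # assumed query load on a busy authoritative server
CONN_LIFETIME_SEC = 0.2     # assumed handshake + query + teardown time
BYTES_PER_SOCKET = 4096     # assumed kernel state per open TCP socket

concurrent = QUERIES_PER_SEC * CONN_LIFETIME_SEC
memory_mb = concurrent * BYTES_PER_SOCKET / 1e6

print(f"~{concurrent:.0f} concurrent connections")  # ~10000
print(f"~{memory_mb:.0f} MB of socket state")       # ~41 MB
```

Even with generous assumptions the memory itself is modest; the harder limits tend to be file descriptors, TIME_WAIT entries, and per-connection handshake latency, which is why measurement (as the earlier comments urge) matters more than estimation.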
My PhD thesis addressed reflector attacks (and other malicious abuses of protocols) from a protocol design perspective, and in section 7.3 I make suggestions for ways to deal with reflector attacks (and other attacks that rely on IP address forgery). One approach, already well known and mentioned in comments here, is to enforce source address validity at routing points.
A more novel suggestion is to extend IP or introduce a protocol layer between IP and TCP/UDP which performs host-to-host address verification. I refer to this as “Verified Internet Protocol” (VIP). It uses a simple cookie exchange to prove that the end-points can actually receive traffic addressed to their alleged source address. This involves an extra round trip, but the cost is only paid once per IP-address-pair for as long as the cookie is cached, whereas TCP effectively has the same cost on a per-connection basis.
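The cookie exchange described above can be illustrated with a toy model. This is only a sketch of the idea; the class and method names are invented here and do not reflect the thesis's actual VIP wire format:

```python
import secrets

class VipResponder:
    """Toy model of a VIP-style responder: serve data only after the
    initiator proves it can receive traffic at its claimed source address."""

    def __init__(self):
        self.pending = {}  # (src, dst) address pair -> issued cookie

    def challenge(self, src, dst):
        """First packet from an unverified pair: reply with a cookie only.
        The reply is tiny, so a forged source gains no amplification."""
        cookie = secrets.token_hex(8)
        self.pending[(src, dst)] = cookie
        return cookie

    def verify(self, src, dst, cookie):
        """Initiator echoes the cookie. Only a host that really received
        the challenge at address `src` can know the correct value."""
        return self.pending.get((src, dst)) == cookie

responder = VipResponder()
cookie = responder.challenge("192.0.2.1", "198.51.100.1")
assert responder.verify("192.0.2.1", "198.51.100.1", cookie)
# An attacker spoofing the victim's address never sees the challenge:
assert not responder.verify("192.0.2.1", "198.51.100.1", "guessed")
```

This is the same round-trip cost as a TCP handshake, but as the comment notes, it is paid once per cached cookie rather than once per connection.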
DNS over UDP over VIP would be an effective countermeasure to the reflector problem, and would be further enhanced through complementary measures such as source address filtering. The drawback, of course, is that it’s not a viable short term solution, as it requires a new protocol to be developed and deployed, even if that protocol is relatively trivial. Still, I hope it’s worth working towards long-term solutions such as this. There are numerous reasons why verified source addresses are a good thing, beyond their ability to ameliorate reflector attacks.
You probably can't maintain the cookie on an IP-address-pair basis. If you did, a single DNS query would render that host vulnerable to attack since the forged packets would match a valid cookie and no check would be done. You'd have to maintain it on a per-endpoint (source and destination address and port) basis with a relatively short time-out, and overhead then wouldn't be much less than for TCP (the verification would be approximately the TCP handshake sequence).
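The keying concern in this comment can be sketched as follows: a verification cache keyed on the full endpoint tuple with a short expiry, so that a forged packet matching only the address pair is not covered. This is an illustrative sketch of the commenter's argument, not any deployed protocol; all names are invented:

```python
import time

class EndpointCookieCache:
    """Verification cache keyed per endpoint tuple (addresses AND ports),
    with a short TTL, rather than per address pair."""

    def __init__(self, ttl=2.0):
        self.ttl = ttl
        self.cache = {}  # (src, sport, dst, dport) -> expiry timestamp

    def mark_verified(self, src, sport, dst, dport):
        self.cache[(src, sport, dst, dport)] = time.monotonic() + self.ttl

    def is_verified(self, src, sport, dst, dport):
        expiry = self.cache.get((src, sport, dst, dport))
        return expiry is not None and time.monotonic() < expiry

cache = EndpointCookieCache(ttl=2.0)
cache.mark_verified("192.0.2.1", 5353, "198.51.100.1", 53)
assert cache.is_verified("192.0.2.1", 5353, "198.51.100.1", 53)
# A forged packet from a different source port does not match the entry,
# so it must complete its own verification exchange:
assert not cache.is_verified("192.0.2.1", 9999, "198.51.100.1", 53)
```

With ports in the key and a short TTL, the state and round-trip overhead converge on something close to a TCP handshake, which is the commenter's point.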
I think your analysis is based on incorrect assumptions about the protocol, but I'd rather not get into a technical argument about it here. If anyone wants to discuss it, feel free to contact me ([email protected]).