C.2. Inferring Internet Denial-of-Service Activity


The paper described here was written by David Moore (of CAIDA) together with Geoffrey Voelker and Stefan Savage (both of the Department of Computer Science and Engineering at the University of California at San Diego). It was published at the 2001 USENIX Security Symposium.

The authors used a clever insight to get an overall picture of the amount and characteristics of DoS activity in the Internet as a whole.[1] They observed that many (though not all) DDoS attacks use general IP spoofing to hide the sources of the attack. An IP address is chosen more or less at random from the range of legal addresses and used as the putative source address of an attack packet. There are a vast number of attack packets, so sooner or later each possible IP address is used in some attack packet.

[1] In actuality, various researchers (including Dave Dittrich and Sven Dietrich, two of this book's authors) had been using this technique for some time to observe DDoS activity [DLD00], but, for various reasons, they were unable to publish data from their studies. See also http://seclists.org/incidents/2000/Apr/0026.html.

Further, many packet types used in typical attacks will generate a response, assuming they actually get delivered to the target of the attack. The response might merely be a packet indicating an error, but a response packet usually will get returned to the supposed sender of the attack packet. Since spoofing was used, the response does not go to the machine that sent the attack packet, but to the machine whose address was spoofed in the attack packet's source address field.

The authors of this paper realized that if one set up a network of machines that had no real users or services, and should never receive any legitimate traffic at all, any packets that it did receive would be part of some kind of attack. Some of them would be part of a DDoS attack, and, since those packets would be recognizable as response packets of various sorts (unlike the kinds of packets generated by random port scanning or worms), they could be separated out from the remainder of the traffic. They could then provide insight into which machines in the Internet were currently under attack, and perhaps insight into the character, duration, and size of the attack. The authors called the packets arriving at their test network due to responses to DDoS packets backscatter. The overall technique of inferring DDoS activity based on these packets and their characteristics is often called the backscatter technique.

There are a number of caveats that the authors themselves bring up about interpreting these results. Perhaps the three most important are that the results do not capture data on attacks that did not use generally randomized IP spoofing, that attack packets that would not tend to generate responses are not represented in the data, and that congestion and other effects certainly caused the dropping of an unknown number of attack packets and responses to those packets. All of these caveats suggest that the numbers reported here were underestimations of the actual DDoS activity, though it is impossible to know by how much. Despite these shortcomings, this study represents the best data we have available on overall prevalence of DDoS attacks in the Internet. All other available data measures DDoS activity only on a small portion of the Internet.

CAIDA, an organization devoted to measuring important characteristics of the Internet, had a large space of unused network addresses that could be configured for this purpose, so the authors of this paper performed a large study of the traffic received at these addresses, analyzing the data in various ways to obtain insight into the DDoS phenomenon. Their data, gathered over the course of three weeks in 2001, remains the best large-scale, Internet-wide description of DDoS attacks.

Full details of the study and its results can be found in [MVS01], but we will repeat their major results here. Over the three-week period, they observed 12,805 separate attacks on more than 5,000 different targets in more than 2,000 DNS domains. The largest attack they observed contained more than 600,000 packets per second, an immense number of packets that could not be handled by most machines in the Internet.

Table C.2. DDoS attack distribution by protocol, from the paper "Inferring Internet Denial-of-Service Activity"

Protocol      Percentage of attacks    Percentage of attack packets
TCP           91.8                     66.00
UDP            3.3                      0.25
ICMP           2.3                     33.66
Protocol 0     2.2                      0.06
Other          0.4                      0.03

The study's traces were gathered in three segments, each lasting around one week. The researchers typically observed 20 or more attacks per hour. During one hour-long period, they observed 150 attacks; in that case, most of the targets were on a common network.

The technique also allowed the researchers to deduce the type of packets used in the attack, by characterizing the type of response. Around 60% of all responses were TCP, some being SYN ACKs (probably indicating a SYN attack), some being TCP RSTs or RST ACKs (probably indicating a TCP-based attack that sent unexpected TCP packets), and a few being some other kind of TCP packets. Thirty-seven to forty percent of the responses were some kind of ICMP packet. The largest-volume attacks seemed to generate ICMP TTL exceeded responses. The authors were unable to identify exactly what mechanism was being exercised in these attacks.
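The mapping from response type to probable attack type can be sketched in a few lines of Python. This is only an illustration of the reasoning in the paragraph above, not the authors' classifier: the function name infer_attack, its parameters, and the label strings are all hypothetical, and the only protocol facts assumed are the standard ICMP type numbers (0 for echo reply, 3 for destination unreachable, 11 for time exceeded).

```python
# Illustrative sketch only: the function name, arguments, and labels are
# assumptions made for this example, not taken from [MVS01].

def infer_attack(protocol, tcp_flags=frozenset(), icmp_type=None):
    """Guess what kind of attack packet elicited one observed response."""
    if protocol == "tcp":
        flags = frozenset(tcp_flags)
        if flags == {"SYN", "ACK"}:
            # The victim answered a spoofed SYN.
            return "probable SYN attack"
        if "RST" in flags:
            # RST or RST ACK: the victim rejected an unexpected TCP packet.
            return "TCP attack using unexpected packets"
        if flags == {"SYN"}:
            # A bare SYN is a request, not a response, so it is more
            # likely a scan or worm probe than backscatter.
            return "not backscatter"
        return "other TCP response"
    if protocol == "icmp":
        if icmp_type == 11:   # time (TTL) exceeded
            return "unknown mechanism (TTL exceeded)"
        if icmp_type == 0:    # echo reply, i.e. an answer to a ping
            return "probable ICMP echo flood"
        if icmp_type == 3:    # destination unreachable
            return "attack on an unreachable host or closed port"
        return "other ICMP response"
    return "unclassified"


print(infer_attack("tcp", {"SYN", "ACK"}))
print(infer_attack("icmp", icmp_type=11))
```

A real analysis would work from captured packet headers rather than hand-built arguments, and would need more categories than the handful shown here.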

From this and other data, the authors were able to deduce the protocol used in the attack. Table C.2 (derived from Table 5 in their paper) summarizes these results.

Thus, while the majority of both attacks and attack packets are TCP-based, other protocols are in use for attacks. For this data, particularly, one must bear in mind that the study does not capture information on DDoS attack packets that do not generate responses. So, for example, if a particular UDP streaming protocol does not cause the target to generate a response denying improper packets or querying the source, all attack packets using that protocol would be unaccounted for by this methodology. So this data cannot tell us the frequency of DDoS attacks that use packets that do not generate responses.

The backscatter technique also allowed the researchers to determine the duration of attacks. The methods they used to deduce durations (which are related to the methods used to determine when attacks started and stopped, which in turn were used to count overall number of attacks) are described in the paper. The results, in brief, were that most attacks were short. Fifty percent lasted less than 10 minutes, 80% less than 30 minutes, and 90% less than one hour. But there were some very long attacks. One percent of the attacks lasted more than 10 hours, and dozens of attacks went on for several days. These results should suggest to you the danger you are in if you cannot respond to a DDoS attack. It need not stop on its own, unless the attacker wants it to. For practical purposes, if you cannot stop it, you are at the attacker's mercy indefinitely.
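One simple way to turn a stream of backscatter timestamps from a single target into attack start and stop times is to split the stream wherever a long quiet gap occurs. The Python sketch below takes that approach; the 300-second gap threshold, the function names, and the sample timestamps are assumptions chosen for illustration, and the paper should be consulted for the method the authors actually used.

```python
# Illustrative sketch: the gap threshold and sample data are assumptions.

def split_into_attacks(timestamps, gap_timeout=300.0):
    """Group packet arrival times (in seconds) into (start, end) attacks.

    A new attack begins whenever more than gap_timeout seconds pass
    without any backscatter from the target.
    """
    attacks = []
    for t in sorted(timestamps):
        if attacks and t - attacks[-1][1] <= gap_timeout:
            attacks[-1][1] = t          # extend the current attack
        else:
            attacks.append([t, t])      # start a new attack
    return [(start, end) for start, end in attacks]


def fraction_shorter_than(attacks, limit_seconds):
    """Fraction of attacks whose duration is below limit_seconds."""
    if not attacks:
        return 0.0
    short = sum(1 for start, end in attacks if end - start < limit_seconds)
    return short / len(attacks)


times = [0, 100, 250, 1000, 1100, 5000]   # made-up arrival times
attacks = split_into_attacks(times)
print(attacks)
print(fraction_shorter_than(attacks, 200))
```

The choice of threshold matters: a short gap splits one intermittent attack into many, which inflates the attack count and shortens the measured durations, while a long gap merges separate attacks on the same target.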

The researchers were also able to shed some light on what kinds of sites were suffering attacks. The response packets they received had the target's IP address in the source field, allowing the researchers to attempt a reverse DNS lookup on that address.[2] Not all addresses resolve under reverse lookups, and a bit less than 30% of the addresses the researchers observed proved not to be resolvable in this way. Of those that were resolvable, many appeared to be home machines using dial-up or DSL connections (deduced by particular strings appearing in the names of these machines). Two to three percent of the attacks were targeted at name servers of various types, and 1 to 3% of the attacks were targeted at routers. Thus, attacks on the infrastructure are a reality, and are not limited to the single case of the October 2002 attacks on the DNS root machines.
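The lookup-and-classify step might look roughly like the following Python sketch. The call socket.gethostbyaddr is the standard library's reverse lookup; everything else is assumed for illustration, including the substrings taken to indicate a dial-up or DSL host, the category labels, and the offline_resolver stub (which stands in for real DNS so the example runs without network access). The strings the authors actually matched are not given here.

```python
import socket

# Assumed substrings that hint at a home dial-up or DSL connection.
ACCESS_HINTS = ("dial", "dsl", "ppp", "modem")


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the host name for ip, or None if it does not resolve."""
    try:
        return resolver(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return None


def classify_target(ip, resolver=socket.gethostbyaddr):
    """Label a victim address using hints in its reverse DNS name."""
    name = reverse_lookup(ip, resolver)
    if name is None:
        return "unresolved"
    lowered = name.lower()
    if any(hint in lowered for hint in ACCESS_HINTS):
        return "home/access"
    return "other"


def offline_resolver(ip):
    """Hypothetical stand-in for DNS, using documentation addresses."""
    table = {
        "192.0.2.10": "dsl-42.example.net",
        "192.0.2.20": "www.example.com",
    }
    if ip not in table:
        raise socket.herror(1, "Unknown host")
    return (table[ip], [], [ip])


for address in ("192.0.2.10", "192.0.2.20", "192.0.2.30"):
    print(address, classify_target(address, offline_resolver))
```

Name-based classification of this kind is only a heuristic, since operators are free to name machines however they like.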

[2] Unlike a normal DNS lookup, which translates a human-readable name to an IP address, a reverse DNS lookup translates an IP address to a human-readable name. The DNS service supports both forms of lookup.

The top-level domains (TLDs) of the machines under attack showed a broad range of targets, with machines in the .net and .com domains receiving a lot of attacks. This analysis did turn up the surprising result that machines in Romania (identified by the .ro TLD name) were attacked nearly as often as .net and .com machines, and that Brazilian machines (with the .br TLD name) were also very commonly targeted. The authors did not have a good explanation for the frequency of attacks on targets in these countries.

The full paper provides many other interesting measurements. One more we will comment on here is the degree to which particular nodes were subjected to multiple attacks. The study found that nearly two thirds of the nodes attacked were attacked only once in their traces. Nearly a fifth of the nodes were attacked twice. However, a few unfortunate nodes were attacked dozens of times, and one unlucky node received more than 100 attacks. The lesson is that a determined adversary may return to his attack, even if you take temporary measures to stop him, as has frequently been seen in practice. The best way to minimize damage from repeated attacks is to develop well in advance the monitoring and response strategies (such as mentioned in Chapter 6) that you can easily engage over and over again. While automated defenses may fend off simple attacks, defeating a determined attacker who repeatedly targets your network will require all the skill and attention of your operations staff.

Once the technique of backscatter was described, other entities began to perform their own studies using the technique. CAIDA was particularly well positioned to perform this study, as they own a lightly utilized /8 network that represents 1/256 of the total IP address space. Most others running such experiments (an exception being the University of Michigan's Internet Motion Sensor; see http://ims.eecs.umich.edu/) have far fewer addresses that they can configure to receive backscatter traffic, increasing the noise in their measurements.
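The arithmetic behind the importance of monitor size can be sketched as follows. If an attacker spoofs source addresses uniformly at random, each response lands in a monitored block with probability equal to that block's share of the address space, which is 1/256 for a /8. The Python below works through that reasoning under simplifying assumptions that should be stated loudly: spoofing is uniform, every attack packet elicits exactly one response, and no responses are lost. The function names are hypothetical.

```python
# Illustrative sketch: assumes uniformly random spoofing, one response
# per attack packet, and no packet loss.

TOTAL_IPV4 = 2 ** 32


def monitored_fraction(prefix_len):
    """Share of the IPv4 address space covered by one /prefix_len block."""
    return 2 ** (32 - prefix_len) / TOTAL_IPV4


def estimate_attack_rate(observed_pps, prefix_len=8):
    """Scale an observed backscatter rate up to an estimated attack rate."""
    return observed_pps / monitored_fraction(prefix_len)


def prob_attack_seen(num_attack_packets, prefix_len=8):
    """Chance that at least one response reaches the monitored block."""
    miss = 1.0 - monitored_fraction(prefix_len)
    return 1.0 - miss ** num_attack_packets


print(estimate_attack_rate(10, prefix_len=8))    # /8 monitor
print(prob_attack_seen(1000, prefix_len=8))
print(prob_attack_seen(1000, prefix_len=16))     # much smaller monitor
```

Under these assumptions, 10 backscatter packets per second seen on a /8 correspond to an attack of about 2,560 packets per second. A /16 monitor sees 256 times fewer packets from the same attack, so short or low-rate attacks are far more likely to go unnoticed and rate estimates rest on far fewer samples.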

Relatively little other backscatter data has been reported. One recent paper [GMP04] discussed applying spectral analysis statistical techniques to backscatter data to determine a number of characteristics. In particular, this study looked at determining the actual start and end times of attacks, looking only at TCP SYN ACK backscatter packets. They did not concentrate on providing the same kind of analysis as the earlier paper, so the results are not directly usable to confirm the findings of [MVS01].


