Common DoS Attack Techniques

DoS attacks have changed over the years as attackers have adapted to changes in technology and the defenses put in place. In the early days of the World Wide Web, when users began to connect systems to the Internet in large quantities, the DoS attacks that gained popularity and notoriety were those that exploited off-the-shelf (OTS) software vulnerabilities (we include freeware, open-source, and commercial software in this definition). OTS vulnerabilities are actual bugs (also known as "features") in the software or protocol that leave an opening an attacker can exploit. Over time, most of the bugs in the network stacks of operating systems have been fixed, and mitigations or replacements have solved protocol issues. This left attackers needing to find new areas to explore when attempting to deny service.

Attacks on the Internet today most often focus on overwhelming the capacity of a site with large numbers of requests or on hogging limited resources. They take advantage of a fundamental truth of the Internet architecture: no web site or server farm could handle the traffic if every client on the Internet attempted to access it simultaneously.

Old School DoS: Vulnerabilities

The common thread of most attacks during the early age of the Internet was that they took advantage of the network stack, the software code an operating system uses to process network traffic. Each layer of the stack handles a different layer of the traffic. Attacks took advantage of the fact that the authors of operating system stacks expected systems to follow the protocol spec when communicating. Vulnerabilities typically come about when assumptions are made about how the traffic will appear: the programmer expects the data to look one way or the processing to occur in one fashion, and the attacker presents things differently. Here is a selection of old vulnerabilities that can still occasionally be seen but have almost all been fixed by modern operating systems:

  • Oversized Packets   One of the earliest DoS attacks. The most common form is the "ping of death" attack, ping -l 65510 192.168.2.3 on a Windows system (where 192.168.2.3 is the IP address of the intended victim). Another example is jolt.c, a simple C program for operating systems whose ping commands won't generate oversized packets. The main goal of the ping of death is to generate a packet whose size exceeds 65,535 bytes, which caused some operating systems to crash in the late 1990s.

  • Fragmentation Overlap   By forcing the operating system to deal with overlapping TCP/IP packet fragments, many systems suffered crashes and resource starvation issues. Exploit code was released with names like teardrop.c, bonk.c, boink.c, and nestea.c.

  • Self-referenced Packet Loops   This approach used TCP/IP packets with the victim's IP address in the source field as well as in the destination field (these went by the names Land.c and LaTierra.c).

  • Nukers   These attacks were related to a Windows vulnerability of some years ago that sent out-of-band (OOB) packets (TCP segments with the URG bit set) to a system, causing it to crash. This attack became very popular on chat and game networks for disabling anyone who crossed you.

  • Extreme Fragmentation   TCP/IP traffic can, by its nature, be fragmented into segments as determined by the sender. By setting the maximum fragmentation offset, the destination computer or network infrastructure (victim) can be made to perform significant computational work reassembling packets. The jolt2.c attack was based on sending a stream of identical packet fragments.

  • Combos   To save time figuring out which of the myriad different malformed packets a victim might potentially be vulnerable to, some hackers cobbled together scripts that simply blasted a target with all types of known DoS exploits, in many cases leveraging the canned exploits we've just covered (jolt, LaTierra, teardrop, and so on). We've used combo tools like targa and datapool effectively in the past (against authorized targets, of course!).

As we noted in our introduction to this chapter, most if not all of these vulnerabilities have been patched for several years now, and for the time being, it doesn't look like this flavor of DoS will re-emerge as a serious threat anytime soon. Unfortunately, as we will see in the next sections, malicious hackers have more effective DoS techniques to turn to.

Tip 

To download the tools above and many more like them, try http://www.antiserver.it/Denial-Of-Service/.

Modern DoS: Capacity Depletion

As operating system designers got smarter and the protocols that run the Internet became better tested and more standardized, it became harder and harder for hackers to find vulnerabilities or systems that had not been patched against network stack issues. Since they were not about to give up the fun of attacking networks and taking down Web sites, they moved from attacks that confused and crashed the operating system to attacks that simply made the network or servers work too hard.

All Web sites are designed around a certain level of capacity: the hardware, software, and network links dictate how much traffic the site can support. Take as an example a Web site with one server, supporting 100 simultaneous sessions, connected over a T1, a 1.544 Mbps link. If an attacker creates 100 sessions connected to the server, then it will not be possible for a valid user to reach the server, and hence service is denied. If the attacker instead generates 1.544 Mbps of random traffic and fills up the network connection, no traffic from a valid user will reach the site, or it will do so incredibly slowly.
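The arithmetic in this example can be sketched as a toy model. All numbers come straight from the scenario above; the function name and resource units are our own for illustration.

```python
# Toy model of the capacity example above: one server behind a T1 link.
T1_KBPS = 1544           # total link capacity (1.544 Mbps)
MAX_SESSIONS = 100       # server's simultaneous-session limit

def remaining_capacity(attack_sessions, attack_kbps):
    """Return (sessions, kbps) left over for legitimate users."""
    return (max(0, MAX_SESSIONS - attack_sessions),
            max(0, T1_KBPS - attack_kbps))

# Holding all 100 sessions denies service even with modest bandwidth:
print(remaining_capacity(100, 500))    # -> (0, 1044)
# Filling the pipe works too, even with only a handful of sessions:
print(remaining_capacity(10, 1544))    # -> (90, 0)
```

Either resource hitting zero is sufficient: the attacker only needs to exhaust the scarcer of the two.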

Although the final effect is roughly the same, attacks on infrastructure like network devices, servers, and off-the-shelf server software have historically been more common, since attackers obviously get more bang for the buck by bringing down widely deployed technology. More recently, customized attacks on unique application logic (such as Google's search algorithm) have been seen in the wild and are sure to become more common as infrastructure becomes better hardened and attacks on it more difficult.

The basic approach of capacity depletion DoS is to simply blast a high volume of traffic at the target, usually with the following twist: since the effect of brute packet-blasting is self-limited by the attacker's own capacity, hackers have to exploit weaknesses at the target or within the TCP/IP protocols themselves to magnify the effect of their floods and thus create resource consumption asymmetry with the target. In simple language, the attacker attempts to use few of their own resources to trigger massive resource consumption at the target. In this section, we'll discuss some of the clever mechanisms most commonly used by attackers to achieve this amplification effect.

SYN Floods

SYN floods are the simplest and most common form of network DoS attack. The attack sends a flood of SYN packets (the initiatory packets for TCP connections) to initiate connections to the remote service. The purpose of the flood is twofold: the first goal is simply to use up the downstream bandwidth of the site being attacked. A web site hosted on a T1 connection has a bandwidth of 1.544 Mbps; if the site is receiving a flood of SYN packets using up 1.250 Mbps, valid users will have to squeeze by with the remaining 0.294 Mbps, slowing them down to a crawl.

Here's the "twist" that amplifies the DoS effect: the second goal is to use up the connection handling capacity of the target server(s). Servers typically allocate a TCB (transmission control block) to store information about the connection (source and destination ports and addresses); this is a structure stored in the server's memory. Server memory is a finite resource and enough connections can potentially use up all the available memory or cause the system to start rejecting connections to prevent memory from running out, both of which serve the attacker's purpose.

Since SYN flood packets don't require a response to be effective, SYN floods are typically implemented using spoofed or random source IP addresses, making it difficult to trace them back to the perpetrator. A TCP SYN packet is also the smallest valid TCP packet that can be sent, requiring little processing or memory usage on the part of the attacker. SYN packets are also very common, being one of the building blocks of TCP communication. Because every connection needs SYN packets to initiate communications, the malicious SYN packets cannot be easily filtered without preventing all connections, even legitimate ones. Luckily, SYN floods are easy to detect and can be absorbed if enough bandwidth is available, or they can be filtered using techniques and/or products we'll outline later in the section on DoS countermeasures.
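To see why a SYN is the smallest valid TCP segment, consider the fixed 20-byte header layout from RFC 793. The sketch below builds one with Python's struct module purely for illustration; the checksum is left as a zero placeholder (a real stack computes it over a pseudo-header), and nothing is sent on the wire.

```python
import struct

def tcp_syn_header(src_port, dst_port, seq=0):
    """Build a minimal 20-byte TCP header with only the SYN flag set.
    Illustrative only: checksum is a zero placeholder and no packet
    is actually transmitted."""
    data_offset = 5 << 12        # header length: 5 32-bit words, no options
    flags = 0x02                 # SYN bit
    return struct.pack('!HHIIHHHH',
                       src_port, dst_port,
                       seq,                 # sequence number
                       0,                   # acknowledgment number (unused)
                       data_offset | flags,
                       65535,               # advertised window
                       0,                   # checksum (placeholder)
                       0)                   # urgent pointer

hdr = tcp_syn_header(12345, 80)
print(len(hdr))   # -> 20: a complete TCP segment with no payload at all
```

Twenty bytes of TCP plus a 20-byte IP header is all an attacker must generate per connection attempt, while the victim must allocate a full TCB in response, which is the asymmetry described above.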

One of the earliest well-known SYN flood attacks occurred against the Web hosting company WebCom back in 1996. The attack repeated the pattern of the first documented DoS attack against Panix.com earlier in the year, and shortly after Phrack and 2600 had published articles on the technique. During the attack a compromised computer at Malaspina University-College in British Columbia, Canada, sent SYN packets at an estimated rate of 200 packets per second against the hosting server. For a period of 40 hours the sites hosted by the server were essentially unavailable as the company and ISPs attempted to trace the attack.

There are dozens of tools in common use to generate SYN floods: stand-alone tools like juno and flood2.c, as well as collections like Trinoo and Stacheldraht. Most tools use raw packet libraries that allow the quick assembly of packets, the forging of any field, and sending via raw sockets to accelerate attacks. Microsoft has taken steps against this by disabling raw sockets in Windows XP Service Pack 2. Removing native support from the operating system makes it more difficult (though not impossible) for attackers to write and use tools on zombie machines that have been patched.

UDP Floods

UDP flooding can be implemented in a couple of ways. The most obvious is to simply send a stream of UDP packets to a listening UDP service on the victim system. Since UDP lacks the overhead of its cousin TCP, it's sometimes possible for a single host to generate enough UDP traffic to overwhelm other systems or networks.

The other UDP flooding mechanism more properly demonstrates the amplification effect of DoS. In this version, a flood of UDP packets is sent to a port that is not listening. In response, the "drone" server sends back an ICMP error message. By sending traffic from a spoofed IP address, a stream of ICMP messages from the drone box can be directed against the spoofed target. The amplification effect is achieved by flooding numerous servers with UDP packets containing source IPs with the victim's address, resulting in an ICMP flood of the victim server from the other drones.
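The reflection mechanics just described can be captured in a simple rate model. The function name and all figures below are illustrative, not measurements; the reply ratio models drones that rate-limit their ICMP errors.

```python
# Simplified model of the reflected UDP flood described above.  The attacker
# sprays spoofed UDP packets at closed ports on many "drone" servers; each
# drone answers the forged source address (the victim) with an ICMP
# port-unreachable error.

def icmp_flood_rate(drones, pps_per_drone, reply_ratio=1.0):
    """ICMP packets/second arriving at the victim.
    reply_ratio < 1.0 models drones that rate-limit ICMP errors."""
    return int(drones * pps_per_drone * reply_ratio)

# 50 drones each reflecting 200 pps, with 80% of errors actually generated:
print(icmp_flood_rate(50, 200, 0.8))   # -> 8000 pps at the victim
```

The attacker's upstream cost is spread across many drones, while the entire reflected stream converges on the one spoofed address.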

As with SYN floods, UDP floods can be spoofed to make it hard to identify the source.

Smurf and Fraggle

The smurf and fraggle attacks highlight a more basic amplification effect, either by causing multiple computers to respond to the same packet or by causing an application service to generate traffic targeting another server.

Smurf abuses the ICMP protocol to generate a flood of packets from an intermediate network against a target. The attacker generates an ICMP message with a spoofed source (the machine to be attacked) and a destination of the broadcast address of the intermediate network. When the packet arrives at the intermediate network, each of the hosts on the network will respond with a reply to the target. This means one packet will generate many packets: voilà, amplification.
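The smurf amplification factor is just the number of responding hosts on the amplifier network. A back-of-the-envelope sketch (all figures invented for illustration):

```python
# One spoofed ICMP echo request to a broadcast address draws a reply from
# every responding host on the intermediate ("amplifier") network.

def smurf_amplification(hosts_on_network, attacker_pps):
    """Packets/second hitting the victim for a given attacker spray rate."""
    return hosts_on_network * attacker_pps

# A /24 with ~250 live hosts turns 100 pps from the attacker into:
print(smurf_amplification(250, 100))   # -> 25000 pps at the victim
```

This is why the classic countermeasure was for networks to stop answering directed broadcasts: it drops the multiplier to zero.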

Fraggle takes advantage of two daemons running on most UNIX hosts, chargen and echo. The attack sends each daemon spoofed packets bearing the other daemon's address and port as the source. This creates a connection between the two that continuously generates a stream of characters from chargen and then echoes the traffic back when it reaches the other daemon. It operates in the same fashion as the self-referenced packet loop attack described in "Old School DoS: Vulnerabilities" earlier in this chapter, except the traffic is sent to another machine rather than to the same machine.

Distributed DoS (DDoS)

Distributed denial-of-service (DDoS) attacks are the latest take on capacity attacks, with one key difference: the amplification effect is achieved by directly controlling a large army of machines to flood one or more targets. They have received a great deal of mention in the press (most prominently the February 2000 DDoS attacks that disrupted Amazon.com, Buy.com, eBay, E*trade, Yahoo!, and others), and are typically the ones that create the most damage.

So how does a DDoS attack work? The first thing that's required is a large number of systems on the Internet that have been compromised by a malicious attacker, either directly or, more commonly, via malware such as a virus or worm. The compromised hosts run a piece of software that either:

  • Allows someone to remotely control the victim machine; or

  • Is preprogrammed to perform some sort of coordinated attack (for example, the Win32.Blaster worm was preprogrammed to launch a DoS attack against Microsoft.com in August 2003).

These compromised machines, also called zombies or bots (short for robots, a term applied to automated Internet Relay Chat software agents), often register themselves by connecting to an IRC channel. A malicious hacker then joins the channel and issues commands to the zombies/bots. Often, layers of master control servers (themselves compromised to further launder connections) may be used to control the infected zombies/bots. Figure 11-1 illustrates a common DDoS attack setup, showing how a single attacker can orchestrate thousands of machines in a coordinated attack against one or more sites.


Figure 11-1: A common distributed denial-of-service (DDoS) attack configuration

It is widely known that there are so-called bot "armies" or botnets available on the Internet today that can be leveraged to perform such attacks. There is even evidence that such bot armies are being bartered among the attack community at commodity rates. Some estimates of the extent of some botnets exceed a million machines. Some simple math illustrates that even a mere dribble of traffic orchestrated across so many machines could bring down just about any site on the Internet today. DDoS remains a loaded gun pointed at the Internet, waiting to go off to the misfortune of one or many online businesses.

Note 

More information on common bot software, how clients are infected, and how these infections spread can be found in Chapter 10.

Application-Layer DoS

As denial-of-service attacks targeted at infrastructure have become more common, administrators have done more work to protect against these attacks and mitigate them as best as possible. Consequently, attackers have traveled further up the network stack to attack applications themselves. In contrast to infrastructure (which by our definition includes common, not necessarily commercial, off-the-shelf (COTS) technology, such as the networking devices that connect the site to the Internet, the operating systems that host the Web server software, the Web server software itself if it is a COTS product like IIS or Apache, and even potentially COTS modules like news forum or Web guestbook packages), we consider application-layer components to be anything that is unique or custom to a particular site or application. For example, Google's search engine logic would be considered application-layer.

The typical dynamic Web application is based on a three-tier architecture: a presentation layer, usually composed of static content (images, files); a middle tier, often an application server hosting business logic and processing dynamic content; and a data tier, made up of databases, LDAP directories, and so on. The more tiers involved in handling a request, the longer it takes and the more resources are consumed. A request to download an image requires only some basic processing by the Web server. A dynamic page that, for example, performs a calculation on data provided by the user requires resources on the Web server and application server as the application code processes it and generates a result. Finally, a request that requires data retrieved from a datastore uses the resources of all three tiers. By their very nature, the more tiers a request uses, the more resources are consumed and the fewer users the infrastructure and application can support. For example, a small Web application might be able to support 100 simultaneous static requests, 20 dynamic requests, or 10 deep requests that pull data from a database.
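The tiered-capacity figures quoted above can be restated as a tiny resource-budget model. The per-request costs below are reverse-engineered from the example numbers (100 static / 20 dynamic / 10 database-backed) and are purely illustrative.

```python
# Rough model: deeper requests consume proportionally more of a fixed
# resource budget, so fewer of them can be served concurrently.
CAPACITY = 100                     # abstract resource units for the site
COST = {'static': 1, 'dynamic': 5, 'database': 10}

def max_concurrent(kind):
    """How many requests of this kind fit in the budget at once."""
    return CAPACITY // COST[kind]

for kind in COST:
    print(kind, max_concurrent(kind))
# static 100, dynamic 20, database 10 -- matching the figures in the text
```

An attacker who steers all traffic toward the most expensive request type reduces the site's effective capacity by an order of magnitude without sending any more packets.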

Much like a burglar will case a house before a break-in, attackers will case an application looking for resource-intensive pages. These pages often have long load times or perform complicated processing tasks. Typical examples include search pages that work on un-indexed content, pages that return database content resulting from multiple table joins (a database task that is often very resource intensive), and encryption handling. One of the most common errors Web applications make is accepting arbitrarily long input when performing encryption. This allows an attacker to supply large amounts of input that must be processed by computation-heavy encryption routines.

The resources that applications use and attackers will try to consume are processor, memory, storage, and shared resources like database connections, files, user logins, or other application resources (RPC, network ports, threads, sessionID, etc.). Let's look a little closer at how these are exploited in a DoS/DDoS attack.

Processor   Processor usage in Web apps is most frequently tied up by long mathematical computation tasks, encryption or decryption of data (specifically public key cryptography, which is much more intensive than symmetric encryption), and complex textual searches.

Memory   Just about every operation performed by a Web application requires memory. Operations that receive arbitrary-sized data from the user, another service, or the database are especially vulnerable to using up excessive amounts of memory. Actually running out of memory is rare in these days of virtual memory, but significant performance hits and slowdowns are a frequent occurrence.

Database Connections   To improve scalability, most Web applications use a connection pool to allow multiple threads to share a limited number of connections to the database. These pools are implemented by the most common database access APIs, ODBC and JDBC. Requests that use the database tie up these limited connections, and transactions that involve complex locking and resource handling are especially prone to doing so.

A good example of this is a multistep purchase or user registration spread over a number of Web pages. A Web application may add a new row to the database when the user submits the first page, lock it, and then update it as requests from subsequent pages arrive, until the final submission page, where the record is considered complete and the lock can be removed. If the lock is only on the row containing the record, it is likely that other transactions can be performed concurrently. On the other hand, complex transactions may entail locks being held on multiple resources, preventing concurrent transactions or causing resource starvation. If such transactions are accessible to unauthenticated users, they are easy for attackers to exploit, and response options like account deactivation are limited.

User Login   Applications that implement their own login functionality and support user lockout can allow attackers to brute-force usernames and lock out large numbers of users. The same threat arises where companies use a predictable naming scheme, publish a corporate directory, or are exposed by a disgruntled employee. If the application makes use of a third-party authentication system like RADIUS, TACACS+, etc., brute-forcing of logins may tie up the authentication system, preventing regular users from logging in. Some Web applications make it very easy to create new user accounts; an attacker may try brute-forcing account creation to make it difficult for new users to register with the application. This also takes up space in the database or wherever the user accounts are stored, and if limits are set, this may block all new user creation.

Note 

Discussion of mitigations for each of these categories can be found next.

Now that we've looked at some of the ways that DoS conditions can be created, let's look at some concrete examples.

Google July 2004 DDoS

Popularity: 3
Simplicity: 3
Impact: 6
Risk Rating: 4

 Attack    A great example of an application-layer DDoS attack is the Google MailTo: denial-of-service attack of July 2004. The MyDoom-O worm used Google and other search engines to spread, querying them for e-mail addresses they had found while crawling the Internet. The worm spread by sending a copy of itself to every e-mail address identified. As the worm spread, more and more queries for e-mail addresses slowed the service to a crawl and denied service to many users. While the worm was not targeting the search engine itself, an attacker probing a search engine like Google or a similarly complex 3-tier application would find that sending typical queries takes x milliseconds to return a result. Sending a less common or more complex query might take 2x milliseconds, and a really complex query might take 4x milliseconds.

Graphing response times for a number of queries would yield a three-humped distribution curve like that shown in Figure 11-2. Analyzing the results in light of the typical 3-tier Web application architecture, the attacker would conclude that the system uses two levels of indexes (caches, really) before reaching the final data tier. Knowing this, the attacker knows that a query that misses each index takes up far more resources than one that hits the first index. The indexes are in place to limit the number of queries that need the full resources of a "deep" query. In contrast, common search queries like "Britney Spears" would hit the first index and return a result immediately.


Figure 11-2: The three-humped distribution graph that might result from analyzing Web search engine query results.
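The timing analysis above can be sketched as a simple classifier that buckets observed latencies into cache tiers. The x / 2x / 4x thresholds mirror the text; the baseline value and sample latencies are invented for illustration.

```python
# Bucket observed query latencies into the three "humps" described above.
X = 50  # assumed baseline response time in ms for an index-1 hit

def tier(latency_ms):
    """Guess which tier served the query from its response time."""
    if latency_ms < 2 * X:
        return 'index-1 hit'        # common query, first cache
    if latency_ms < 4 * X:
        return 'index-2 hit'        # less common query, second cache
    return 'deep query'             # missed both indexes, full data tier

samples = [40, 55, 110, 95, 210, 45, 230]   # invented probe results (ms)
print([tier(t) for t in samples])
```

An attacker running probes like these is looking for query shapes that land reliably in the 'deep query' bucket.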

The attacker would then focus on finding a way to force all their queries to miss the first two indexes and so use up the most resources. If they could come up with an easy way to force all queries to miss the indexes, they could send a series of such requests (potentially only a few if the third tier were exceedingly compute-intensive!) and prevent the application from responding to legitimate requests.

phpBB DoS Vulnerabilities

Popularity: 3
Simplicity: 3
Impact: 6
Risk Rating: 4

 Attack    For an example of a large, complex Web application DoS vulnerability, let's take a look at phpBB. phpBB is a popular bulletin board service, an open-source project running on a choice of database platforms (MySQL, PostgreSQL, or Access/ODBC). As the project has evolved, attackers and security testers have found numerous denial-of-service vulnerabilities.

In 2002, a vulnerability was discovered in the BBCode functionality that the BBS implemented. BBCode is a simplified markup language (a reduced form of HTML) that the BBS provides to give users greater control over the formatting of their posts without allowing them unrestricted use of HTML. Security testers discovered that the use of nested tags would trigger a bug in the application.

An attacker could submit:

 [code] 
 [code]\0\0[/code] 
 [code]\0\0[/code] 
[/code]

This would be processed by functions.php, which would expand it to:

 [1code] 
 [1code]\0\0[/code1][1code]\0\0[/code1] 
 [1code]\0\0[/code1][1code]\0\0[/code1] 
[/code1][1code]
 [1code]\0\0[/code1][1code]\0\0[/code1] 
 [1code]\0\0[/code1][1code]\0\0[/code1] 
[/code1]

The more \0 characters between the code tags, the more copies of [1code][/code1] and the more \0s within each set of tags when processed. To cause the process to spin on the CPU, the attacker could, instead of \0, submit:

 [code] 
 [code]\0[code]\0[code]\0[/code]\0[/code]\0[/code] 
[code]
 [code]\0[code]\0[code]\0[/code]\0[/code]\0[/code] 
[code]
 [code]\0[code]\0[code]\0[/code]\0[/code]\0[/code] 
[/code]
 [code]\0[code]\0[code]\0[/code]\0[/code]\0[/code] 
[/code]
 [code]\0[code]\0[code]\0[/code]\0[/code]\0[/code] 
[/code]

With code tags containing \0 now embedded inside the original code tags, these tags recursively get expanded and then expanded again ad infinitum. This bug would corrupt the database, preventing future writes, and cause the application process to spin and use up memory, driving CPU utilization to 100 percent. As a result of the attack, the Web server process would need to be restarted to clear its state, and the database would have to be repaired before the application was usable once again.
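The growth pattern behind this blowup can be modeled very crudely: each processing pass rewrites every tag pair into multiple new pairs, so the working text grows geometrically. This mimics only the growth dynamics; it is not the actual phpBB functions.php logic, and the multiplier is an assumed value.

```python
# Highly simplified model of the nested-BBCode expansion described above.

def expansion_size(tags, passes, copies_per_tag=2):
    """Number of tag pairs after repeated expansion passes, assuming each
    pass rewrites every pair into copies_per_tag new pairs."""
    for _ in range(passes):
        tags *= copies_per_tag
    return tags

for p in range(6):
    print(p, expansion_size(3, p))   # 3, 6, 12, 24, 48, 96 tag pairs
```

A few bytes of carefully nested input thus translate into an exponentially growing amount of parsing work, which is exactly the resource asymmetry a DoS attacker wants.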

In 2005, three new issues showed up on the radar for phpBB. The first is a CPU denial of service caused by wildcard-only searches. The search engine provided by the bulletin board service indexes content longer than three characters; attackers found that by issuing wildcard queries or queries of only one or two letters, it was possible to use up significant CPU resources. Search queries for terms like "aa" or "ab" did not hit the index and as a result caused a major performance hit on the application.

The second issue was an exploit that allowed arbitrary scripts to be uploaded and executed on the server using phpBB. This exploit allowed phpBB to be turned into a zombie and used as a DoS platform much like the worms just described. The final resource-consumption attack found is actually more of a configuration issue than an actual design flaw. The phpBB software provides a CAPTCHA-style requirement for users creating logins; however, if the setting is not turned on, an attacker can very easily generate accounts in an automated fashion and fill up the user table of the application. CAPTCHA is an acronym for Completely Automated Public Turing Test to Tell Computers and Humans Apart. Also known as human interactive proofs (HIPs), these tests are ways of automatically testing users of a system to determine whether they are human beings or bots. With the CAPTCHA check turned on, an attacker cannot write a bot script to create hundreds of thousands of accounts in an automated fashion, because the script will be unable to solve the CAPTCHA proof. See more on CAPTCHA in the upcoming "CAPTCHAs and HIP" section, and in Chapter 4.

For more information about the phpBB vulnerabilities discussed here, please see "References and Further Reading" at the end of this chapter.

phpBB DoS Countermeasures

 Countermeasure    All discussed vulnerabilities have been fixed in current versions of phpBB and the login attack can be mitigated by turning on the CAPTCHA requirement.

Apache Tomcat 5.5 Directory Listing DoS

Popularity: 2
Simplicity: 8
Impact: 3
Risk Rating: 4

 Attack    Tomcat is a very popular application server: an open-source Java servlet container. In November 2005, David Maciejak discovered that performing multiple simultaneous directory listings of a directory with many files could consume excessive CPU resources on the server. Since the request needed to mount the attack is a simple directory listing, it would be very easy for an attacker to use a standard Web testing tool to multithread numerous requests against the Tomcat server. The problem lies in the basic abstraction of the file system that Java provides and the slow performance that results. More bug information can be found in "References and Further Reading" at the end of this chapter.

Countermeasures for Tomcat Directory Listing DoS

 Countermeasure    This problem has been fixed in 5.5.13, 5.0.31, and 4.1.32 by disabling directory listings. It is a perfect example of a scenario where there is no easy fix because of architectural constraints.

OpenSSL ASN.1 Parsing Errors DoS

Popularity: 3
Simplicity: 2
Impact: 8
Risk Rating: 4

 Attack    In 2003, several bugs were found in the OpenSSL library's ASN.1 parser, which is used, for example, to read X.509 certificates. These bugs would cause integer overflows, improper deallocation of memory resulting in stack corruption, or reads past the end of the buffer containing the certificate. In each case, this would crash OpenSSL and the application using the library. A follow-up test by Novell discovered another issue affecting Windows systems using OpenSSL, where certain ASN.1 sequences would trigger a long recursion that is not properly handled. More bug information can be found at the links listed in "References and Further Reading" at the end of this chapter.

Countermeasures for App-layer DoS

 Countermeasure    A patch for these problems was released in OpenSSL 0.9.6l. Note that due to the large number of applications that use the OpenSSL library, there are numerous other patches released by vendors for their products that integrate OpenSSL.

More generally, development platforms like Java and C# that provide memory management are much more resistant to memory resource starvation. Since the application does not have to handle the deallocation of resources, and the VM or CLR is built to handle memory allocation failures robustly, applications written on these platforms will be more robust against resource attacks. These platforms also support native threading, locking, and resource-sharing models, as well as providing the data structures necessary for throttling or fairly prioritizing workloads.

Often, the best method for dealing with denial-of-service attacks is to address site areas that have slow performance. For example, site login is one of the most common functions on many sites, and the logon function can be slow, often requiring database lookups that an attacker may exploit. A technique that has been used successfully on large commercial sites to deal with this problem is storing user records in LDAP rather than a SQL database. LDAP is a lightweight protocol developed specifically for accessing user directories. Another advantage of this technique is that attacks against user login will not affect other services that rely on SQL; this is an example of segmenting/siloing site features to reduce resource consumption.

Many sites use cookies containing encrypted data to store session state on the client rather than the server. This can be done for performance reasons, or because of the memory requirements of server-side state storage or the method of load balancing or clustering being used. Sites that do this well cap the size of the encrypted cookie and use an algorithm tailored to the application. A site that uses cookies only during a single logon session can use a weak but fast algorithm like RC4, while sites that leave permanent cookies on the user's system should use much stronger algorithms like Triple DES or AES. Setting proper expiration dates on cookies and limiting their growth prevent attackers from forging bogus cookies to use up decryption or execution resources.
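A minimal sketch of the size-capping idea above, using only the standard library: the cookie is integrity-protected with an HMAC signature rather than encrypted (a real deployment would also encrypt the payload, e.g. with AES), and the key, cap, and function names here are invented for illustration.

```python
import hmac, hashlib, base64

SECRET = b'server-side-secret-key'   # illustrative; load from secure config
MAX_COOKIE_BYTES = 512               # hard cap deters oversized forgeries

def seal(state: bytes) -> str:
    """Sign session state and encode it as a size-capped cookie value."""
    sig = hmac.new(SECRET, state, hashlib.sha256).digest()
    cookie = base64.urlsafe_b64encode(sig + state).decode()
    if len(cookie) > MAX_COOKIE_BYTES:
        raise ValueError('session state too large for a cookie')
    return cookie

def unseal(cookie: str) -> bytes:
    """Verify and decode a cookie, rejecting oversized input cheaply."""
    if len(cookie) > MAX_COOKIE_BYTES:   # check size before any crypto work
        raise ValueError('oversized cookie rejected')
    raw = base64.urlsafe_b64decode(cookie)
    sig, state = raw[:32], raw[32:]
    expected = hmac.new(SECRET, state, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError('bad signature')
    return state

c = seal(b'user=42;cart=3')
print(unseal(c))   # -> b'user=42;cart=3'
```

Checking the length before doing any cryptographic work is the DoS-relevant detail: the server never spends decryption or verification cycles on a deliberately bloated forged cookie.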

Denial-of-revenue Attacks

Popularity: 5
Simplicity: 5
Impact: 8
Risk Rating: 6

 Attack    The term denial-of-revenue attack (DOR) appears to have been used at least as far back as 2003 but never really came into vogue. The concept loosely refers to an attack where Web application logic is usurped to redirect monetary costs or compensation to inappropriate parties (thus, it might be more appropriately termed a "monetization misdirection" class of attacks). The most common example of a denial-of-revenue attack is Internet advertising click-fraud, where an automated program or sweatshop worker, possibly in another country, continuously clicks on advertising links to drain advertisers' budgets. A sample scam of this nature is illustrated in Figure 11-3.


Figure 11-3: A typical click-fraud scheme

As you can see in Figure 11-3, the attack relies on workers who spend all their time clicking on advertising links. This can abuse the system in two fashions: one, it drains advertising money from the advertiser's account for ads that are not truly reaching a valid audience; and two, it can generate revenue for a site hosting advertising content by making it appear that the advertisements are getting more views than they actually are. In the second case, the site would be a content site rather than a search engine like Google.

There are a couple common forms of click-fraud. The first is the use of offshore laborers in a country like India or China who spend all their time clicking advertising links to generate revenue. The second is the use of automated scripts or bots that automatically "click" advertising links to generate paid hits. An entire economy has arisen around these basic techniques, and legitimate advertisers may have no clue that shady third-party affiliate organizations are engaging in these activities to deliver results to their clientele.

Most people think that click-fraud is limited to search engines and advertising affiliate networks, but the fact is that many services provided by sites cost money and can be attacked. For example, digital media (music, video) licensing, SMS messaging, and even direct mailing all cost money and could be abused by an attacker. Basic user registration is also a frequent target of this nature; for example, an automated system that sends a catalog via postal mail to anyone who signs up could find that an attacker has registered millions of invalid addresses.

Denial-of-revenue Countermeasures

 Countermeasure    Addressing these sorts of attacks depends a great deal on the unique application under siege. We'll provide some generic advice and then address more specific examples like click-fraud in an effort to illustrate broader considerations.

The best way to mitigate application-specific attacks is to perform good threat modeling throughout the lifecycle of the app. We talk in more detail about threat modeling in Chapter 13. In essence, the only way to prepare for attacks of this nature is to embed adversarial thinking, at both business and technical levels, throughout the culture and processes of the app management, development, and test teams. Some of the key things to consider in a threat model that are relevant to denial-of-revenue attacks include the following.

Technical Versus Nontechnical Threats   Programmers are usually focused on meeting technical requirements rather than fully grasping the economic model behind the service offering. Take, for instance, a service that is supposed to provide a free sample of the first 30 seconds of a song but requires payment for the full song. The programmer could design a system that takes an index into the file and plays the next 30 seconds. The Web site would always offer songs with an index of zero to start playing at the beginning, but an attacker who tries changing the index will find they can make repeated requests with different indexes to collect the whole song. In this case there is nothing technically wrong with the technique (not smart, but it works), and it might even make sense if the same application was also doing streaming radio or advertising mixes.
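The fix for the scenario above is to enforce the business rule on the server rather than trusting the client-supplied index. A minimal sketch, with the function name and 30-second window as assumptions for illustration:

```python
SAMPLE_SECONDS = 30  # free-preview window assumed from the text's example

def sample_range(requested_offset: int, song_length: int):
    """Return the (start, end) second range the free preview may serve.

    The flaw arises when the server honors any client-supplied offset;
    clamping the preview to a single fixed window keeps attackers from
    stitching the full song together out of repeated 30-second requests.
    """
    if requested_offset != 0:
        raise PermissionError("free preview is only available from the start")
    return (0, min(SAMPLE_SECONDS, song_length))
```

A paid or streaming-radio code path would validate offsets against the purchaser's entitlement instead, which is why the indexed design is not wrong in itself, only wrong for the free tier.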

Never Trust the Client   Attackers like to lie, impersonate, and clone identities when performing an attack. It is important to make sure you always know who is performing a given action on the site. This means there must be some method of verifying the user's identity, and no way for one authenticated user to act as another. Trust must also be evaluated across trust boundaries. For example, most advertising links on Web sites actually redirect the user to an ad server that records the click before redirecting the user to the site they were interested in. How can that ad server trust where the client came from before paying for the advertised link?

Any site that derives revenue from an advertising affiliate network should be concerned enough to make sure that click-throughs from its site are valid. An advertiser is not going to pay for invalid clicks, and an attacker who can throw the site's clicks into doubt may cause the affiliate network to withhold payment. Another possibility for advertisers is to support fee-for-sale, which typically favors the advertiser over the sites providing advertising.
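One simple signal a site can use to flag suspect click-throughs is a sliding-window rate limit per source. Real affiliate networks combine many signals (IP reputation, conversion rates, user agents); this sketch shows only that one heuristic, and the window and threshold are assumed values, not figures from the text.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_CLICKS_PER_WINDOW = 5   # hypothetical threshold; tune per campaign

class ClickFilter:
    """Flag click sources that exceed a rate threshold within a window."""

    def __init__(self):
        self.clicks = defaultdict(deque)  # source -> timestamps of recent clicks

    def record(self, source_ip, now=None):
        """Record a click; return False when the source looks like fraud."""
        now = time.time() if now is None else now
        q = self.clicks[source_ip]
        q.append(now)
        # Drop clicks that have aged out of the window.
        while q and q[0] < now - WINDOW_SECONDS:
            q.popleft()
        return len(q) <= MAX_CLICKS_PER_WINDOW
```

Flagged clicks can then be excluded from the counts reported to the affiliate network, which is exactly the evidence of good faith the paragraph above argues a site needs.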

Other services that are provided for free to users but cost money to the site also need to be carefully reviewed. Examples are a site that offers free music to users but must pay a royalty every time a song is played, or one that allows users to send SMS messages but must pay a small fee (a micropayment) to a phone company for each message. Either of these might potentially be abused by an attacker to cost the site money.
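A common mitigation for such per-action costs is a hard per-user quota, so an attacker's losses are bounded before any fee is incurred. A minimal sketch; the class name and the quota of 10 are assumptions for illustration:

```python
from collections import Counter

DAILY_SMS_QUOTA = 10    # hypothetical per-user cap on a costed action

class SmsQuota:
    """Refuse actions that cost the site money once a user's quota is spent."""

    def __init__(self):
        self.sent = Counter()   # user_id -> messages sent today

    def try_send(self, user_id: str) -> bool:
        if self.sent[user_id] >= DAILY_SMS_QUOTA:
            return False        # quota exhausted; no micropayment incurred
        self.sent[user_id] += 1
        return True             # caller proceeds to pay the carrier fee
```

The same pattern bounds royalty-bearing music plays or any other metered cost; a production version would reset or decay the counters daily and persist them across servers.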

CAPTCHAs and HIP   To prevent user registration-based denial-of-revenue attacks, many sites today use CAPTCHAs and Human Interactive Proof (HIP) technologies (see Chapter 5 for more information).

CAPTCHAs and HIPs are themselves an attractive resource to exhaust in a denial-of-service attack. Both technologies use a great deal of computation to produce each challenge; as a result, challenges are typically precomputed and stored for future use. An attacker who uses them up can prevent access to the site until new challenges are calculated. Many CAPTCHAs are also weakly implemented, making it easy for automated systems to defeat them. A CAPTCHA may use a constant font, aligned glyphs (characters), constant rotation, no deformation or stretching of the image, constant colors, a predictable character/dictionary set, and so on. This renders their protection useless and reopens the threat of user registration attacks.
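The pool-exhaustion risk can be countered by throttling challenge issuance per client and replenishing the pool before it runs dry. A minimal sketch, assuming hypothetical thresholds and a placeholder generation step:

```python
from collections import Counter, deque

POOL_LOW_WATER = 100          # regenerate before the pool runs dry (assumed)
MAX_ISSUED_PER_CLIENT = 20    # hypothetical cap on challenges per client

class ChallengePool:
    """Hand out precomputed challenges without letting one client drain them."""

    def __init__(self, challenges):
        self.pool = deque(challenges)
        self.issued = Counter()   # client_id -> challenges handed out

    def issue(self, client_id):
        if self.issued[client_id] >= MAX_ISSUED_PER_CLIENT:
            return None                       # throttle the greedy client
        if len(self.pool) <= POOL_LOW_WATER:
            self.replenish()                  # keep headroom for everyone else
        self.issued[client_id] += 1
        return self.pool.popleft()

    def replenish(self):
        # Placeholder for the expensive generation step, which a real site
        # would run off the hot path (background job), per the text.
        self.pool.extend(f"challenge-{i}" for i in range(200))
```

The per-client counter would be keyed and expired like any other rate limit; the point is that issuing a challenge must be cheaper for the site than requesting one is for the attacker.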

Note 

Making a CAPTCHA is an art, not a science; the image must be difficult for machines to accurately process yet still be easily readable by human beings. It makes no sense to provide a CAPTCHA that consistently defeats your human users.

This is a perfect demonstration of how putting a countermeasure in place against one attack can actually open the door to a new or different one. It also shows that putting a mitigation in place does not mean you can forget about the threat; the mitigation may fail or prove to be of illusory benefit. Hence, threat modeling must be performed repeatedly, not just once during the development process.



Hacking Exposed Web Applications, 3rd Edition
ISBN: 0071740643
Year: 2006
Pages: 127
