Testing HTTP | Hunting Security Bugs

The telnet example demonstrates a generic way to proxy TCP traffic. Similar but easier ways to proxy Web traffic are covered in this section. HTTP (used for Web traffic) is a commonly used protocol for network applications. Because it is so widely used, it is worthwhile for you to invest in understanding specific tools and attacks that can be applied to this protocol.

More and more often, applications are written that target a Web browser as the thin client of choice. Examples of Web applications include search engines, online banking, and Web-based e-mail. These applications are often susceptible to malicious client attacks. Web applications contain some of the most vulnerable code for three reasons: their very nature is to accept network requests from untrusted sources over the Internet, the application code is developed and deployed quickly, and many Web developers are not familiar with secure coding practices. This section discusses how these Web applications actually work.

Understanding a Stateless Protocol

Unlike many TCP protocols, HTTP is stateless, which means that each request is an independent transaction. Although Web developers might expect requests to come in a certain order, this is not guaranteed with a stateless protocol. For example, a Web site might ask the user to fill out a registration form before presenting the user with the link to download a file. However, HTTP does not guarantee that requests are sent in this order. In this example, if the user knows the URL of the file to download, the user could specify the URL in the browser and download the file without filling out the registration form. Attacks in which the URL is requested directly are known as forceful browsing.
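The registration example can be modeled with a toy request handler. This is only a sketch under assumed names (the paths /register and /files/setup.exe, the REGISTERED set, and both handler functions are hypothetical); it shows why the check has to happen on the download request itself rather than being implied by page order.

```python
# Toy model of a site that relies on request order (all names are hypothetical).
REGISTERED = set()  # session IDs that have submitted the registration form


def handle_vulnerable(path, session_id):
    """Serve the file to anyone who asks; assumes users arrive through the form."""
    if path == "/register":
        REGISTERED.add(session_id)
        return 200, "Thanks. Download: /files/setup.exe"
    if path == "/files/setup.exe":
        return 200, "<file contents>"
    return 404, "Not found"


def handle_checked(path, session_id):
    """Same site, but the download checks server-side state on every request."""
    if path == "/files/setup.exe" and session_id not in REGISTERED:
        return 403, "Please register first"
    return handle_vulnerable(path, session_id)


# Forceful browsing: request the file directly, skipping /register.
print(handle_vulnerable("/files/setup.exe", "visitor-1"))
print(handle_checked("/files/setup.exe", "visitor-1"))
```

The first call succeeds even though the visitor never registered; the second is refused until the same session has requested /register.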

Testing Methods of Receiving Input

Like most applications, Web applications accept input from the user, process that input, and provide output to the user. Following are four common sources of input for Web applications:

URL query string
HTTP POST data
HTTP cookies
HTTP headers

Before we address how you can test Web application input sources, let's discuss the related topic of HTML forms. Because HTML forms accept input directly from the user, they can become specific targets of attackers, who will probe them for weaknesses while they attempt to send malicious data to a server.

Understanding HTML Forms

HTML forms are a common format in which applications accept direct user input that is later sent as URL query string data or POST data. Figure 4-5 shows a sample form. Listing 4-1 shows the HTML for this form:

Figure 4-5: Simple HTML form example

Listing 4-1

   <HTML>
   <HEAD>
      <META http-equiv="Content-Type" content="text/html; charset=utf-8">
      <TITLE>Example Form</TITLE>
   </HEAD>
   <BODY>
   <FORM name="myForm" action="http://www.example.com/register.cgi" method="GET">
   <TABLE>
   <TR><TD>Name</TD><TD><INPUT name="Name" type="text"></TD></TR>
   <TR><TD>Age:</TD><TD><SELECT name="age">
      <OPTION value="0">Under 21</OPTION>
      <OPTION value="1">21-30</OPTION>
      <OPTION value="2">31-40</OPTION>
      <OPTION value="3">41-50</OPTION>
      <OPTION value="4">51-65</OPTION>
      <OPTION value="5">Over 65</OPTION>
      </SELECT>
   </TD></TR>
   <TR><TD>Email:</TD><TD><INPUT name="email" type="text"></TD></TR>
   </TABLE>
   <INPUT type="submit" value="Register">
   </FORM>
   </BODY>
   </HTML>

In Listing 4-1, a few key items should be noted:

action This property of the form specifies where the form data will be sent when it is submitted by the browser. Line 7 of the code listing specifies that data be sent to http://www.example.com/register.cgi. If an action is not specified, the form will submit data to the same URL that was used to load the form.
method This property specifies how the data should be sent to the form's action. The sample form uses GET. The method can be either GET or POST:
- GET method The GET method sends all of the form data as part of the form action URL. This is accomplished by appending a question mark to the end of the form action followed by the form data in key/value pairs. Ampersands separate the key/value pairs. For example, if the sample form is filled out using Rob Barker as the name, Under 21 as the age, and rbarker@alpineskihouse.com as the e-mail address, the form would be sent to the server as http://www.example.com/register.cgi?Name=Rob+Barker&age=0&email=rbarker@alpineskihouse.com. The question mark and everything following it is known as the URL query string.
  
  One advantage of using the GET method is that it enables the user to save the results of the form submission. For example, most search engines use the GET method when performing searches. This enables the user to bookmark and/or send other users the URL of the search results.
  
  One disadvantage of using the GET method is that any personal information included in the form data might be stored in unsecure places, such as in the browser Favorites list or History file or in proxy logs, and this might not be obvious to the user. For example, if a user logs on to an e-commerce site and then decides to e-mail a link to a friend, the link might include a session identifier for the currently logged-on user. If others use this URL, they will be logged on to the site under the first user's account. Another disadvantage of using the GET method is that the length of form data is limited to the maximum URL length because the form data is sent as part of the URL. The maximum URL length is not defined in the HTTP specification and varies in different applications.

- POST method The POST method sends form data in the same key/value pair format, delimited with ampersands, as does the GET method. But instead of using the URL to send the form data, the POST method sends the data at the end of the HTTP request. This has the following advantages and disadvantage.
  
  One advantage is that data is not stored as part of the URL, so it won't be stored wherever the URL is saved. Also, HTTP itself places no limit on the length of the form data.
  
  One disadvantage is that the form data is not included as part of the URL, so form submissions, such as the result of a search query, cannot be stored and shared as a URL can be.

Omitting the method property in the form means the form will use the GET method.
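The difference between the two methods can be sketched in a few lines of Python. This is an illustration, not the sample site's code: the helper names build_get_url and build_post_request are made up, and the raw POST text shows only the headers needed to make the point. The same encoded pairs appear after the question mark for GET and after the blank line for POST.

```python
from urllib.parse import urlencode

ACTION = "http://www.example.com/register.cgi"
form_data = {"Name": "Rob Barker", "age": "0", "email": "rbarker@alpineskihouse.com"}

# Key/value pairs joined by ampersands; spaces become "+".
# safe="@" keeps "@" unescaped to match the URL shown in the text
# (many clients send it as %40 instead).
encoded = urlencode(form_data, safe="@")


def build_get_url(action, body):
    """GET: the encoded pairs become the URL query string."""
    return action + "?" + body


def build_post_request(host, path, body):
    """POST: the same pairs travel at the end of the request, after the headers."""
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body.encode('ascii'))}\r\n"
        "\r\n"
        f"{body}"
    )


print(build_get_url(ACTION, encoded))
print(build_post_request("www.example.com", "/register.cgi", encoded))
```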

Form Controls Many controls can be used to collect input from the user, including text boxes, password boxes, list and combo boxes, and radio buttons. None of the controls encrypt data entered by the user.

The password input control does mask the data entered by displaying asterisks or bullets instead of the characters entered by the user. This only prevents a password from being revealed to someone looking at the screen as or after the password is entered. Sensitive data should be encrypted before it is sent over the network. Secure Sockets Layer (SSL) is the most common way to encrypt data for Web traffic. The password form control includes HTML like the following:

 <INPUT name="userPasswd" type="password">

Important

Using SSL doesn't make an application secure. SSL is used to encrypt network traffic so that third parties cannot see or tamper with the data in transit (interception of this kind is known as a man-in-the-middle attack). It can also guarantee that the client is talking to the correct server (if the client does the validation). However, it does not protect data from being viewed or tampered with at either the client or server end of the communication. For this reason, applications that use SSL can still be attacked. Attackers like when SSL is used because only the server receiving the data can see the request data, and that makes an attack a little more difficult to notice.

Another form control of interest is the hidden input control. A hidden control is just like the text box input control except the hidden control doesn't have any visual representation on the HTML page. It is hidden from the user. Web developers can use hidden controls to pass information without showing the information to users or giving users a way to modify the information through the Web page. Hidden form controls include HTML similar to the following:

 <INPUT name="myState" type="hidden" value="someValue">
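Hidden controls are hidden only in the rendered page; they are plainly visible in the HTML the server sends. As a rough sketch, the standard html.parser module can list them. The InputLister class and the sample page below are made up for illustration.

```python
from html.parser import HTMLParser


class InputLister(HTMLParser):
    """Collect (type, name, value) for every INPUT tag in a page."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "input":
            a = dict(attrs)
            kind = (a.get("type") or "text").lower()
            self.inputs.append((kind, a.get("name"), a.get("value")))


# Hypothetical page fragment using the controls shown in the text.
page = """<FORM action="/order.cgi" method="POST">
<INPUT name="userPasswd" type="password">
<INPUT name="myState" type="hidden" value="someValue">
<INPUT type="submit" value="Send">
</FORM>"""

lister = InputLister()
lister.feed(page)
hidden = [(name, value) for kind, name, value in lister.inputs if kind == "hidden"]
print(hidden)
```

Every name/value pair found this way is sent back by the browser on submission and can be changed by the sender like any other form field.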

Tampering with URL Query String Parameters

As discussed in the section titled Understanding HTML Forms, you can append data by adding a question mark to the end of the URL. The question mark and everything following it are considered the query string. In addition to HTML forms, data is sent in the query string by JavaScript in a Web page, by client-side applications, as part of hyperlinks, and from many other places.

As discussed earlier, data in the query string is sent as part of the URL. It is common to see URLs like this one: http://example.com/displayOrder.cgi?trackingID=103759. As a tester, you likely recognize this as an opportunity to perform some tests using the trackingID part of the URL. What might be an interesting trackingID value? Adding 1 to or subtracting 1 from the valid trackingID would be a good place to start. This might show another customer's order information; performing this test on many shopping sites really works. In February 2005, someone reported that by adding 1 to the identifier in the URL used to view his W-2 form (U.S. wage and tax statement) he was able to see other people's income statements (http://news.zdnet.com/2100-1009_22-5587859.html). A properly designed order tracking system would require users to log on and would allow users to see only their own orders, or it would ensure the tracking identifier has a significant degree of entropy so that the ID is not easily guessed or found by brute force to view someone else's order.
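Generating the neighboring identifiers is easy to script. The following is a minimal sketch, assuming the parameter holds a plain integer; the function name neighbor_urls is hypothetical, and the code only builds the URLs without requesting them.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def neighbor_urls(url, param, deltas=(-1, 1)):
    """Return copies of url with the integer query parameter shifted by each delta."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    results = []
    for delta in deltas:
        changed = [
            (key, str(int(value) + delta) if key == param else value)
            for key, value in pairs
        ]
        results.append(urlunsplit(parts._replace(query=urlencode(changed))))
    return results


for candidate in neighbor_urls(
    "http://example.com/displayOrder.cgi?trackingID=103759", "trackingID"
):
    print(candidate)
```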

Tip

HTTP contains a header named Referer, which contains the URL of the page that referred the browser to load the current URL. If the URL is loaded directly in the browser, the Referer header is not sent. To stop attacks in which attackers generate their own URLs, some developers check the value of the Referer header to verify that the referring URL is their page. However, this isn't very effective protection because the Referer is sent by the client and its value can be forged by the sender. Also note that the original HTTP design specification misspelled the header's name as Referer. Client and server programmers implemented this header as documented in the specification, and so it is necessary to misspell it for the server to evaluate the header's value.

Tampering with POST Data

As mentioned earlier, the POST data is formed in the same fashion as query string data, but isn't sent as part of the URL. So, unlike tampering with URL query string data, you cannot simply modify parts of the URL to modify POST form data. POST data is used for HTML forms, Simple Object Access Protocol (SOAP) and Asynchronous JavaScript and XML (AJAX) requests (discussed in more detail in Chapter 11, XML Issues), custom solutions, and many other purposes. Following is an example of how to modify POST data.

Load the order form example (TicketFormPost.html) from the book's companion Web site into your Web browser. This Web page, as shown in Figure 4-6, is similar to what you might see on a ticket order Web site. You can purchase concert tickets, but there is a limit of four tickets maximum per customer. The Web page allows you to choose the number of tickets to purchase with the largest choice being 4. Customer selections are submitted using an HTTP POST, so values cannot be modified through the URL.

Figure 4-6: Sample HTML form that allows ordering a maximum of four tickets

Think maliciously: how can you change the number of tickets to purchase to a quantity greater than 4? One option is to save the Web page locally and modify it so that the option to order more than four tickets is available, and then make the modified page submit the order to the ticket order Web site. This approach works in most cases; however, sometimes pages are complicated and contain JavaScript that validates data before it is submitted. Although these issues can be worked through, an easier way to submit an order larger than four tickets exists: you can use an HTTP proxy that allows modification of data before it is sent to the server.

Using an HTTP proxy to modify arbitrary HTTP traffic

Using an HTTP proxy, such as Web Proxy Editor (included on this book's companion Web site), is similar to using a generic TCP proxy, as discussed earlier in the telnet example. Because HTTP has a well-defined way to send data specifically to a proxy server, and because most Web clients can be configured to use a proxy server, using an HTTP proxy is easy. When HTTP data is sent to a proxy, you don't need to specify the remote server name and port number; the client supplies this information and the proxy makes the necessary connections to the remote server. Also, HTTP proxies often include helpful features such as built-in protocol decoders to assist testing.

A major advantage of using the HTTP proxy approach over modifying the existing form is that all headers, cookies, and other state information remain exactly as they were originally; only the modification differs from the original network request.

The Web Proxy Editor HTTP proxy listens on the port specified in its startup options. This port should be set as the proxy port in the Web client proxy settings used to send requests. By default Web Proxy Editor automatically sets Microsoft Internet Explorer proxy settings, so no additional configuration is needed.
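Clients other than the browser can be pointed at the same proxy. The sketch below assumes the proxy listens on 127.0.0.1 port 8080, which is a made-up address (use whatever port the proxy was started with), and the helper request_line is hypothetical. It illustrates the request-line difference that lets a proxy learn the remote server from the client: a request sent to a proxy names the full URL, while a request sent straight to the server names only the path.

```python
import urllib.request
from urllib.parse import urlsplit

PROXY = "http://127.0.0.1:8080"  # assumed address of the local testing proxy


def request_line(url, via_proxy):
    """First line of a GET request, as sent to a proxy or directly to the server."""
    parts = urlsplit(url)
    if via_proxy:
        target = url  # the proxy needs the full URL to know where to connect
    else:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return f"GET {target} HTTP/1.1"


# Standard-library client configured to route plain HTTP requests through the proxy.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({"http": PROXY}))
# opener.open("http://www.example.com/") would now travel through the proxy.

print(request_line("http://www.example.com/register.cgi?age=0", via_proxy=True))
print(request_line("http://www.example.com/register.cgi?age=0", via_proxy=False))
```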

To use the Web Proxy Editor HTTP proxy to modify the POST data from the ticket order form, load the Web Proxy Editor tool and click the Listen button (it looks like a Play button). After Web Proxy Editor has begun to listen for requests, fill out the ticket form to order four tickets, and then click Check Availability And Reserve. The form data sent is displayed in Web Proxy Editor, as shown in Figure 4-7.

Figure 4-7: Using Web Proxy Editor to modify the number of tickets to reserve to exceed the allowed maximum value

Look closely at the data submitted: the Ticketcount form variable has the value of 4, which reflects the number of tickets selected on the order form. Perhaps you have many friends that would like to attend the concert with you, so change the Ticketcount value from 4 to 12 and send the modified data to the server.

The server doesn't validate the number of tickets requested and instead confirms that 12 tickets are reserved, as shown in Figure 4-8. The server code should have verified that the ticket count was 4 or less but didn't because an option for more than 4 tickets wasn't available on the Web page.
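Both halves of this example can be sketched in Python: the change made in the proxy and the server-side check that was missing. This is an illustration under assumptions, not the sample site's code. Only the Ticketcount field name comes from the text; the Event field, the function names, and the lower bound of 1 are made up.

```python
from urllib.parse import parse_qsl, urlencode

MAX_TICKETS = 4


def tamper(body, field, new_value):
    """What the tester does in the proxy: rewrite one field of a form-encoded body."""
    pairs = parse_qsl(body, keep_blank_values=True)
    return urlencode([(k, new_value if k == field else v) for k, v in pairs])


def ticket_count_is_valid(body):
    """A server-side check of the kind the sample site lacks."""
    fields = dict(parse_qsl(body))
    try:
        count = int(fields.get("Ticketcount", ""))
    except ValueError:
        return False  # missing or non-numeric value
    return 1 <= count <= MAX_TICKETS


original = "Event=Concert&Ticketcount=4"  # "Event" is a made-up field
modified = tamper(original, "Ticketcount", "12")

print(modified, ticket_count_is_valid(modified))
```

The point of the second function is that the limit is enforced on the value the server actually receives, regardless of which choices the Web page offered.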

Figure 4-8: Successfully reserving more tickets than allowed by manipulating the normal Web request

Tampering with Cookies

A cookie is a piece of information specified by a Web site that the Web browser or other Web client is directed to store for a period of time. The Web site tells the browser the information to store either through an HTTP header or through client-side script, such as JavaScript or Microsoft Visual Basic, Scripting Edition (VBScript). Once the cookie is issued by the server, all subsequent requests made by the browser to that server include the cookie. (This is a slight generalization and will be clarified in a moment.)

Each cookie has a name, value, and several other properties. Cookies are used for many purposes, including storing information such as logon names and passwords and maintaining state over the stateless HTTP protocol. In this section, we discuss how cookies can be used and some of the possible dangers in their use.

Key Properties of Cookies When issuing cookies, the server and client-side script can set several properties that can be used to expire, secure, and control the scope of the cookie:

name/value The name/value pair is used to store arbitrary information with the specified name.
expires The expires property is used to specify the lifetime of the cookie. The value of expires should be in the format DD-MMM-YYYY HH:MM:SS GMT. If the expires property is not set, the cookie will expire at the end of the browser session and is known as a session cookie. The expires property is also used to delete a cookie by setting the expires value to a date in the past. When it stores authentication information, a cookie should be valid only for a relatively short amount of time (a few hours or less).
path The path property is used to limit when the cookie should be sent. The value is the directory that should access this cookie. For example, if the path is set to /folder/, all pages (including subdirectories) under the directory named folder will have access to the cookie; however, this is the only directory that will have access to the cookie. It is important to note that path limits the cookie only to pages with paths that begin with the value specified. If the path is set to /folder, pages in the directory folder will have access, but so will pages in a directory named folder2. To specify an exact directory, the path value should end with a forward slash (/).

domain This property is also used to control when a cookie is sent. Sometimes it is useful to share cookies across several machines in the same domain. For example, site1.example.com might set a cookie that site2.example.com would like to access. This functionality can be achieved by setting the domain property to .example.com. The domain value must begin with a period and must contain at least two periods. The two-period requirement is to prevent someone from setting a cookie to be accessible to all .com domains, .net domains, .edu domains, and so forth. The domain value must also be the domain used to issue the cookie. Setting this property is optional. The default domain value is the fully qualified domain name that issued the cookie.

Important

A Web site on one domain should never be able to read cookies from or set cookies for another domain. For example, www.example.com should not be able to read a cookie from www.alpineskihouse.com. If cross-domain cookie access is allowed, you have found a security bug. A common vulnerability that allows for this is cross-site scripting, which is discussed in more detail in Chapter 10.

secure The secure property is a Boolean value used to specify whether the cookie should only be sent over a secure channel. This property is optional. The default value is false.

Important

It is important to test to see whether the secure property is set on cookies containing sensitive information that should be sent only over secure channels. A Web site could issue a cookie only over a secure channel, but if the secure flag is not set, an attacker often can trick the target user to browse over an unsecure channel to the site that issued the cookie. For example, attackers can convince a target user to click an http link to the site instead of an https link. The request to the server over the unsecure channel discloses the sensitive cookie information over the network. If the attacker can sniff the target user's traffic, the attacker will know the contents of the cookie.

HTTPOnly If this property is set, client-side script is prevented from reading the cookie. This property is useful in helping protect against cookie theft as a result of a cross-site scripting attack (discussed in more detail in Chapter 10). Currently, this property is supported only in Internet Explorer 6 Service Pack 1 (SP1) and later and is not set by default.
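The path and domain rules described in this list can be expressed as two small checks. This is a sketch that models only the behavior as the text states it; the function names are hypothetical, and clients that follow later cookie specifications may match paths more strictly than a plain prefix comparison.

```python
def path_applies(cookie_path, request_path):
    """Plain prefix comparison, as the path property is described above."""
    return request_path.startswith(cookie_path)


def domain_is_acceptable(cookie_domain, issuing_host):
    """Rules from the text: leading period, two or more periods, issuer's own domain."""
    return (
        cookie_domain.startswith(".")
        and cookie_domain.count(".") >= 2
        and issuing_host.endswith(cookie_domain)
    )


# Without the trailing slash, a sibling directory also matches the prefix.
print(path_applies("/folder", "/folder2/page.html"))
print(path_applies("/folder/", "/folder2/page.html"))

# A bare top-level domain fails the two-period rule.
print(domain_is_acceptable(".example.com", "site1.example.com"))
print(domain_is_acceptable(".com", "site1.example.com"))
```

When testing, the first pair of calls is the case to look for: a cookie scoped to /folder that a page under /folder2 can also read.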

How Cookies Are Issued by the Server Servers usually send cookies to clients in the headers of HTTP responses. Figure 4-9 shows a network capture in which the server issues a cookie named rootCookie with the value "Issued by /". A path (the forward slash [/] in this case) is also associated with this cookie. This cookie could also be issued through client-side script contained in the HTML returned in the document instead of through the Set-Cookie header.
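When reviewing captured responses, it helps to check each Set-Cookie value for the protective properties described earlier. The following is a minimal sketch using Python's standard http.cookies module; the function name missing_flags and the cookie values are made up for illustration.

```python
from http.cookies import SimpleCookie


def missing_flags(set_cookie_value):
    """Map each cookie name to the protective flags its Set-Cookie value lacks."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    report = {}
    for name, morsel in jar.items():
        # Unset flags are stored as empty strings, which are falsy.
        report[name] = [flag for flag in ("secure", "httponly") if not morsel[flag]]
    return report


# Hypothetical header values; only the cookie name rootCookie comes from the text.
print(missing_flags("rootCookie=abc; path=/"))
print(missing_flags("sessionToken=abc123; path=/; secure; HttpOnly"))
```

A cookie that carries authentication data and shows up with "secure" in its list is worth a closer look.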

Figure 4-9: Ethereal showing two cookies were issued by the server: rootCookie and ASPSESSIONIDAABSATRT

Retrieving Cookies The Web client checks its cookies, including the domain, path, and secure properties, against any page that it is about to request. If the cookie exists and the domain, path, and secure properties apply to the requested document, the client includes the cookie in the request sent to the server. Figure 4-10 shows a network capture of Internet Explorer requesting a Web site and sending the two cookies set in the previous example (Figure 4-9).

Figure 4-10: Ethereal showing two cookies were sent to the server: rootCookie and ASPSESSIONIDAABSATRT

Testing Cookies It is important to test cookies for the same tampering problems that could be used on query string and POST data. To test cookies, first determine whether the target accepts cookies, and then manipulate the values the Web application expects. Although browsers store many cookies in files, only persistent cookies (cookies with an expiration date sometime in the future) are stored on disk. If you test cookies simply by changing the cookie values in the file, you will miss session cookies. For this reason, use an HTTP proxy as an efficient way to manipulate all cookies.
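The edit a tester makes to the Cookie request header in the proxy can also be scripted. This is a sketch only: the function name is hypothetical, the cookie values are invented, and the parsing is deliberately naive (it splits on semicolons and does not handle quoted values).

```python
def rewrite_cookie_header(cookie_header, name, new_value):
    """Return a Cookie request-header value with one cookie's value replaced."""
    pairs = []
    for item in cookie_header.split(";"):
        key, _, value = item.strip().partition("=")
        pairs.append((key, new_value if key == name else value))
    return "; ".join(f"{key}={value}" for key, value in pairs)


# Cookie names from the figures; the values are placeholders.
captured = "rootCookie=abc; ASPSESSIONIDAABSATRT=XYZ"
for test_value in ("", "A" * 20, "' OR '1'='1"):
    print(rewrite_cookie_header(captured, "rootCookie", test_value))
```

Because the header is rebuilt from the captured request, session cookies that never reach the disk are covered along with persistent ones.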

Note

Noam Rathaus discovered a security bug in PlaySMS (http://playsms.sourceforge.net/) through which he was able to run arbitrary SQL statements (called SQL injection) against the backend database simply by modifying the value of the cookie PlaySMS to contain his SQL statement. More information about this vulnerability is available at http://www.securityfocus.com/bid/10970. SQL injection attacks are discussed in detail in Chapter 16.

Tampering with HTTP Headers

Cookies are one piece of data passed in HTTP headers, but several other headers are important to test. These include the following:

User-Agent Contains information about the browser and operating system the user is using to connect to the Web site.
Referer Contains the URL of the page that referred to the current URL.
Accept-Language The language in which the client would prefer the server send the response.

Modifying headers can be done easily by using an HTTP proxy. For example, a bug hunter known as Carbonize found that he could perform a script injection attack against Advanced Guestbook (http://proxy2.de/scripts.php) by modifying the User-Agent header sent to the Web server to include HTML script tags. For more information about this bug, see http://www.securityfocus.com/bid/14391. Script injection attacks are discussed in detail in Chapter 10.
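Headers can also be set from a script rather than a proxy. The sketch below uses Python's urllib.request to prepare a request with tester-chosen values for the three headers listed above; it builds the request without sending it. The target URL, the helper name build_request, and the header values are all made up for illustration.

```python
import urllib.request


def build_request(url, overrides):
    """Prepare (but do not send) a GET request with tester-chosen header values."""
    return urllib.request.Request(url, headers=overrides, method="GET")


req = build_request(
    "http://www.example.com/guestbook.cgi",  # hypothetical target
    {
        "User-Agent": "<script>alert('ua')</script>",
        "Referer": "http://www.example.com/trusted.html",  # forged referring page
        "Accept-Language": "zz-ZZ",
    },
)

# urllib stores names with only the first letter capitalized ("User-agent");
# HTTP header names are case-insensitive, so servers treat them the same.
for name, value in req.header_items():
    print(f"{name}: {value}")
```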

Fuzz testing

Although testing best practices require the tester to understand how an application works and build test cases that apply specifically to the scenario being tested, you can often find bugs that crash or hang software by sending random data to the application. This process of sending random data to test security of an application is referred to as fuzzing or fuzz testing. The big advantage of this type of testing is that it can easily be automated and requires little understanding of how the target software works.

The automation used to send the random data is commonly referred to as a fuzzer. Sending fuzzed data in places where the attacker can control the data often uncovers security bugs. The most common set of bugs discovered by fuzzing are denial of service and buffer overflows. In 1990, Barton P. Miller, Lars Fredriksen, and Bryan So published a paper titled An Empirical Study of the Reliability of UNIX Utilities (http://www.cs.wisc.edu/~bart/fuzz/fuzz.html). In this paper, the authors describe how they were able to crash more than 25 percent of the programs they tested by sending random input.

There are two levels of fuzzing: dumb fuzzing and smart fuzzing. Sending truly random data, known as dumb fuzzing, often doesn't yield great results. If the code being fuzzed requires data to be in a certain format but the fuzzer doesn't create data in that format, most of the fuzzed data will be rejected by the application. For example, a program might accept input only for a path that begins with http://. If the fuzzer produces totally random data that rarely begins with http://, most data will be rejected right away. Slightly better-targeted fuzzers can send random data that will pass the first layer of validation performed by the target application. The more knowledge the fuzzer has of the data format, the more intelligent it can be at creating fuzzed data. These more intelligent fuzzers are known as smart fuzzers. Within the smart fuzzer category there is a wide spectrum of the level of intelligence used in creating the fuzzed data.
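The http:// example can be made concrete with a small experiment. This is a toy sketch, not one of the fuzzers named in this section: the accepts function stands in for the target's first validation layer, and all names and sizes are made up. It measures how many generated cases get past that first check.

```python
import random

PREFIX = b"http://"


def accepts(data):
    """Stand-in for the target's first validation layer."""
    return data.startswith(PREFIX)


def dumb_case(rng, length=32):
    """Dumb fuzzing: every byte is random."""
    return bytes(rng.randrange(256) for _ in range(length))


def smart_case(rng, length=32):
    """Slightly smarter: keep the required prefix, randomize the rest."""
    return PREFIX + dumb_case(rng, length - len(PREFIX))


def pass_rate(make_case, trials=1000, seed=0):
    """Fraction of generated cases that reach the code behind the first check."""
    rng = random.Random(seed)
    return sum(accepts(make_case(rng)) for _ in range(trials)) / trials


print("dumb :", pass_rate(dumb_case))
print("smart:", pass_rate(smart_case))
```

Every case from the prefix-aware generator reaches the code behind the check, while a purely random case does so only if its first seven bytes happen to spell the prefix.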

Several fuzzers are publicly available, including the following:

iDefense File Fuzzers The security company iDefense has three different fuzzers available for free download at http://labs.idefense.com. These fuzzers modify input files, launch the application that handles the input file, and detect exceptions.
SPIKE Dave Aitel of Immunity, Inc., has created a good framework for network fuzzing. His fuzzer is freely available at http://www.immunitysec.com/resources-freesoftware.shtml.
Peach Michael Eddington created a cross-platform fuzzing framework written in Python. For more information, see http://peachfuzz.sourceforge.net/.
Hailstorm Cenzic, another security company, produced a commercially available network fuzzer. More information about the fuzzer is available on the company Web site (http://www.cenzic.com).

For more information about additional fuzzers, see the article posted by Jack Koziol titled Fuzzers - The Ultimate List at http://www.infosecinstitute.com/blog/2005/12/fuzzers-ultimate-list.html.