State and HTTP Authentication


HTTP is a straightforward request and response protocol that's stateless by design. Web servers don't keep track of what a client has requested in the past, and they process each request in a vacuum, using only the information in the actual request header and body. Most Web applications, however, must be able to maintain state across separate HTTP requests. They need to remember information such as who has logged in successfully and which Web client goes with which bank account. Grafting state tracking on top of HTTP can be done in a few different ways, discussed in the following sections. Security vulnerabilities related to the underlying stateless nature of HTTP are quite prevalent in Web code, so it's worth spending time reviewing the basic concepts and issues of state tracking.

State

It's important to understand the distinction between a stateless system and a system that maintains state (that is, a stateful system). A stateful system has a memory; it keeps track of events as they occur and cares about the sequence of events. A stateless system has no such memory. In general, every time you provide the same event to a stateless system, you get the same result. This isn't true for stateful systems because the previous events you have supplied can affect the result.

A good example of state tracking can be found in firewall technology. Firewalls take packets off the network and decide whether each packet is safe. Safe packets are forwarded on to the protected network, and dangerous packets are rejected or ignored. A stateless firewall makes its decision by looking at each packet in isolation. A stateful firewall, however, has a memory of past packets that it uses to model active connections on the network. When a stateful firewall analyzes a packet, it can determine whether that packet belongs to a legitimate connection it has witnessed previously. Stateless firewalls can base their decisions only on the contents of the packet they intercepted and analyzed in a vacuum. Stateful firewalls are more complex and error prone, but they are also more powerful and potentially let through fewer dangerous packets.


Overview

Even the simplest business Web sites require the Web application to maintain some form of state across HTTP requests. To explore some state-tracking concepts, you'll use a simple example of a Web application: a Web site for an online financial service. Customers should be able to log in, see their balance, and optionally see their secret PIN. A plan for the site is laid out in Figure 17-2.

Figure 17-2. Simple Web application


The login page is the first page users of the site see. It's responsible for two tasks: displaying the login form and handling authentication of users. When users come to the login page for the first time, the code for the page displays the login form. When users fill in the login form and submit it, the login page attempts to validate the username and password entered in the form. If the credentials are valid, the login page forwards users to the main page. Otherwise, it displays an error.

The main page is responsible for displaying users' balances and presenting a menu of options. It needs to determine the identity of the user requesting the page so that it can retrieve the correct account balance information, and it needs to make sure the user has logged in successfully.

The secret page is responsible for displaying users' secret PINs. It also needs some way of identifying users so that it can look up the correct secret PIN. After all, you certainly don't want the application to divulge secret PINs to the wrong users.

You can isolate two pieces of state information you need to track in this simple application:

  • Whether the user is authenticated: The main page and the secret page shouldn't be available to unauthenticated users. They should have to log in successfully on the login page first.

  • The user tied to the Web client making the request: Both the main page and the secret page need to know which account they should look up for their information.

Because Web servers don't have a memory and don't keep state, you need some way to have the Web application remember this information after users log in successfully. The following sections describe possible solutions.

Client IP Addresses

Web applications can ascertain several details about a client request from the Web server, which they can use to try to identify and track users. The client IP address is one of the few identifying features the client shouldn't be able to spoof or control, so it's sometimes used to maintain state.

In your application, you could use this information by recording clients' source IP addresses when they log in successfully. You could make an entry in a file or database that contains the client's IP address and associated account number and solve both state requirements. If you need to verify whether the user is authenticated in the main page or the secret page, you just check to see whether the client's IP address is in the list of authenticated clients. If it matches, you can pull the associated account from the list and look up the user's details.
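
As a rough sketch of what this might look like in a PHP implementation (check_credentials() and $account are hypothetical, and a flat file stands in for whatever database the real application would use):

<?php
// login.php -- after a successful login, remember the client by its source IP.
if (check_credentials($_POST['username'], $_POST['password'])) {
    file_put_contents('/tmp/authenticated_ips.txt',
        $_SERVER['REMOTE_ADDR'] . ' ' . $account . "\n", FILE_APPEND);
    header('Location: main.php');
    exit;
}

// main.php -- identify the user by source IP alone (this is the flawed part).
$account = null;
foreach (file('/tmp/authenticated_ips.txt') as $line) {
    list($ip, $acct) = explode(' ', trim($line), 2);
    if ($ip === $_SERVER['REMOTE_ADDR']) {
        $account = $acct;
        break;
    }
}
if ($account === null) {
    die('Please log in first.');
}
// ... look up and display the balance for $account ...
?>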

This scheme might work well for your simple site, but you could definitely run into problems. The biggest issue is that if the user is behind a Web proxy, Web cache, or firewall, you get a source IP address that's shared with everyone else at that user's organization or ISP. Therefore, if users went to the main page or secret page at an opportune time, they might be able to retrieve sensitive information from another user's account.

If the client is behind a load-balancing proxy or a firewall device that uses multiple IP addresses for its Network Address Translation (NAT) range, you could also run into the problem of users' IP addresses changing in the middle of their sessions. If this happens, users would experience intermittent failures when trying to use your Web site. Also, if users have logged in from a shared or public machine, a miscreant could come along after users have closed their browsers and go straight to the secret page with a new browser.

All in all, these problems can be major drawbacks. You could certainly try to resolve potential conflicts by recording other facts about clients, such as the User Agent string, but this scheme is a very poor choice in most situations.

Auditing Tip

Tracking state based on client IP addresses is inappropriate in most situations, as the Internet is filled to capacity with corporate clients going through NAT devices and sharing the same source IP. Also, you might face clients with changing source IPs if they come from a large ISP that uses an array of proxies, such as AOL. Finally, there is always the possibility of spoofing attacks that allow IP address impersonation.

There are better ways of tracking state, as you see in the following sections. As a reviewer, you should look out for any kind of state-tracking mechanism that relies solely on client IPs.


Referer Request Header

One of the HTTP request header fields is Referer, which the Web browser uses to tell the server which URL referred the browser to its current request. For example, if you're at the page http://www.aw-bc.com/ and click a link to http://www.neohapsis.com/, your Web browser issues the following request to the www.neohapsis.com server:

GET / HTTP/1.0
Host: www.neohapsis.com
Referer: http://www.aw-bc.com/


Web developers sometimes use the Referer field to try to enforce a certain page flow order by ensuring that users come only from valid pages. However, this method of enforcement is very easy to circumvent.

Say that in your sample application, you track users by IP address as part of your security controls, but you also want to make sure users get to the secret page only by coming from the main page. This way, attackers can't wait for someone else in the organization to log in and then go straight to the secret page. You also decide to add some code to make sure users can get to the main page only by coming from the login or secret page. This approach might seem to prevent pages from giving out PINs and account balances to unauthenticated users. As you might suspect, however, it's fundamentally flawed because the Referer header is a client request parameter, and clients can set it to whatever they like! For example, here's what happens when you enter a request manually with the openssl s_client utility:

test # openssl s_client -connect test.test.com:443
GET /test/secret HTTP/1.0

HTTP/1.1 200 OK
Date: Sat, 21 Aug 2006 09:17:50 GMT
Server: Apache
Accept-Ranges: bytes
X-Powered-By: PHP/4.3.0
Connection: close
Content-Type: text/html; charset=ISO-8859-1

invalid request


You get an "invalid request" message, indicating that you failed the Referer check. Now put the right Referer in there to placate that check:

test # openssl s_client -connect test.test.com:443
GET /test/secret HTTP/1.0
Referer: https://test.test.com/test/main

HTTP/1.1 200 OK
Date: Sat, 21 Aug 2006 09:23:37 GMT
Server: Apache
Accept-Ranges: bytes
X-Powered-By: PHP/4.3.0
Connection: close
Content-Type: text/html; charset=ISO-8859-1

<html>
<head><title>Secret!</title></head>
<body>
<p>The secret PIN is zozopo.</p>
<p>Click <a href="main">here</a> to go back.</p>
</body>
</html>


Oops! The forged Referer header satisfies the check and successfully displays the secret page. So, using a Referer header might buy you a modicum of obscurity, but it doesn't do much to provide any real security.
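
For reference, the kind of check being defeated here is trivial to write. A hypothetical PHP version on the secret page might look like this:

<?php
// secret.php -- naive page-flow enforcement based on the Referer header.
// This is exactly the kind of check the forged request above walks through.
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
if ($referer !== 'https://test.test.com/test/main') {
    die('invalid request');
}
// ... otherwise look up and display the secret PIN ...
?>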

Note

The Referer field does have some security value for preventing cross-site reference forgery (XSRF) attacks. Jesse Burns of Information Security Partners published an excellent paper on this attack type, available at www.isecpartners.com/documents/XSRF_Paper.pdf.


Embedding State in HTML and URLs

The essential trick to maintaining state in HTTP is feeding information to the client that you expect the client to include in every request. This way, the client provides all the information you need to process the request, or it provides a piece of information you can use to retrieve the other needed information from a separate source.

In the sample application, if you can come up with a way to always have clients provide the information the server needs to process requests, you have a solution that meets your needs for state tracking.

In the main and secret pages, you need to know that clients have logged in successfully, and you need to know who clients are so that you can retrieve their account information. First, examine the second half of the problem: identifying users. If you could have clients send usernames along with every request to the main and secret pages, you could determine who the users are and pull the correct information.

Because you control every link to the main and secret pages, and every link is in HTML written by the Web application code, you can simply have every link contain a parameter that identifies users. For this method to work, you can't miss any path to the main or secret pages, or the username isn't sent and the page can't process the results. You can pass this information in a few ways, but the most popular methods are hidden fields in HTML forms and query strings.

HTML forms enable you to have hidden fields, which are variables set in the form but not visible to users in their Web browsers. In a form where you want to add a hidden username, you just need to add a line like this:

<input type="hidden" name="username" value="jimbo">


Hidden fields work well for forms, but this application mainly uses hyperlinks to get from one page to the next. You could rewrite the application to use forms, or you could pass along the state information as part of a query string (or path information). For example, in the main page, instead of printing this line:

<p>Click <a href="secret">here</a> to see your secret PIN.</p>


You could print this line:

<p>Click <a href="secret?username=jimbo">here</a> to see your secret PIN.</p>


If you rewrite the application to pass the username along with every request, the application would certainly be functional. However, it wouldn't be secure because attackers could just go straight to the main or secret page and provide the name of the person whose account they wanted to view.
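
A sketch of the vulnerable pattern, assuming a PHP secret page and a hypothetical lookup_pin() helper:

<?php
// secret.php -- trusts whatever username arrives in the query string, so an
// attacker can simply request secret?username=jimbo (or any other account).
$username = isset($_GET['username']) ? $_GET['username'] : '';
$pin = lookup_pin($username);
echo '<p>The secret PIN is ' . htmlspecialchars($pin) . '.</p>';
?>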

Auditing Tip

Although this sample application might seem very contrived, it is actually representative of flaws that are quite pervasive throughout modern Web applications. You want to look for two patterns when reviewing Web applications:

  1. The Web application takes a piece of input from the user, validates it, and then writes it to an HTML page so that the input is sent to the next page. Web developers often forget to validate the piece of information in the next page, as they don't expect users to change it between requests. For example, say a Web page takes an account number from the user and validates it as belonging to that user. It then writes this account number as a parameter to a balance inquiry link the user can click. If the balance inquiry page doesn't do the same validation of the account number, the user can just change it and retrieve account information for other users.

  2. The Web application puts a piece of information on an HTML page that isn't visible to users. This information is provided to help the Web server perform the next stage of processing, but the developer doesn't consider the consequences of users modifying the data. For example, say a Web page receives a user's customer service complaint and creates a form that mails the information to the company's help desk when the user clicks Submit. If the application places e-mail addresses in the form to tell the mailing script where to send the e-mail, users could change the e-mail addresses and appear to be sending e-mail from official company servers.


To secure this system, you need to pass something with all requests that attackers would have a hard time guessing or faking. You could definitely improve on this system until you have a workable solution. For example, you could generate a large random number at login and store it in a database somewhere. To fake logged-in status, attackers would have to guess that random number, which could be difficult. For now, however, take a brief look at HTTP authentication in the next section.
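
As a rough sketch of that improvement (check_credentials() and save_token() are hypothetical helpers, and random_bytes() assumes PHP 7 or later; an older application would need another cryptographically strong random source), the login page might do something like this:

<?php
// login.php -- on success, issue a large random token and record it server side.
if (check_credentials($_POST['username'], $_POST['password'])) {
    $token = bin2hex(random_bytes(32));           // 256 random bits, infeasible to guess
    save_token($token, $_POST['username']);       // e.g., a row in a database
    header('Location: main.php?token=' . $token); // the token must accompany every request
    exit;
}
?>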

HTTP Authentication

HTTP has built-in support for authenticating users through a generic challenge/response mechanism. Many enterprise sites don't use this protocol support; instead, they opt to implement their own authentication schemes or, more often, use an authentication framework provided by their infrastructure/middleware components. However, you still encounter HTTP authentication in real-world applications and Web sites, although it's more often used to protect secondary content, such as administrative interfaces, or for less enterprise-oriented sites, such as Web forums.

The most widely supported authentication scheme is Basic Authentication. Basically, a username and password are collected from the user and base64-encoded. The base64 string is sent over the network to the server, which decodes it and compares it with its authentication database. This scheme has myriad security vulnerabilities, with the most significant problem being that the username and password are effectively sent over the network in clear text. Therefore, this method can be quite risky for authentication over clear-text HTTP. Its security properties are an order of magnitude better when it's used over SSL, but it's still recommended with trepidation. If the browser is somehow tricked into authenticating with cached credentials over a clear-text connection, the user's password could be seized.
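
To make the exposure concrete, this is roughly what the header looks like on the wire for the sample application's user jimbo (the password is a made-up value):

Authorization: Basic amltYm86cGFzc3dvcmQ=

Base64 is an encoding, not encryption; anyone who captures this header can decode it instantly and recover jimbo:password.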

The other authentication scheme specified in the HTTP RFCs is Digest Authentication, a challenge/response authentication protocol. The level of security it provides, however, depends quite a bit on the version and options used. The original pre-HTTP/1.1 specification of Digest Authentication was designed so that the HTTP server is still completely stateless. Therefore, the HTTP server isn't required to remember challenges it presents to the client, and the protocol is susceptible to considerable replay attacks. The HTTP/1.1 specifications include the option of stateful tracking of challenges issued by the server, which eliminates the straightforward replay attacks. When either version is used over SSL, its security properties are arguably quite good. However, Digest Authentication is not supported on all platforms, and it also requires that passwords be stored in plaintext at the server. As such, Digest Authentication is not commonly seen in Web applications.
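
For reference, the original (pre-qop) digest response is just a pair of MD5 hashes over the credentials, the server-supplied nonce, and the request line. A rough PHP sketch with made-up values follows:

<?php
// Digest response as specified in the original (RFC 2069) scheme, with no qop.
// All of these values are illustrative.
$username = 'jim'; $realm = 'HappyTown'; $password = 'password';
$nonce    = 'dcd98b7102dd2f0e8b11d0f600bfb0c0';   // taken from the server's challenge
$method   = 'GET'; $uri = '/test/secret';

$ha1      = md5("$username:$realm:$password");
$ha2      = md5("$method:$uri");
$response = md5("$ha1:$nonce:$ha2");
// The client sends $response in its Authorization: Digest header. The password
// never crosses the wire directly, but the server must be able to reproduce HA1,
// which is why it needs the plaintext password (or a password-equivalent hash).
?>

Because a completely stateless server doesn't remember which nonces it issued, an attacker who captures a valid response can simply replay it, which is the weakness the stateful HTTP/1.1 options address.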

There are also proprietary authentication schemes implemented over HTTP, particularly for Microsoft technologies. For example, IIS supports Integrated Windows Authentication, which uses Kerberos or Windows NT Lan Manager (NTLM) for authentication but works only over SSL connections. There's also the possibility of .NET Passport authentication support, which ties into Microsoft's global Passport service.

Auditing Tip

Weaknesses in the HTTP authentication protocol can prove useful for attackers. It's a fairly light protocol, so it is possible to perform brute-force login attempts at a rapid pace. HTTP authentication mechanisms often don't do account lockouts, especially when they are authenticating against flat files or local stores maintained by the Web server. In addition, certain accounts are exempt from lockout and can be brute-forced through exposed authentication interfaces. For example, NT's administrator account is immune from lockout, so an exposed Integrated Windows Authentication service could be leveraged to launch a high-speed password guessing attack.

You can find several tools on the Internet to help you launch a brute-force attack against HTTP authentication. Check the tools sections at www.securityfocus.com and www.packetstormsecurity.org.


To enable HTTP-supported authentication, you must configure your Web server to protect certain content in your Web tree. When a Web browser attempts to request protected content for the first time, the server returns a 401 message, which indicates the access request was unauthorized. This 401 response includes a WWW-Authenticate header field that informs the client which authentication methods are supported. This header field also contains challenges for any supported authentication mechanisms that use a challenge/response protocol.

The Web browser then presents the user with an authentication dialog. It resubmits the original request to the Web server, but this time it includes an Authorization header containing a response appropriate for the selected authentication method. If the authentication information is invalid, the server again responds with a 401 message, and the WWW-Authenticate header field has new challenges. The behavior that makes this system come together is that if a browser is successfully authenticated to a protected resource, it continues to send the Authorization header with every subsequent request to that resource and anything below that resource in the Web hierarchy.

Note that the server is still stateless, and the client Web browser is what makes the user experience seem fluid. The server always responds to an incorrect or missing Authorization header with a 401 message. It's up to the client to attempt to provide a correct Authorization header by querying the user and retrying the request. If the client does authenticate successfully, protected dynamic applications are able to retrieve the username from the Web server, which they can use for tracking state if necessary.
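
An illustrative Basic authentication exchange (host, realm, and credentials are made up to match the configuration shown next) looks something like this:

GET /test/secret HTTP/1.0
Host: test.test.com

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="HappyTown"

GET /test/secret HTTP/1.0
Host: test.test.com
Authorization: Basic amltOnBhc3N3b3Jk

HTTP/1.1 200 OK
Content-Type: text/html

After the successful response, the browser keeps attaching the same Authorization header to every request for /test/secret and anything beneath it.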

If you want to modify the sample application so that it's protected by HTTP authentication, first you need to configure the Web server to guard the application's Web pages. For example, with Apache, you place an .htaccess file in the same directory as the Web application code:

AuthUserFile /scan/apache/htdocs/text/.htpasswd
AuthGroupFile /dev/null
AuthName HappyTown
AuthType Basic

<Limit GET POST>
require user jim
</Limit>


You should get rid of the login page, as the Web server and Web browser would work together to manage collection of usernames and passwords and perform authentication. You could simply rewrite the main and secret pages so that they check for the server variable REMOTE_USER, which is set to the client's username if the client authenticates successfully.
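
A minimal sketch of that check in PHP (lookup_balance() is a hypothetical helper):

<?php
// main.php -- rely on the Web server's HTTP authentication for identity.
if (empty($_SERVER['REMOTE_USER'])) {
    header('HTTP/1.0 403 Forbidden');
    die('This page must be reached through an authenticated request.');
}
$username = $_SERVER['REMOTE_USER'];
echo '<p>Your balance is ' . htmlspecialchars(lookup_balance($username)) . '.</p>';
?>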

Auditing Hidden Fields

In the early days of Web development, authentication was usually handled by HTTP and the Web server, and state maintenance was primarily done through hidden form fields and query string parameters. Many programmers who are developing today's n-tier distributed enterprise Web applications are the same developers who were cranking out Perl and CGI Web applications back then. In many large Web applications, you can find an occasional throwback to the simpler days of Web coding, probably in places where the developer felt rushed or didn't have time to go back and refactor the code.

A reasonable rule of thumb these days is that state maintenance done with hidden form fields is appropriate only for information that's temporarily collected before it's validated and processed. For example, if a survey requires users to fill out three pages of forms, you might expect to see values from the first page as hidden parameters on the second and third pages.

As a code reviewer, you should watch for data that's propagated via hidden fields after it has been validated or data that's placed into hidden fields to facilitate the Web server's future processing. In both cases, developers often don't consider the impact of users changing the data after the initial submission.


Cookies

Cookies are a generic HTTP mechanism for storing small pieces of information on a client's Web browser. After you store a cookie on a Web browser, every subsequent request the browser makes to your Web application includes that cookie. Therefore, cookies are ideal for tracking clients and maintaining state across requests. Most enterprise Web applications and Web-oriented programming frameworks build state management entirely around cookies.

To set a cookie, the Web application instructs the Web server to send an HTTP response header named Set-Cookie. It looks like this:

Set-Cookie: NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN; secure


The first part of the Set-Cookie header is the actual content of the cookie, which consists of a single cookie name and a single cookie value. They are encoded with the same style of hexadecimal encoding used for GET and POST parameters. If you want to set multiple variables, you actually set multiple cookies instead of using something like the & character. All relevant cookies are sent to the Web server, as explained later in this section.

The expires tag lets the server specify an expiration date/time for the cookie. After the specified time, the browser stops sending the cookie and deletes it. This tag is optional. A cookie with the expires tag is known as a persistent cookie, and a cookie without the tag is a nonpersistent cookie. Nonpersistent cookies are temporary in nature; they exist only in the browser's memory and are discarded when the browser is closed. Persistent cookies have more permanence, as they are stored on the client's file system by the Web browser and persist when the browser is closed.

The path and domain tags help the browser know when to send the cookie. Every time a browser makes a Web request, it searches through its list of cookies to see whether any of them need to be sent. First, it checks the domain name of the Web server against the domains specified in its list of cookies. This check is a substring search based on the tail of the domain name, so a cookie set with a domain of .test.com is sent to the servers www.test.com, www2.test.com, and this.is.a.test.com, for example.

If the browser finds any cookies matching the specified domain, it then checks the path parameter. The path of the Web request is checked against the path specified when the cookie was set. This check is also a substring search, but it works from the head of the path. So a path=/ tag in the Set-Cookie header causes the cookie to match every request, as every Web request starts with a / character. A tag such as path=/test causes the cookie to be sent to every Web request starting with /test, such as /test/, /test/index.html, or /test/test2/test.php.

Cookies can also be marked secure or nonsecure with the optional secure tag. A secure cookie is sent only over HTTPS, whereas a nonsecure cookie is sent over both HTTP and HTTPS.
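
Putting the attributes together, a hypothetical cookie set with the following header would be sent only over HTTPS, only to hosts ending in .test.com, and only with requests whose paths start with /test:

Set-Cookie: account=12345; domain=.test.com; path=/test; secure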

For each Web request, the browser selects all cookies that seem appropriate by evaluating the Web request against the domain and path attributes of the cookies in its internal store. It then concatenates all matching cookies into a single request header field, which looks like this:

Cookie: NAME1=VALUE1; NAME2=VALUE2; NAME3=VALUE3


In your sample Web application, you could make use of cookies to handle tracking user state. To do this, you add code to set a cookie if the user logs in successfully, and then you add code to check for the cookie and pull the username in the main and secret pages. If you compare this approach to the solution of rewriting every page request to contain a hidden field, you can see that the cookie solution is much simpler and saves you a lot of trouble. Now imagine a typical Web site with at least 30 different pages and a few hundred potential page traversals, and you can see that the cookie approach is an order of magnitude simpler than other state-tracking schemes.
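
A rough PHP sketch of that cookie approach follows (check_credentials() is a hypothetical helper). Note that because the cookie still just names the user, it can be forged as easily as the hidden field, which is why the session tokens discussed later are the preferred fix:

<?php
// login.php -- on success, hand the browser a cookie identifying the user.
if (check_credentials($_POST['username'], $_POST['password'])) {
    setcookie('username', $_POST['username'], 0, '/');   // nonpersistent, whole site
    header('Location: main.php');
    exit;
}

// main.php -- the browser sends the cookie back with every request automatically.
if (!isset($_COOKIE['username'])) {
    die('Please log in first.');
}
$username = $_COOKIE['username'];
// ... look up and display the balance for $username ...
?>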

Auditing Tip

When you review a Web site, you should pay attention to how it uses cookies. They can be easy to ignore because they are in the HTTP request and response headers, not in the HTML (usually), but they should be reviewed with the same intensity you devote to GET and POST parameters.

You can get access to cookies with certain browser extensions or by using an intercepting Web proxy tool, such as Paros (www.parosproxy.org) or SPIKE Proxy (www.immunitysec.com). Make sure cookies are marked secure for sites that use SSL. This helps mitigate the risk of the cookie ever being transmitted in clear text because of deliberate attacks, such as cross-site scripting, or unintentional configuration and programming mistakes and browser bugs.


Sessions

You have surveyed all the technology building blocks a Web application can use to track state. You can pay attention to inherent attributes of the HTTP request, such as the client IP address or the Referer header. You can embed information the application needs in dynamically created HTML, in hidden form fields, or in URIs by using path information and query strings. You can rely on HTTP authentication mechanisms to have the Web server determine who the authenticated user is for every request. Finally, you can use cookies to store information on the Web browser that is transmitted by the browser with every subsequent request.

In the early days of dynamic Web programming, Web developers created a useful abstraction for tracking state known as a session. A session is basically a data structure that serves as a container for data associated with a Web client. Sessions are data stores that are maintained on the server in memory, on disk, in a database, or as component objects in an application server. A Web application stores data and objects in a session and retrieves them later through a simple API.

The session is tied to a user through the use of a session token, a unique identifier the server can use as a key for accessing the session data structure. Session tokens are usually large random numbers created for users when they log in or make their first request to the Web site. Ideally, this token should be known only by the client, making it a secure mechanism for uniquely identifying a user.

The session system is supported by using one of the state-tracking mechanisms you examined earlier. The only information users need to send with every request is the session token, so it works well with multiple schemes. The most common implementation, however, is with cookies. When a user accesses a site, the Web server creates a session and sets a cookie containing the session token. Every subsequent request from that user includes the cookie containing the session token. Even though cookies are the most popular mechanism for session identification, session tokens may be passed in hidden form fields, in query string parameters, or in rare cases, as URI path components.

The beauty of the session abstraction is that after a session is established, the Web application code has a universal and simple mechanism for associating data with a specific user. Sessions are typically used in two different ways. First, they are used as a secure mechanism for storing state information that's globally useful to all pages in a Web application. For example, in your sample Web site, the login page could store the username of the user in the session after a successful login. The main and secret pages then only need to check the session to see whether that username has been set. There's no way a remote user could alter the session and add or change the username unless a vulnerability existed in the session management code or the Web application. In general, the session can be used as a safe place to store information you don't want the client to have direct access to.

Second, sessions are used to temporarily store information, in much the same way developers use hidden form fields. One page might take data from the user and validate it, and then instead of writing it to the HTML as hidden fields, the page stores it in the session. That way, developers could be sure the user couldn't tamper with the session contents, and the data in the session could be trusted for use in a subsequent page.

Sessions are usually provided by a Web framework or Web-oriented language, although they can be implemented by application developers. The details vary across different frameworks, but sessions are often created automatically the first time a client connects to a Web site. Languages such as PHP and frameworks such as ASP automatically include session support that's backed by cookies.
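
As a minimal sketch (assuming PHP's built-in, cookie-backed sessions and the same hypothetical check_credentials() and lookup_balance() helpers as before), the sample application's login and main pages might look roughly like this:

<?php
// login.php
session_start();                                  // creates or resumes the session
if (check_credentials($_POST['username'], $_POST['password'])) {
    $_SESSION['username'] = $_POST['username'];   // stored server side, keyed by the token
    header('Location: main.php');
    exit;
}

// main.php
session_start();
if (!isset($_SESSION['username'])) {
    die('Please log in first.');
}
echo '<p>Your balance is ' .
     htmlspecialchars(lookup_balance($_SESSION['username'])) . '.</p>';
?>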

Note

Sessions are an important component of Web applications. You learn how to review them from a security perspective in "Problem Areas" later in this chapter.




