Several security issues are common in most Web applications because of inherent characteristics of HTTP and the Web environment. The following sections cover some general concerns you should be cognizant of when auditing Web code.

Client Visibility

Keep in mind that all data provided to the client is in a single trust domain, meaning users have total visibility into the client side of the Web application. Attackers can easily view the generated HTML for each transaction as well as other contents of all HTTP transactions, which leads to the following security consequences:
Auditing Tip: Examine all exposed static HTML and the contents of dynamically generated HTML to make sure nothing that could facilitate an attack is exposed unnecessarily. You should do your best to ensure that information isn't exposed unnecessarily, but at the same time, look out for security mechanisms that rely on obscurity because they are prone to fail in the Web environment.

Client Control

At any point, client users can construct completely arbitrary requests as they see fit, providing any combination of parameters, cookies, and request headers. Constructing these requests isn't hard and can be done by unsophisticated attackers with tools as simple as a text editor and a Web browser. In addition, several programs act as Web proxies and allow users to intercept and modify requests while they are in transit, making this easy task even simpler.

The impact of this flexibility is that the server-side processing must be robust and capable of handling every possible combination and permutation of potential inputs. Variables can effectively contain anything or even be missing, and page requests can come in any order. Web application developers can't rely on the integrity of any client-supplied information. Keep the following points in mind:
Auditing Tip: Look at each page of a Web application as though it exists in a vacuum. Consider every possible combination of inputs, and look for ways to create a situation the developer didn't intend. Determine whether any of these unanticipated situations cause a page to use the input without first validating it.

Page Flow

A page flow is the progression through Web pages that a user makes when interacting with a Web application. For example, in a Web application that allows you to transfer money from one account to another, the page flow might look something like Figure 17-5.

Figure 17-5. Simple page flow

A user would first browse to the transfer_start.php page, then select the source and destination accounts, enter the amount of money to transfer, and click Transfer Money. This takes the user to transfer_confirm.php, which provides an opportunity to review the decision, and then click to confirm the transfer. This would then take the user to the dotransfer.php page, which would actually perform the money transfer and display the transaction reference numbers.

A common mistake in Web applications is to assume that attackers will request pages in a certain order. Because the client controls all requests it makes, it's entirely possible for the client to perform actions out of sequence. In some situations, this out-of-order sequence can allow attackers to bypass certain security measures and potentially exploit a system. For example, in the preceding page flow, the transfer_confirm.php page is responsible for validating that the source account entered in the transfer_start.php page actually belongs to the user. If an attacker goes straight to the dotransfer.php page, it's possible to bypass this check and potentially transfer money from an account the attacker isn't authorized to use. If the attacker did things only in the order developers intended, this couldn't happen because the transfer_confirm.php page would block the attack.
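One way to picture the fix for the transfer flow described above is a final handler that repeats every check itself instead of assuming the confirmation page already ran. The following is a minimal sketch under stated assumptions: the ACCOUNT_OWNERS table, owns_account(), and handle_dotransfer() are hypothetical names invented for illustration, and a real application would consult its own account store.

```php
<?php
// Hypothetical ownership table: account number => owning username.
// A real application would query its account database instead.
const ACCOUNT_OWNERS = [
    '1001' => 'alice',
    '1002' => 'alice',
    '2001' => 'bob',
];

// Returns true only if $account exists and belongs to $user.
function owns_account(string $user, string $account): bool
{
    return array_key_exists($account, ACCOUNT_OWNERS)
        && ACCOUNT_OWNERS[$account] === $user;
}

// Final step of the flow (the role played by dotransfer.php). It validates
// the request on its own, no matter which pages were visited beforehand.
function handle_dotransfer(string $user, array $post): string
{
    $src    = $post['source_account'] ?? '';
    $dst    = $post['destination_account'] ?? '';
    $amount = $post['amount'] ?? '';

    // Parameters can be missing or arrive as arrays; accept only strings.
    if (!is_string($src) || !is_string($dst) || !is_string($amount)) {
        return 'error: malformed request';
    }
    if (!ctype_digit($amount) || (int)$amount <= 0) {
        return 'error: invalid amount';
    }
    if (!owns_account($user, $src)) {
        return 'error: source account not authorized';
    }
    if (!array_key_exists($dst, ACCOUNT_OWNERS)) {
        return 'error: unknown destination account';
    }
    return "ok: transferred $amount from $src to $dst";
}
```

With this shape, a request sent straight to the last step is subject to the same ownership check as one that arrived through the intended sequence, so skipping the confirmation page gains the attacker nothing.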
Another page-flow-related vulnerability can occur if an application makes an assumption about a variable or an object that a user doesn't have direct access to. For example, say an application places a user's account number in the session after a successful login. All future pages in the application implicitly trust the account number's validity and use it to retrieve user information. There should be no possible way that normal use of the site through normal page flow could lead to a bad number getting in the session. However, if attackers can find a page they could call out of sequence, they could change this number in the session. Then they could potentially circumvent security controls and access other customer accounts. Note that this out-of-sequence page need change an account number for only a brief window of time, as attackers could use a second browser or second client with the same session to try to exploit the window.

For another example of a page flow problem, say you have a page that only certain types of users are allowed to use. This page performs an authorization check that users must pass. It also makes use of a subsequent page that does more processing but doesn't contain the authorization check. Attackers who wouldn't be allowed to go to the first page could go straight to the second page and perform the unauthorized action.

Auditing Tip: Always consider what can happen if attackers visit the pages of a Web application in an order the developer didn't intend. Can you bypass certain security checks by skipping past intermediate verification pages to the functionality that actually performs the processing? Can you take advantage of any race conditions or cause unanticipated results by visiting pages that use session data out of order? Does any page trust the validity of information that users control?

Sessions

As discussed previously, sessions are collections of data stored on the server and tied to a particular user.
They are typically created when users log in and then destroyed when users finish using the application. The following sections discuss some issues related to sessions.

Session Use

During a review, you should try to find every location where each session variable is manipulated. For every security-related session variable, try to brainstorm a technique for bypassing its associated security controls and checks.

One thing to look for is inconsistent security checks. If a particular session variable is set in several places, you should ensure that each one does the same validation before manipulating the session. If one location is more permissive than others, you might be able to use that to your advantage when constructing an attack.

You should also look for different places in the same Web application that use a session variable for different purposes. For example, the following PHP code is used to display details of an account:

    # display.php
    if ($_POST["action"]=="display")
    {
        display_account($_SESSION["account"]);
    }
    else if ($_POST["action"]=="select")
    {
        if (is_my_account($_POST["account"]))
        {
            $_SESSION["account"]=$_POST["account"];
            display_menu();
        }
        else
            display_error();
    }

First, the user goes to a page to select which account to view. If the user selects a valid account, the account variable in the session is set to reflect that valid account, and the user is presented a menu page with the option of displaying more information on that account. If the user selects an invalid account, an error page is returned, and the session isn't updated. Looking at this page in a vacuum, there's no way to get another user's account number into the session variable account so that you can display that user's account information.
However, this excerpt from the same application does present an opportunity for mischief:

    # transfer.php
    if ($_POST["action"]=="start_transfer")
    {
        $_SESSION["account"]=$_POST["destination_account"];
        $_SESSION["account2"]=$_POST["source_account"];
        $_SESSION["amount"]=$_POST["amount"];
        display_confirm_page();
    }
    else if ($_POST["action"]=="confirm_transfer")
    {
        $dst = $_SESSION["account"];
        $src = $_SESSION["account2"];
        $amount = $_SESSION["amount"];
        if (valid_transfer($src, $dst, $amount))
            do_transfer($src, $dst, $amount);
        else
            display_error_page();
    }

This code is from a page created for handling transfers from one account to another, and it also makes use of the session. When the user elects to start a transaction, the preceding code stores the destination account, the source account, and the amount of the transfer in the session. It then displays a confirmation page that summarizes the transaction the user is about to attempt. If the user agrees to the transaction, the values are pulled out of the session and then validated. If they are legitimate values, the transfer is carried out.

The security vulnerability is that both pages make use of the session variable account, but they use it for different purposes, and different security controls surround each use. If an attacker goes to transfer.php first and specifies an action of start_transfer and the account number of a victim in the POST parameter destination_account, the session variable account contains that victim's account number. The attacker could then go to display.php and submit an action of display, and the display.php code would trust the session variable account and display the details of the victim's account to the attacker.

Another problem to look out for is inconsistent error behavior. If an application places a value in a session, and then fails because of an error condition, the value might still be left in the session and could be used through other Web requests.
For example, say the code for display.php looks like this:

    # display.php
    if ($_POST["action"]=="display")
    {
        display_account($_SESSION["account"]);
    }
    else if ($_POST["action"]=="select")
    {
        $_SESSION["account"]=$_POST["account"];
        if (is_my_account($_POST["account"]))
            display_menu();
        else
            display_error();
    }

The developer made the mistake of updating the session variable account even if the account doesn't belong to the user. The Web site displays an error message indicating that the account isn't valid, but if an attacker proceeds to submit an action of display to the same page, the response will return the details of the victim's account.

Note: Study each session variable, and determine where it's manipulated and the security checks for each of its manipulations. Try to brainstorm a way to evade security checks and get your own values in the session variable at a useful time.

Session handling vulnerabilities also occur when an attacker can supply a valid session ID to a victim, granting access to the victim's session. This is known as a session fixation attack, and it relies on an implementation that does not issue a new session key after a successful login. An attacker can exploit this vulnerability by sending the victim a link with the session ID embedded in the URL, as shown:

    http://test.com/login?sessionid=A1C472BFF2340B10237E18D38602C346

Clicking through this link will bring the victim to a login screen. If the session code accepts the embedded key, the victim will log in with a session key already known to the attacker. Some session implementations don't accept a key that was not supplied by the server, so the attacker may first need to obtain a key by browsing to the site.

Session Management

As a security reviewer, seeing in-house code handling session management should give you pause. Robust session management has many facets that are very difficult to implement securely. You should budget extra time to review any custom session code.
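Issuing a new session key after a successful login, the defense against the session fixation attack described above, is one of the facets a custom implementation has to get right. The following is a minimal sketch of the idea using an in-memory model; the SessionStore class and its methods are hypothetical and stand in for whatever storage the application really uses. With PHP's built-in sessions, the comparable step is calling session_regenerate_id(true) at login.

```php
<?php
// In-memory stand-in for a server-side session store, keyed by session ID.
// Illustrative only; not a drop-in replacement for real session handling.
final class SessionStore
{
    private array $sessions = [];

    // Create an empty session under a fresh, unpredictable ID.
    public function create(): string
    {
        $id = bin2hex(random_bytes(16));
        $this->sessions[$id] = [];
        return $id;
    }

    public function exists(string $id): bool
    {
        return array_key_exists($id, $this->sessions);
    }

    public function get(string $id, string $key)
    {
        return $this->sessions[$id][$key] ?? null;
    }

    // On successful login: carry over data only from IDs the server issued,
    // destroy the presented ID, and continue under a brand-new ID.
    public function login(string $presentedId, string $user): string
    {
        $data = $this->exists($presentedId) ? $this->sessions[$presentedId] : [];
        unset($this->sessions[$presentedId]);

        $newId = $this->create();
        $data['user'] = $user;
        $this->sessions[$newId] = $data;
        return $newId;
    }
}
```

Because the authenticated session never lives under the ID that was presented before login, a key the attacker planted in a link stops being useful at the moment the victim authenticates.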
When you're assessing a custom session implementation, ask questions such as the following:
Session Tokens

As discussed previously, many applications and Web frameworks use a session token to track state and uniquely identify a session. In a good implementation, these tokens are securely generated, long random numbers that prove effectively impossible to predict or reuse after expiration. If session tokens aren't generated by using a solid random number algorithm with enough entropy, the entire site's security can be jeopardized.

The simplest, and least secure, scheme for generating session tokens is having a global session token and incrementing it each time a new session is created. With the proliferation of frameworks and languages that handle sessions, using incremental session tokens isn't common now, but they are used occasionally in custom session implementations. The impact is usually severe. If you log in to a site and are assigned the session token X, you know the next user to log in gets the session token X+1. You can then wait around a bit and hijack the next user's session after authentication by submitting the predicted next session token. Code auditors can easily recognize this scheme by observing the source code or monitoring the session tokens the Web site produces.

People have come up with a vast number of schemes to generate session tokens. The worst schemes, and the ones to watch for, use easily recognizable and easily predictable information to form the token. If a site uses an e-mail address and a username, or an IP address and a username, as the session token, after you've observed your own token, you're in a good position to start guessing other users' tokens. For example, you could easily brute-force a session token based on concatenating the time of day in seconds and the user's account number. Attackers could try tens of thousands of accounts while probing for a time period during which the site is normally under heavy traffic and has many active users.
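To get a feel for how small that search space is, the time-plus-account scheme can be modeled in a few lines. This is a sketch under assumptions: the exact token layout (five digits of seconds since midnight followed by a six-digit account number) and the function names are invented for illustration, and the $isValid callback stands in for submitting a candidate token to the site.

```php
<?php
// Weak scheme from the text: seconds since midnight followed by the
// account number. The zero-padded layout is an assumption.
function weak_token(int $secondsOfDay, int $account): string
{
    return sprintf('%05d%06d', $secondsOfDay, $account);
}

// Enumerate every candidate token for a time window and an account range.
// $isValid is a placeholder for "send the token and see whether it works".
function search_tokens(
    int $fromSec,
    int $toSec,
    int $firstAcct,
    int $lastAcct,
    callable $isValid
): array {
    $hits = [];
    $tried = 0;
    for ($t = $fromSec; $t <= $toSec; $t++) {
        for ($a = $firstAcct; $a <= $lastAcct; $a++) {
            $tried++;
            $candidate = weak_token($t, $a);
            if ($isValid($candidate)) {
                $hits[] = $candidate;
            }
        }
    }
    return ['tried' => $tried, 'hits' => $hits];
}
```

Covering a one-minute login window across 1,000 account numbers takes only 60 × 1,000 = 60,000 candidates, which is a trivial number of requests for an automated client.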
Keep in mind that attackers can usually brute-force potential session tokens at extremely high speeds because of the stateless nature of HTTP. Also, attackers might be content with getting access to any session at all, not just a particular user they're targeting. A given scheme might make it hard for attackers to access a particular victim's account, but to be safe, the scheme needs to make it difficult for attackers to access any account with a broad-based attack that simply looks for the first success. If you have the time and resources, try to launch one of these attacks yourself by creating small testing scripts that search for valid tokens in a tight loop.

Ideally, the session token needs to have a component that's random, unique, and unpredictable. This random component also needs to be large enough that attackers can't simply try a high percentage of the possible combinations in a reasonable amount of time.

This random component of the session token should be difficult to predict. The linear congruential generator (LCG) random number generators in most general-purpose programming libraries aren't appropriate for this purpose. For example, the numbers generated by the rand() family of functions in a typical UNIX standard library and the java.util.Random class can be predicted easily, as they use the last result of the random operation as the seed for the next random operation.

You might see systems that use sources of data that aren't secure but do transformations on it so that ascertaining how tokens are constructed would be difficult. For example, take a system that uses the time of day concatenated with the user's account number and a random number from an LCG, but MD5 hashes the whole string. You would have a hard time figuring out how to brute-force those session tokens from a black-box perspective, but it's not impossible. Attackers with enough patience and intuition could probably figure this scheme out eventually.
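The hashed scheme just described can be contrasted with a token drawn directly from a cryptographic random source. The sketch below is illustrative only: the ":" separator, the function names, and the small range assumed for the LCG output are all assumptions, chosen to show that hashing predictable inputs adds no entropy once the recipe is guessed.

```php
<?php
// Obscured scheme from the text: MD5 over time of day, account number, and
// a value from a predictable generator. $lcgValue is passed in so the
// sketch stays deterministic.
function obscured_token(int $secondsOfDay, int $account, int $lcgValue): string
{
    return md5($secondsOfDay . ':' . $account . ':' . $lcgValue);
}

// Offline search: with the recipe known, the attacker simply hashes every
// plausible input. Returns the inputs that reproduce $observed, or null.
function recover_inputs(
    string $observed,
    array $seconds,
    array $accounts,
    array $lcgValues
): ?array {
    foreach ($seconds as $t) {
        foreach ($accounts as $a) {
            foreach ($lcgValues as $r) {
                if (obscured_token($t, $a, $r) === $observed) {
                    return [$t, $a, $r];
                }
            }
        }
    }
    return null;
}

// Alternative: 128 bits from the operating system's CSPRNG, hex encoded.
function random_token(): string
{
    return bin2hex(random_bytes(16));
}
```

The first two functions show why the MD5 step only hides the structure from a casual observer. The last one has no structure to recover, because its output depends on nothing an attacker can observe or enumerate.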
Ultimately, although these schemes might be reasonably secure against external attackers, they aren't worth the potential risk of the obscurity being breached, especially when making the system demonstrably secure is simple. If a system is based on a cryptographic algorithm that requires a seed or key, you should evaluate the possibility of an attacker performing an offline attack and discovering the seed or key. For example, if the system generates a secure hash of the time of day combined with a global sequence number for each user, that's a weak seed that can be brute-forced. Even with limited inside knowledge, an offline search could be performed until the attacker figured out the algorithm for constructing the seed. This issue is explored more in Chapter 18, but for the Web environment, you should keep the following points in mind:
Note: Try to determine how session tokens are generated, and attempt to make sure that predicting or guessing a future session token is difficult. If you have the time and resources, it can be worth reverse-engineering or auditing any infrastructure component that handles sessions on behalf of the application, as they aren't always as secure as the developers would hope.

Session Token Transmission

Another session security concern is secure transmission of the session token. Watch for these issues when you're auditing a Web application:
Authentication

Keep the following areas of inquiry in mind while examining a Web application's authentication mechanisms:
Auditing Tip: First, focus on content that's available without any kind of authentication because this code is most exposed to Internet-based attackers. Then study the authentication system in depth, looking for any kind of issue that lets you access content without valid credentials.

Authorization and Access Control

Authorization refers to the application components responsible for ensuring that authenticated users have access to only resources and actions to which they're entitled. To assess a system's authorization implementation, you want to determine which privilege levels the system defines and what the possible user roles are. Then you want to figure out what resources each privilege level can access and make sure everything is consistent. Mentally assume the role of each type of user, and then study the code and the available content to determine which resources you can access and whether your access is appropriate.

Authorization can be performed in a centralized fashion, with all Web components sharing code that performs permission checks. It can also be decentralized, with each request handler being responsible for making sure the user is authorized to proceed. In either style, it's rare for authorization to be applied consistently in every situation, as it takes just one oversight, such as the following points, to miss something:
Auditing Tip: When reviewing authorization, you need to ensure that it's enforced consistently throughout the application. Do this by enumerating all privilege levels, user roles, and privileges in use.

Encryption and SSL/TLS

SSL has been mentioned previously in this book, and this section offers a brief recap. Secure Sockets Layer/Transport Layer Security (SSL/TLS) is an application-layer protocol for securing communications between two endpoints over a socket connection. It uses certificates to authenticate the connection endpoints and encrypts communications over the socket. SSL allows both connection endpoints to be authenticated via the certificate, although most Web applications only authenticate the server to the client. TLS is the standardized successor to SSL; it also underlies mechanisms that allow an active plain-text connection to be upgraded to an encrypted connection.

Authentication in SSL is handled entirely by certificates. Each endpoint contains a list of certificate authorities (CAs) it trusts. Any certificate presented to a client is checked to see whether it's valid and has been signed by one of these authorities. CAs are most apparent to Web users when they see an error message displayed while attempting to connect to an SSL Web site. The site's certificate might be expired; the domain name might not match the certificate exactly (such as www.neohapsis.com versus neohapsis.com); or the signing CA might not be trusted by the client.

SSL is typically used when a server authenticates itself to a client by proving it corresponds to the domain name being requested. Additionally, registering a certificate with a trusted CA generates a paper trail and varying degrees of authentication, depending on the type of certificate. It's intended to make Web surfers feel reasonably assured that they're interacting with the correct Web site and their communications (such as personal or financial information) can't be intercepted by third parties.
A less typical application of SSL communication is to validate the client to the server. However, this use is growing more common in Web services, in which both the client and server are automated systems. Both ends of the connection validate each other in essentially the same manner described previously. This technique is also useful for validating user connections to extremely critical sites, as it reduces most of the noise from worms and automated probes. Keep the following points in mind when assessing SSL use in Web applications:
Phishing and Impersonation

Attackers tend to follow the path of least resistance. More technical attackers might focus on finding intricate vulnerabilities in a Web application through focused black box testing, but a newer class of Internet criminal has adopted a simpler approach: the phishing attack. For each Web site criminals would like to attack, they construct a fake Web site resembling their target. They then attempt to lure users to that Web site through official-looking e-mails sent to possible users. If users of the site click the link in the e-mail and end up at the faked Web site, they might have difficulty distinguishing it from the real site. Consequently, users can end up being tricked into surrendering credentials or important information that attackers can use at the real site for fraudulent purposes.

Phishing attacks can leverage any of a number of vulnerabilities. Cross-site scripting and cross-site tracing are often useful in these attacks, although there are more subtle, obscure ways of phishing. For example, in February 2005, Eric Johanson reported a vulnerability in Mozilla's International Domain Name (IDN) handling (archived at www.mozilla.org/security/announce/2005/mfsa2005-29.html). The core of the vulnerability is that attackers can register a domain name and obtain a trusted SSL certificate for a hostname that looks identical to a legitimate one but is actually composed of different characters. This is an example of the Unicode homographic attack described in Chapter 8.

The attack involved registering the domain name www.xn--pypal-4ve.com, which is rendered in an IDN-compliant browser as paypal.com. This method of encoding non-ASCII domain names is called punycode, and it's identified by any domain name component beginning with an "xn--" string. In this attack, the punycode representation inserts a Cyrillic character that's rendered as the first a in paypal.com. The "-4ve" portion of the name contains the encoded character insertion information.
This attack resulted in a domain name, an SSL certificate, and a Web site that was almost indistinguishable from the real PayPal site. In response, IDN-compliant browsers changed their handling of these names. They now inform users that the name is an IDN representation, and some browsers disable IDN by default. Of course, attackers still have numerous ways to trick users into falling for phishing attacks. As a reviewer, you need to be on the lookout for any application vulnerabilities that could simplify the phisher's job.
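When an application displays, stores, or redirects to user-supplied links, one small check a reviewer might look for is whether hostnames carrying punycode labels are flagged. The following is a minimal sketch: the function name is hypothetical, and the check only recognizes the "xn--" prefix that marks a punycode-encoded label. It doesn't decode the label or judge whether the rendered name is deceptive.

```php
<?php
// Returns true if any dot-separated label of $host begins with the
// punycode prefix "xn--", meaning an IDN-aware client may display the
// name differently from its ASCII form.
function has_punycode_label(string $host): bool
{
    foreach (explode('.', strtolower($host)) as $label) {
        if (strncmp($label, 'xn--', 4) === 0) {
            return true;
        }
    }
    return false;
}

// Example use: flag links whose hostnames deserve a closer look.
$suspicious = has_punycode_label('www.xn--pypal-4ve.com');
```

A flagged hostname isn't necessarily malicious, because legitimate internationalized domains use the same encoding, so a check like this is best treated as a prompt for closer inspection or a user warning rather than a block.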