Technology: Web App Security Scanners

If you're an IT admin tasked with managing security for a medium-to-large enterprise full of web apps, we don't have to sell you on the tremendous benefits of automation. We'll just cut right to the point and attempt to answer the $64,000 (at least) question, "Which web application security scanner is the best?"

After evaluating dozens of tools on the market, we settled on a sampling that we believe represents the best-of-breed automated web application security scanners. Table 13-1 lists the contestants that made the cut, along with their respective pricing as of March 2006.

Table 13-1: Web Application Security Scanners We Tested (please contact vendor for custom/volume pricing)

Tool                                                 Pricing
Acunetix Enterprise Web Vulnerability Scanner 3.0    $4,995 (unlimited) + $999 maintenance agreement
Cenzic Hailstorm 3.0                                 $15,000 per year per application
Ecyware GreenBlue Inspector 1.5                      $499
Syhunt Sandcat Suite 1.6.2.1                         $1,899 for "Consultant/Floating" license + 20% of license fee for annual maintenance
SPI Dynamics WebInspect 5.8                          $25,000 per user/entire network, $5,000 annual maintenance, $2,495 for Toolkit
Watchfire AppScan 6                                  $15,000 per year

We also performed some limited testing with a few popular (and free) "security consultant toolbox" programs more suited to manual penetration testing, in order to provide a reference comparison:

  • N-Stalker N-Stealth Free Edition

  • Burp Suite 1.01

  • Paros Proxy 3.2.9

  • OWASP WebScarab v20052127

  • Nikto

Finally, we ran a source-code-analysis/fault-injection/web-scanning suite in parallel to confirm (or deny) some of the findings reported by the scanners, and to get an idea of how multifunction suites compared to purebred scanners. The tool we selected was Compuware DevPartner SecurityChecker 2.0.

The list above is not quite a comprehensive roundup of all web application security scanners. Unfortunately we were not able to review NTObjectives' NTOSpider due to technical issues during preliminary testing that could not be resolved in time for publication.

Another perceived omission might include generic network/host vulnerability scanning products like ISS, Foundstone, eEye Retina, Nessus (with web plug-ins), NGS Typhon III, and Qualys. Although many have added basic fault injection-style tests for web applications, our preliminary testing indicated that the current state of web functionality offered by these products was not comparable to the dedicated web application scanning tools we tested here.

Finally, we did not review web application security scanning services like those offered by WhiteHat Security (see "References and Further Reading" for a link). We kept our scope limited to an "apples-to-apples" comparison of off-the-shelf software this time around.

The Testbed

To create our testbed, we selected six off-the-shelf sample applications representing a broad range of application functionality, to benchmark dynamic scanners, ranging from traditional network vulnerability scanners to web application-specific fault-injection tools, as well as automated source code scanners. The test applications we initially selected were

  • OWASP/Foundstone SiteGenerator Beta 2

  • OWASP WebGoat

  • Foundstone Hacme Bank 2.15

  • Foundstone Hacme Bank Web Services

  • Foundstone Hacme Books 2.00

The time and difficulty of benchmarking all selected tools against these six applications quickly became problematic. A significant number of errors, performance issues, false positives, and false negatives led us to create two custom test applications, configured to represent common features of modern web applications, including real-world security weaknesses that we encounter frequently in our consulting work.

The application we called "FlashNavXSSGen" is a very simple application that represents the most rudimentary of Flash navigation menus, where links are passed into the SWF object as string variables stored as text in the web page. The menus lead to both static HTML pages, for purposes of testing authorization checks, and to dynamic ASP.NET pages coded to represent a common pattern of weakness that exists today in the wild in several commercial off-the-shelf (COTS) software packages.

Our second test application was a PostNuke 0.7.5-based content-management and portal system representing the "cutting edge" of PHP security, including "anti-hacker," "safe-HTML," and "IDS" features. We deployed PostNuke with all security features turned on and default weaknesses included. We further tuned decoding and validation of several input parameters in select locations to ensure that multiple XSS attacks of specific tag types and double-encoding types could succeed.

One of the core weaknesses in PostNuke (and most PHP portals) is the lack of standardized output encoding that is safe for the browser user agent. PHP portals also tend to have a significantly above-average attack surface for SQL injection, because PHP code, as typically written, lacks a clear way to separate data from query structure for things like SQL queries, so most defense relies upon escaping input.

Finally, our testing network was 100 Mbps switched, had no network bandwidth load, and the test machines consisted of new dual-core processor systems to control for any performance issues.

The Tests

The main focus of our testing was to determine where automation can provide reliable or enhanced analysis and which areas still require human eyeballs. To this end, we cooked up the following battery of tests based on what a common IT administrator would expect from these tools:

  • State: Must be able to log into the application and maintain session state.

  • Custom Rules: Must be able to distinguish one user's private content from another's.

  • Authorization: Must be able to distinguish unauthorized from authorized access.

  • XSS: Test application for vulnerability to XSS attacks of varying complexity.

  • Flash: Can the scanner detect abuse-able Macromedia Flash File Format (SWF) content embedded in the test app?

  • SQLi: Deduce if SQL injection is possible.

  • Logs: Review logs to ensure attacks/abuses are properly logged.

  • Top 10: Verify that the "OWASP Top 10" issues have been tested for.

  • Reporting: Provide reporting capabilities with multiscan trend analysis.

We defined a simple pass/fail rating system based on these criteria. This testing was done with the perspective that most users of these tools use them as point-and-click scanners, a fact that we verified with multiple corporate users of the primary applications tested in this sampling.

To better understand the technical criteria by which we tested the automated analysis tools, we'll examine each one in more detail next.

State

This tests whether the scanner is able to log into the application via form-based authentication and maintain state throughout a session.

Custom Rules

This is one of the features that differentiates web application scanners from their not-so-distant cousins, network vulnerability scanners. Vulnerabilities discovered by a network scanner typically relate to a missing patch or misconfiguration, regardless of whether the host has access to sensitive information.

With application scanners, we also have to decide if Rob is allowed to see Sally's private content (in our testing, we referred to user content as "reports"). A scanner may operate impersonating Rob, and gain access to Sally's reports, but unless the reports have unique content that the scanner has a signature for, it's very difficult for a scanner to flag this as even a potential issue.

This category of tests was designed to see if the scanner is customizable enough to support this scenario. More specifically, we wanted to know if the scanners would permit creation of custom checks that could distinguish between Rob's and Sally's reports, as in the sketch below.
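To make this concrete, here is a minimal sketch (in Python, using hypothetical URLs, cookie names, and marker strings of our own) of the kind of custom check we wanted the scanners to support: authenticate as Rob, request one of Sally's reports, and raise a finding only if content unique to Sally's report comes back.

import requests  # assumes the 'requests' library is available

# Hypothetical values for illustration only; a real custom check would be
# configured with the scanner's own session handling and report URLs.
BASE_URL = "http://testapp.example.com"
SALLY_REPORT_URL = BASE_URL + "/reports/view?id=1042"    # a report owned by Sally
SALLY_MARKER = "Sally Jones - Quarterly Report"          # content only Sally's report contains

def check_horizontal_authorization(rob_session_cookie: str) -> bool:
    """Return True if Rob's session can read Sally's report (a finding)."""
    resp = requests.get(
        SALLY_REPORT_URL,
        cookies={"sessionID": rob_session_cookie},
        allow_redirects=False,
    )
    # Only flag a finding if the page actually contains Sally-specific content;
    # a 200 status alone could just be a friendly "access denied" page.
    return resp.status_code == 200 and SALLY_MARKER in resp.text

if __name__ == "__main__":
    if check_horizontal_authorization(rob_session_cookie="0123456789ABCDEF"):
        print("FINDING: Rob can view Sally's private report")
    else:
        print("PASS: Sally's report not readable from Rob's session")

The key point is the content signature; without it, the check proves nothing, because many applications return an error page with a 200 status.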

Authorization

Can the scanner distinguish between checks performed as an authenticated entity and those performed as an unauthenticated one? Does it identify this automatically by default, or does it require custom configuration?

Cross-site Scripting (XSS)

We wanted to evaluate scanner detection capabilities for the majority of known XSS attack types, from the obvious, to the subtle (bypassing weak input validation), to the complex (combination double-encoding attacks). Here are examples of these three types of XSS attack problems that we solved through manual analysis, and would like to solve through automation:

Obvious XSS Tests   These tests were designed to find the simplest type of XSS attacks, where no input validation is performed at all, and any of the common XSS metacharacters can be injected directly into the application. We expected the scanners to perform basic XSS testing like the following:

 <script>alert('somethingclever');</script>
 <script>alert('somethingclever');</script>user@domain.site
 user@domain.site<script>alert('somethingclever');</script>

The more advanced scanners attempted tactics like replacing "user" and "site" with:

 <script>alert();</script>

Some tried further levels of escaping like '>, '>>, '), </textarea>, </xml>, and so on.

Subtle XSS Tests   These tests were designed to find more subtle XSS variants, where weak or partial input validation is attempted. Our test for this consisted of a new-user sign-up form input designed to take an e-mail address in the form user@domain.com. The form value is validated server-side by a sloppy regular expression (regex); if a legitimate e-mail address is not provided, the form error returns no data. The validation routine only checks that the characters before the '@' symbol are alphanumeric, and that the string ends with a valid top-level domain suffix (e.g., .com, .net, .ie, etc.). The XSS attack string that works is

 user@')">><script>alert()</script><".com
Note 

Previewing our results, none of the scanners detected the presence of XSS with this test.
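To see why that string slips through, consider the following Python sketch. The regex here is our own illustrative approximation of the sloppy server-side validation described above, not the application's actual code; both the legitimate address and the attack string pass it.

import re

# A hypothetical "sloppy" e-mail validation of the kind described above: it only
# checks that the characters before '@' are alphanumeric and that the string
# ends in a known top-level domain. Everything in between goes unchecked.
SLOPPY_EMAIL_RE = re.compile(r"^[A-Za-z0-9]+@.*\.(com|net|org|ie)$")

def sloppy_validate(value: str) -> bool:
    return SLOPPY_EMAIL_RE.match(value) is not None

legit = "user@domain.com"
attack = "user@')\">><script>alert()</script><\".com"

print(sloppy_validate(legit))   # True - accepted, as intended
print(sloppy_validate(attack))  # True - accepted, XSS payload slips through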

Complex XSS Tests   These tests were designed to find complex XSS variants, where canonicalization and decoding weaknesses must be exploited to successfully identify the XSS vulnerability. Our test consisted of a vulnerable parameter in our sample PHP PostNuke "secure" portal with AntiHacker enabled (we used PostNuke version 0.7.5, with the blocks module containing the vulnerable parameter). By manually double-encoding our payload (first hex, then URL), we could successfully exploit XSS in this parameter. Our manual attack worked reliably, and could be passed entirely in the URI (even formatted by a browser), embedded in an HTML forum post, or sent by malicious phishers in a pretty HTML e-mail. We thought this would provide a challenging test for the scanners, but also be realistic, since it exists in off-the-shelf software like PostNuke.

Note 

Most scanners consistently failed to detect this XSS type, as well as variants based on partial encoding of attack string elements, and the use of specific HTML tags like body elements and background.
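As a rough illustration of the double-encoding idea (the payload and encoding choices here are ours, not the exact PostNuke case), the following Python sketch builds a doubly URL-encoded payload that looks harmless on the wire but becomes an ordinary script injection after two rounds of decoding:

from urllib.parse import quote

payload = "<script>alert('xss')</script>"

# First pass: hex/URL-encode every character of the payload.
hex_encoded = quote(payload, safe="")         # %3Cscript%3Ealert%28%27xss%27%29%3C%2Fscript%3E

# Second pass: URL-encode the result again, so each '%' becomes '%25'.
double_encoded = quote(hex_encoded, safe="")  # %253Cscript%253Ealert%2528%2527xss...

print(double_encoded)

# An application that URL-decodes twice (once in the web server/framework and
# once in its own input handling) before validating or echoing the value ends
# up with the original <script> payload, even though the request that crossed
# the wire contained no obvious XSS metacharacters.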

Flash

These tests attempted to determine whether the scanners could detect abusable Macromedia Flash File Format (SWF) content embedded in the test app. We created multiple types of SWF files to represent menus using Flash-based navigation: a basic flat SWF and a multitree expanding menu SWF. Both SWF files in our test applications receive their links as relative paths, passed as initialization variables embedded in the web page to the SWF navigation button action. This was expected to be the easiest SWF case to test, as the paths are stored as text in the body of the web page and are very easy to identify in the page source, by variable and link.
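As a rough illustration of that pattern (the markup, paths, and variable names below are hypothetical), a scanner could pull the SWF references and their link variables straight out of the page source:

import re

# Hypothetical page source of the kind our FlashNavXSSGen test app emits: the
# SWF navigation menu receives its link targets as variables embedded in the page.
page_source = """
<object type="application/x-shockwave-flash" data="/nav/menu.swf" width="200" height="400">
  <param name="movie" value="/nav/menu.swf" />
  <param name="FlashVars" value="link1=/static/about.html&link2=/reports/view.aspx?id=7" />
</object>
"""

# A scanner (or a human) can enumerate the SWF and its link variables straight
# from the HTML source; no SWF parsing is required for this simplest case.
swf_files = set(re.findall(r"(/\S*\.swf)", page_source))
flashvars = re.findall(r'name="FlashVars"\s+value="([^"]*)"', page_source)

print("SWF files referenced:", swf_files)
for value in flashvars:
    for pair in value.split("&"):
        name, _, target = pair.partition("=")
        print(f"  variable {name!r} -> link {target!r}")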

There are other ways SWF files can receive input that we did not test, including hard-coding links inside the SWF file itself, and retrieving them from another SWF or from server-side code. This last case is the most difficult to test because the SWF must be sandboxed and the tester must listen for the connections the SWF makes to see where it retrieves its data from.

Note 

Only one scanner (Acunetix) actually found our test SWFs but could not effectively parse them for input vulnerabilities.

SQL Injection (SQLi)

For these tests, we used so-called "blind" SQL injection, where the attacker is denied the benefit of detailed SQL error messages, like Microsoft's classic OLEDB errors, which point out detailed syntax issues (we like to call these "hacker debuggers") and are commonly used by intruders to craft further attacks. We created two test scenarios to analyze the automated scanners' ability to identify blind SQL injection in our web applications, both based on Microsoft SQL Server 2000.

Note 

Not all "blind" SQL injection is equal; some types can be detected by the automated scanners.
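For example, the boolean-based variety can be found with a simple differential probe, which is roughly what a scanner's "blind" check does under the hood. The endpoint and parameter names in this Python sketch are hypothetical:

import requests  # assumes the 'requests' library is available

# Hypothetical login endpoint and parameter names, for illustration only.
LOGIN_URL = "http://testapp.example.com/login.aspx"

def login_response(user_id: str) -> str:
    resp = requests.post(LOGIN_URL, data={"userID": user_id, "password": "x"})
    return resp.text

def looks_blind_injectable(user_id: str = "rob") -> bool:
    """Boolean-based differential probe: no error messages required."""
    baseline   = login_response(user_id)
    true_case  = login_response(user_id + "' AND '1'='1")
    false_case = login_response(user_id + "' AND '1'='2")
    # If the always-true condition behaves like the baseline while the
    # always-false condition behaves differently, the parameter is likely
    # being concatenated into a SQL statement.
    return true_case == baseline and false_case != baseline

if __name__ == "__main__":
    print("Possible blind SQL injection:", looks_blind_injectable())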

SQL Injection Using a Stored Procedure   For our first SQL injection test scenario, we used a stored procedure (sproc) to perform a login function for a web application. The web application login form takes the userID and password values and passes them to the stored procedure, which then compares them against the values in the database to decide whether or not the userID and password are legitimate.

The stored procedure was purposely built with a dynamic SQL query that takes the explicit, unfiltered, user-supplied data for userID and password, concatenates it into a string, and executes that string as a query. Exploitation of such concatenated variables is rather straightforward, and injecting the usual suspects (', --, and so on) does the trick, as we illustrated in Chapter 7.
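The flaw boils down to string concatenation. The following Python sketch stands in for the sproc's dynamic SQL (the real test sproc built an equivalent string in T-SQL and executed it; the table and column names here are illustrative):

# Python stand-in for the stored procedure's dynamic SQL; names are ours.
def build_login_query(user_id: str, password: str) -> str:
    return ("SELECT COUNT(*) FROM users "
            "WHERE userID = '" + user_id + "' AND password = '" + password + "'")

# Normal use:
print(build_login_query("rob", "s3cret"))
# SELECT COUNT(*) FROM users WHERE userID = 'rob' AND password = 's3cret'

# Injection of "the usual suspects":
print(build_login_query("rob' --", "anything"))
# SELECT COUNT(*) FROM users WHERE userID = 'rob' --' AND password = 'anything'
# Everything after the comment marker is ignored, so the password check vanishes.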

We selected this vulnerability because it is quite common in real-world applications, particularly where developers assume stored procedures are "more secure" by virtue of security through obscurity. Dynamic queries executed in server-side code have the same problem, and developers often assume disabling error messages is enough to deter attack. These deterrents are futile, particularly when the attacker is, say, a recently laid-off developer who wrote the query in the first place and is perfectly capable of attacking it.

SQL Injection Using a Trigger   Our second SQL injection test scenario was a bit more complex. We created a SQL trigger and placed it on a table called "IPOMagic_users" that contained sensitive user data (for example, credit card numbers and Social Security Numbers, or SSNs). The purpose of the trigger is to restrict access to the credit card and SSN fields. Whenever a process attempts a create, read, update, or delete (CRUD) query against the IPOMagic_users table, the trigger executes a query that requests the user session object (a session cookie in the case of our test app), and then executes a dynamic query against the session database to verify that the cookie exists before allowing that process to act on the table on behalf of the user. The assumption here is that a request without an associated valid session cookie may be a malicious hacker attempting to abuse the system.

To understand the danger in the practice, consider the hypothetical malicious hacker, t0rn@d0, who is not a valid user of the system, and as such lacks a legitimate session cookie to access this table. However, t0rn@d0 is not concerned with providing the application with a valid session cookie; he simply creates a new one for himself by injecting it into the table using something like the following syntax:

 Cookie=sessionID=13AEDF') OR ('1'='1 

(Note that this assumes the application will accept an arbitrary session cookie value from the user for this query function.) Now, when the IPOMagic_users trigger fires to evaluate whether t0rn@d0 should have legitimate access to the sensitive data, his injected cookie syntax is parsed, the condition evaluates to true, and the security trigger fetches the first cookie it finds in the session database and informs the application that our hacker is good to go. He may CRUD away; IPOMagic's sensitive data is now officially 0wn3d by t0rn@d0.
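To see why the injected cookie works, consider the query the trigger ends up executing. The following Python sketch stands in for the trigger's dynamic session lookup (the real trigger built the equivalent string in T-SQL; the table and column names are illustrative):

# Python stand-in for the trigger's session lookup; names are ours.
def build_session_check(cookie_value: str) -> str:
    return ("SELECT sessionID FROM sessions "
            "WHERE sessionID = ('" + cookie_value + "')")

# Legitimate cookie:
print(build_session_check("13AEDF"))
# SELECT sessionID FROM sessions WHERE sessionID = ('13AEDF')

# t0rn@d0's injected cookie from the example above:
print(build_session_check("13AEDF') OR ('1'='1"))
# SELECT sessionID FROM sessions WHERE sessionID = ('13AEDF') OR ('1'='1')
# The OR clause is always true, so the query returns the first session row it
# finds and the trigger concludes the request is authorized.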

Note 

In both of the above cases, we could have executed a simple attack with the goal of performing a system-wide denial of service on the sample application:
' DROP TABLE IPOMagic_Users;--

Our SQL injection trigger scenario is somewhat contrived, as triggers rarely rely on user-supplied data, but this area of SQL security has been mostly ignored. As late as 2003 there were database "security" products on the market that relied entirely upon triggers running at an excessive privilege level, and that in certain cases consumed data (like cookies/session tokens) a malicious attacker could easily have replaced with SQL syntax.

Note 

Thanks to David Litchfield of NGS Software for his help evaluating several of the implications of SQL injection in the triggers we used in this testing.

Log Analysis

Does the scanner have the ability to analyze logs for attacks or other errant behaviors relevant to web application security testing?

Top-10

Does the scanner have a range of tests that cover at least the basic defaults described in the Open Web Application Security Project (OWASP) Top-10? (See "References and Further Reading" for a link to the OWASP site).

Reporting

Does the scanner provide more than one-time reports, that is, reports that can be compared and trended over time? Do the results contain information useful not only to security testers but also mitigation advice relevant to developers, and are findings described using commonly accepted criteria like the OWASP Top-10, the Web Application Security Consortium (WASC) attack taxonomy (see "References and Further Reading"), or both?

Reviews of Individual Scanners

Now let's look at the specific results the tools in this assessment produced. Each of the tools analyzed had specific strengths and weaknesses, and we will focus here on demonstrating some of the more interesting results from each tool. (Many tools produced similar if not identical results; where that happened, we selected the sample shown based on our preference for, or the uniqueness of, a particular GUI.)

Acunetix Enterprise Web Vulnerability Scanner (WVS) 3.0

There were many features we found highly appealing with WVS, like the ability to view and edit (customize) all the checks performed, and the inclusion of a fuzzer to attempt brute-forcing parameter values, a task that by definition requires automation.

Acunetix WVS was also the only scanner of the bunch that was able to enumerate all the Macromedia SWFs we pointed it at (although it required some manual intervention to accomplish this). It was, however, unable to detect either the pattern of commonly named pages or their susceptibility to XSS attack.

Figure 13-1 shows a list of pages that WVS ran tests on. Note that it skips from page 2 to page 4 of the application, failing to detect any of our implanted test XSS vulnerabilities. As Figure 13-1 also shows, we were able to manually exploit the XSS through a browser using WVS as a proxy.


Figure 13-1: Acunetix Web Vulnerability Scanner looking for XSS

Cenzic Hailstorm 3.0

We have been working with Cenzic since the turn of the century (whew, that makes us seem old!), when they released their first-generation protocol-fuzzing tool. We were thus naturally quite excited to get our hands on the third-generation Hailstorm 3.0, and in many ways the tool lived up to our hopes. Performance issues and deficiencies in the default checks that plagued previous versions have been largely addressed. Hailstorm provides a logical segregation between crawling a web application, which it calls "traversals," and security testing the application, which it calls "SmartAttacks." There are multiple types of traversals, logically organized in a fashion superior to any other tool we analyzed.

The single most important feature we like about Hailstorm is the ability to get under the hood and tweak and tune the vast array of tests provided. Hailstorm's graphical user interface (GUI) provided us with an intuitive way to identify parameters enumerated during traversals, and tamper with them to suit our XSS attack needs. Figure 13-2 illustrates this powerful feature.


Figure 13-2: Cenzic Hailstorm permits tampering with identified query string parameters.

On the downside, we found Hailstorm's default XSS checks to be less extensive than we had hoped (of course, given how easily the default checks can be extended, we overcame this with some manual effort). We also found some GUI behaviors that were not immediately intuitive to us, but these were minor compared to the overall functionality that Hailstorm offered.

Ecyware GreenBlue Inspector 1.5

Ecyware GreenBlue Inspector's default configuration provides limited automation functionality relative to the other scanners in this roundup. While it is possible to define unit tests and build automated checks for a specific application, it lacks the ease of use and functionality of the other tools when it comes to its overall customization feature set.

GreenBlue Inspector did stand out during manual testing. The tool impressed us with its aesthetically pleasing, easy-to-use, and highly functional user interface. We were able to perform tasks with a swipe of the mouse that in many other scanners took multiple mouse clicks, launching a secondary tool, typing in attack code, and then squinting at the results in a poorly formatted window. We thus highly recommend GreenBlue Inspector for web app security pen testers who perform substantial manual work. Figure 13-3 shows GreenBlue Inspector launching an XSS attack to verify that the developers are not enforcing POST submission on their forms, allowing us to turn this XSS into a CSRF/Session-Riding-ready attack deliverable as an e-mail hyperlink.


Figure 13-3: Ecyware GreenBlue Inspector easily permits manual tampering with form input fields.

Syhunt Sandcat Suite 1.6.2.1

Syhunt's Sandcat Suite is a relative newcomer to the web application security scanning market. It takes the classic "brute-force" approach of security scanners, providing a large database of "known-file" and "known-vulnerable-web-app" signature checks. It also features the ability to perform custom fault-injection tests, although the bulk of these appear limited to URI-parameter manipulations.

We liked the GUI and the simplicity of Sandcat's user model, but during testing we found the tool to be one of the slowest we tested. It also failed to discover most of the vulnerabilities found by the other tools. Although we had a very positive experience working with the product's development team, Sandcat Suite is a true 1.x release, and at this point in time we can only recommend it for the most basic due-diligence checking on applications that do not require stateful authentication or advanced testing.

We did find a couple of Sandcat Suite features shared with only one other product in our review (N-Stalker): web server log analysis and web server configuration hardening. While the benefit of being able to securely configure a web server through your web application security assessment tool is obvious, we were unsure about the log analysis feature until we tried it on one of the authors' personal web servers hosting several applications live on the Internet, as shown in Figure 13-4.


Figure 13-4: Syhunt Sandcat's web log analysis tool was unique among the commercial tools we tested.

Thanks to Sandcat's database of testing attacks, it could quickly detect similar attack patterns in our web server logs. In fact, it immediately revealed the following about our own web logs:

  • We could see what we were logging, and what we were failing to log from a security standpoint.

  • We could see how lots of other folks around the world were "testing" us.

  • We could quickly identify which of the authors' friends had been "validating" the security mechanisms of our test applications (nice try, guys).

This, as you can see, is quite useful information to integrate into an automated web application assessment tool.

SPI Dynamics WebInspect 5.8

SPI Dynamics was one of the first vendors to create an automated web application assessment tool and has arguably one of the most mature and useful tools in this space.

Note 

Obvious disclaimer: while an SPI Dynamics founder is a co-author of this book, he was not involved in the testing and analysis described in this chapter.

WebInspect has come a long way since its first release, and 5.8 brought us one of the most fully featured tools in our test lineup. The 5.8 release put WebInspect in the clear lead for most types of XSS testing that we performed, followed closely by Watchfire. The manual testing toolkit included with WebInspect is one of the best available; if we have a complaint, it is that the tools are not well integrated and, in certain cases, lacked the ability to import and export data from saved files. Figure 13-5 shows WebInspect's manual toolkit validating an XSS attack on our XSSGen application.


Figure 13-5: SPI Dynamics WebInspect toolkit manually validates an XSS vulnerability.

While WebInspect has significant strengths in automatic scanning, some of the best wizards for configuring custom checks, and possibly the most powerful framework for complex custom checks, we still discovered some minor limitations while testing.

For example, although WebInspect was great for generating new custom checks from scratch, it didn't let us "get under the hood" to tweak and tune existing checks. If you cannot view the presupplied tests, how do you know if you need to write a custom test? This lack of visibility was somewhat frustrating.

Another source of frustration was WebInspect's lack of flexible scheduling features. While tools like Hailstorm allow you to crawl an application in a variety of ways and schedule testing of that crawl for later (even specifying a "recrawl" to fetch fresh session tokens), WebInspect gives you an all-or-nothing option: you either schedule a full automated crawl and test, or you perform it manually. This is an unrealistic limitation for production web applications that can only be scanned during limited maintenance windows. Ideally, you would crawl the application during the day and build your tests to run once a month during the maintenance window, but with WebInspect, you'll be stocking up on your caffeinated beverage of choice and coming back to the office late at night.

Watchfire AppScan 6

Ahh, we remember fondly when tiny Perfecto Technologies produced one of the world's first web application security scanners back in the late 1990s. Even after a name change (to Sanctum in 2000) and an acquisition (by Watchfire in 2004), AppScan remains one of the leading web application security assessment tools on the market.

We could lavish many of the same superlatives on AppScan as we did on WebInspect. AppScan distinguishes itself for a few reasons. It is the only tool we tested that accurately identifies vulnerability to extended UTF-8-encoded XSS attacks. It also has some of the most advanced JavaScript parsing ability on the market (WebInspect is comparable). During our reporting/analytics testing, AppScan was one of the top performers.

Furthermore, although AppScan produced false positives like all of the tools in our comparison, it gave us far fewer false positives than most of the automated tools. AppScan was also one of the best at detecting XSS, being one of the only tools to correctly identify the vulnerable parameter in our "complex" PostNuke XSS test, as shown in Figure 13-6.


Figure 13-6: AppScan was one of the only scanners to pass the complex XSS test we designed.

We did have some complaints about AppScan. The default crawler configuration is sometimes too aggressive, going into seemingly endless loops crawling dynamic applications. Of course, this is a two-sided coin: there were many times during our testing that AppScan was the only tool to automatically find certain pages in our test applications, let alone perform testing on them.

We'll also single out AppScan (perhaps unfairly) to illustrate the security scanning industry's collective tendency toward over-zealous marketing. Watchfire, like many other vendors, sometimes gets the feature bullet point into the marketing literature before the feature, ahem, works. For example, AppScan claimed to be able to parse Macromedia Flash, yet for the life of us, we could not get AppScan to parse SWF files in any manner we tried: automatic, manual, or using it as a proxy, as shown in Figure 13-7.


Figure 13-7: AppScan overlooks some Flash files on our test app.

Comparison Tools

As noted earlier, we also performed some limited testing with a few popular (and free) "security consultant toolbox" programs more suited to manual penetration testing, in order to provide a reference comparison.

We also ran Compuware DevPartner SecurityChecker 2.0 through our test battery. SecurityChecker is a source-code-analysis/fault-injection/web-scanning suite that we were interested in comparing with the purebred scanners.

Here are a few thoughts about how some of these tools stack up.

N-Stalker N-Stealth 5.8 (free edition)   N-Stealth has been around longer than many of the commercial tools we analyzed and is well regarded by many groups that perform vulnerability analysis. The strength of N-Stealth is in rapidly finding known-vulnerable CGI scripts and files, as well as general web server configuration issues. Figure 13-8 shows N-Stealth's HTML reporting format.


Figure 13-8: N-Stealth's HTML reporting format

However, N-Stealth lacks the capability of actually injecting faults like XSS or SQL injection attacks into the application and analyzing the responses, so it is incapable of finding previously unknown vulnerabilities in new applications. Because of this "known-file" nature of the checks, we found it to produce many false positives and noncontextual results.

Burp Suite 1.01   Burp Suite is a lesser-known suite of tools that encompasses a spider, a proxy, and several manual testing tools. Burp lacks the depth of checks and automation to compete directly with the other tools in this section, but it is included here due to its exceptional value as a low-level manual penetration testing tool. Of all the tools we have evaluated, Burp is clearly designed by someone who fundamentally understands the nuances of testing complex web applications, and it presents its functionality in a manner most appealing to someone who needs to squeeze every drop of complex security vulnerability blood out of their web applications.

For example, Burp's fuzzer provides extensive payload configuration and delivery options. Figure 13-9 shows Burp Intruder testing a common web application with many parameters simultaneously.


Figure 13-9: Burp Intruder's parameter injection flexibility and granularity make it a powerful choice for pen-testers.

We hope the vendors of the commercial scanning tools will build this level of granularity into their parameter testing checks. We could give many examples of lucky stumbles onto just the right combination of values across three or more large parameters that gave us the keys to the kingdom. Burp Intruder saved us many hours by helping us efficiently discover input validation vulnerabilities in large applications.

Compuware DevPartner SecurityChecker 2.0   Increasingly, multifunction web security suites are popping up that combine black box remote scanning capabilities with development environment-integrated QA validation and source code analysis capabilities. In fact, SPI Dynamics' DevInspect offers similar functionality. Although it strays somewhat from our intended focus on IT operations in this chapter, we thought it would be interesting to run one such product through our test battery to see how it compared to the purebred scanners.

We decided to test a third product rather than duplicate the testing for other contestants who offered multifunction capabilities, like SPI Dynamics. Compuware is a well-known name in the software development world and offers a number of productivity enhancement tools for software developers. Recently, Compuware made its first foray into the software security space with DevPartner SecurityChecker.

SecurityChecker 2.0's development environment QA component is focused exclusively on .NET applications and runs as a plug-in to Visual Studio 2003 and 2005. When you want to analyze an application for defects with security implications, you first open a project in Visual Studio and then launch SecurityChecker from within Visual Studio.

After providing some minor configuration (such as the path to where the published web application will run), SecurityChecker takes over and provides completely automated analysis. The results we got from SecurityChecker were definitely interesting, and in some cases provided insight no other tool in the lineup provided.

One unique insight SecurityChecker provides is into the privilege level of the application; during conversion of one of our test applications to .NET 2.0, we ran into several privilege-related errors. Facing the usual publishing deadlines, we granted excessive privileges to the accounts the application was running under, in classic "get it out the door" software development style. SecurityChecker caught this and provided a nice, detailed analysis of the excessive privileges and, in some cases, of what an attacker who compromised the application could do with that privilege level. This output is shown in Figure 13-10.


Figure 13-10: Compuware DevPartner SecurityChecker reveals poor authorization design in one of our test apps.

We found that DevPartner SecurityChecker's source code analysis functionality was limited. While some of the more advanced commercial source-code analyzers attempt to walk the code path, and provide insight into iterative functions, SecurityChecker appeared to provide information more along the lines of static signature matching and "dangerous method" flagging. When analyzing the source code to our Flash-based test application, the only .NET application in our testing, the only finding it came up with was to identify that the PageValidators were disabled. It also failed to identify any of our canned XSS exploits embedded in our test apps.

Now for the meat and potatoes: automated scanning. Refreshingly, DevPartner SecurityChecker performed much like the other scanners in our shootout, with the expected lack of maturity in some areas compared to the purebreds. In fact, SecurityChecker was the only tool to attempt injection of arbitrary parameters and flag issues when those same parameters were reflected in the URL string. If only it had gone one step further and tried injecting XSS attacks into those parameters, it would have taken the gold in this particular test.

We were disappointed that SecurityChecker didn't provide any way to "get under the hood": we could find no way to add our own custom checks to either the source code scanner or the web application scanner. As we hope to have demonstrated by now, the inability to customize checks or facilitate manual analysis significantly limits the usefulness of any automated tool when facing applications of even moderate size and complexity.

Nevertheless, we believe this product has a lot of potential if Compuware remains committed to this space.

Overall Test Results

Now, the answer you've all been waiting for: who's the best?

Although we performed extensive testing and spent many hours in the lab with each of these tools, plus many late-night phone calls with their product development teams, we're wary of stack-ranking the winners in this overall solid bunch of web application security scanners. Obviously, the decision to purchase any of these tools for deployment in a complex medium-to-large enterprise environment will be based on many factors beyond the handful we used in our testing. Nevertheless, we think we can make some recommendations based on our experiences. The most mature tools in the bunch are Watchfire, SPI, and Cenzic. We'd be hard-pressed to pick between these three based on the quality of checks, customizability, and usability. The next tier includes Syhunt and Acunetix, which excelled at certain tasks and presented some innovative features but just didn't quite rise to the overall polish of the first three. Finally, Ecyware, while a standout in manual testing, wasn't able to match the automation capabilities of the overall field. We hope this gives you a head start on your scanner procurement process.

Perhaps more interesting than our admittedly subjective ranking of the contestants, some of the themes we observed during testing are listed here:

  • Most disappointingly, no scanner could reliably detect the blind SQL injection "Easter eggs" in our test apps. Worse, the scanners' marketing literature claimed to be able to detect these types of issues, giving us a false sense of confidence that our apps were free of such vulnerabilities.

  • Customization features were rudimentary for most tools, preventing us from "getting under the hood" to design our own custom checks, a real necessity for scanning web applications that typically deflect "template-ized" generic tests. Cenzic Hailstorm was a notable exception here.

  • Hailstorm was the only tool able to perform both authorized and unauthorized testing as one single scheduled job.

  • Differential analysis capabilities were weak; we had to implement custom checks to achieve our goal of verifying that Rob can't access Sally's reports.

  • Only one scanner could find our "complex" XSS vulnerability in an off-the-shelf web application software package.

  • Only two scanners could reliably detect XSS vulnerabilities using alternate tags from RSnake's web site (SPI and Watchfire).

  • No scanner could detect XSS vulnerabilities using double-encoded payloads.

  • Despite multiple vendors listing Flash auditing capabilities in their marketing literature, only one could actually find our sample SWF files.

  • Although most scanners could technically say they covered the OWASP Top 10 vulnerabilities, we found that the depth of checks in each OWASP category was quite uneven across the tools.

  • Only one tool performed security analysis of web application logs.

We expect that the scanner vendors will address many of these issues in upcoming releases. For example, after initial testing phases, we had discussions with multiple vendors about their XSS detection issues; subsequently, several vendors released updates to their product or assigned product development staff to actively work with us to address these issues.

Next, we'll discuss some of these and other themes we identified in more detail.

Manual Versus Automated Capabilities

We discovered more bugs in manual testing components of the commercial scanners than expected, even in default status code signatures, which supports our suspicion that few of the manual add-ons are actually used by owners of automated scanners. The few individuals who may use them likely have the skill to identify and change broken defaults, and lack the time to spend with vendor support to get them fixed.

Speed Versus Depth

We suspect that in-depth testing of attacks is not fully implemented in many of the tools due to the fact that there are still potential customers of web application vulnerability scanners who evaluate the scanners based upon "speed." This is largely an arbitrary criterion, since testing for more complex issues requires a scanner to make more requests to the application. The scanner with the most thorough XSS testing engine is simultaneously the most likely to lose in a speed-based bakeoff due to the significant number of tests required to properly identify vulnerability to XSS attacks. A vendor whose tool has limited or ineffective XSS checks has the better odds of being "faster" by virtue of performing less work. We hope the disappointing practice of evaluating scanners primarily on speed is balanced with a focus on quality of analysis.

False Positives

We've given rather short shrift to this topic, the bane of security vulnerability scanners. This is not to say that we didn't encounter our fair share during testing. For example, during our preliminary evaluations of possible contestants, we found a product with a bug in its handling of a specific HTTP status code that caused significant XSS false positives. This scanner parsed the body of every HTTP 302 redirect response and flagged any reflected data in the 302 as a valid exploit. Information leakage via HTTP 302 redirects is worth analyzing (and, notably, not one tool identified that potential for information leakage), but a web browser will not execute any code in the body of a 302 redirect. In fact, only the very first-generation web browsers, like lynx, will even display the body of a 302, and those browsers cannot execute body script.
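Avoiding this particular trap is not hard; a reflection check simply has to refuse to treat redirect bodies as exploitable. Here is a minimal Python sketch (the probe string is hypothetical, and a real check would also account for encoding and output context):

import requests  # assumes the 'requests' library is available

PROBE = "<script>alert(31337)</script>"

def reflected_xss_finding(url: str, param: str) -> bool:
    """Only count a reflection as XSS if it comes back in a renderable response."""
    resp = requests.get(url, params={param: PROBE}, allow_redirects=False)
    if 300 <= resp.status_code < 400:
        # Browsers follow the Location header and never render or execute the
        # body of a redirect, so a payload echoed here is at most information
        # leakage, not an exploitable XSS; don't flag it as one.
        return False
    return PROBE in resp.text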

Hopefully, this serves as a reminder that all of these tools require tuning, no matter what environment they are deployed into.

Reporting

Our main criteria here were that each tool provide basic analytics, including trending across scans, detailed technical information about each identified attack, and mitigation information for both IT admins and developers, and that findings be organized using commonly accepted terminology like the OWASP Top 10 or the WASC attack classifications.

The majority of tools we reviewed meet several, if not all, of these criteria. Many have a full-featured reporting database, capable of running trending reports across multiple tests, provide developer-specific information (although usually quite limited or language-specific), and several utilize a commonly accepted classification system.

Black-box Scanning Versus White-box Analysis

When is it more cost-effective to look for vulnerabilities using white-box methodologies like those discussed in Chapter 12? Our testing revealed that it's better to focus efforts on the development cycle to find and remediate some classes of vulnerabilities.

XSS   Looking at the example of our XSS testing, it is fairly clear what the application is doing from examining the source, and one could write a set of signatures to pull up every instance of data being written out to a page from a variable that is potentially user-supplied. However, this would generate a significant amount of nonsecurity noise to wade through, and would not reveal which pieces of data being written out to the page were already strongly validated as input.

Scanning source code to ensure that all output was properly encoded would, however, stop virtually all types of XSS attacks. Considering most applications have a finite number of places they write potentially user-supplied or untrusted data out to the page, we believe that the only thorough answer to meeting our aforementioned business goals is to combine manual penetration testing with source code analysis.
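The output-encoding discipline we have in mind is easy to express in code. Here is a minimal sketch, with Python's html module standing in for whatever encoding facility the application's actual implementation language provides:

import html

def render_profile_field(untrusted_value: str) -> str:
    """Encode at the point of output, regardless of what input validation did."""
    return "<td>" + html.escape(untrusted_value, quote=True) + "</td>"

print(render_profile_field("<script>alert('xss')</script>"))
# <td>&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;</td>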

SQL Injection   Since we utterly failed to detect blind SQL injection using automated web application scanning, but were able to identify potential for abuse immediately upon looking at the queries behind the scenes, it is clear that the most effective way to identify this is to examine the source code. Whether or not we can automate that analysis effectively remains to be seen. Several automated source code analyzers can identify a dynamic SQL query in a page, and even in a procedure, but we haven't seen one that would know to extract triggers from a table in a database and perform analysis on those.

IDS Overload

We thought we'd share the following amusing anecdote about our testing experience to end our testing showdown on a lighter note.

Deep into our test regime, we unwittingly implemented a denial-of-service attack against our own security analysis efforts. In order to log all attack patterns thrown by the scanners, we implemented a PHP module on one of our test apps that was designed to act like a rudimentary intrusion detection system (IDS): dump all system state and the contents of suspicious strings, and e-mail them to an account we set up on our mail server for monitoring.

Unfortunately, our e-mail client, Outlook 2003, did not perform well under the load of messages generated by our IDS module. In a single five-day series of testing that exceeded 1 gigabyte of HTTP requests per day, we generated over 2 gigabytes of total IDS e-mail alerts (in large part due to the logging detail in the IDS alerts).

This problem was exacerbated by the fact that several of the scanners had a tendency to go into endless loops when they encountered complex JavaScript, or subdirectories that responded with custom error pages (i.e., HTTP 200 OK). Other scanners were aggressive about blindly submitting extensive tests unrelated to the language or nature of the page to every available form and parameter that they could enumerate.

Worse, we could not download and delete the volume of mail with any of our readily available POP3 or IMAP Windows-based e-mail clients; even the server-side web-based mail client would no longer log us in. We finally had to log into a command shell and manually delete the mail spool files.

Memo to self (and anyone else who's listening): remember to tune your IDS before attacking yourself en masse, as real attacks could easily have slid under the radar amid the volume of noise generated by testing.


