Infrastructure Profiling

Web applications require substantial infrastructure to support them: web server hardware and software, DNS entries, networking equipment, load balancers, and so on. Thus, the first step in any good web security assessment methodology is identification and analysis of the low-level infrastructure upon which the application resides.

Footprinting and Scanning: Defining Scope

The original Hacking Exposed introduced the concept of footprinting, or using various Internet-based research methods to determine the scope of the target application or organization. There are numerous tools and techniques traditionally used to perform this task, including:

  • Internet registrar research

  • DNS interrogation

  • General organizational research

The original Hacking Exposed methodology also covered basic infrastructure reconnaissance techniques such as:

  • Server discovery (ping sweeps)

  • Network service identification (port scanning)

Since most World Wide Web-based applications operate on the canonical ports TCP 80 for HTTP and/or TCP 443 for HTTPS (SSL/TLS), these techniques are usually not called for once the basic target URL has been determined. A more diligent attacker might port scan the target IP ranges using a list of common web server ports to find web apps running on unusual ports.

Tip 

See Chapter 10 for discussion of common attacks and countermeasures against web-based administration ports.

Caution 

Don't overlook port scanning; many web applications are compromised via inappropriate services running on web servers or other servers adjacent to web application servers in the DMZ.

Rather than reiterating in detail these methodologies, which are only partially relevant to web application assessment, we recommend that readers interested in a more expansive discussion consult the other editions of the Hacking Exposed series (see the "References and Further Reading" section at the end of this chapter for more information). We'll move on to aspects of infrastructure profiling that are more directly relevant to web applications.

Basic Banner Grabbing

The next step in low-level infrastructure profiling is generically known as banner grabbing. Banner grabbing is critical to the web hacker, as it typically identifies the make and model (version) of the web server software in play. The HTTP 1.1 specification (RFC 2616) defines the Server response header field to communicate information about the server handling a request. Although the RFC encourages implementers to make this field a configurable option for security reasons, almost every current implementation populates it with real data by default (although we'll cover several exceptions to this rule momentarily).

Tip 

Banner grabbing can be performed in parallel with port scanning if the port scanner of choice supports it.

Here is an example of banner grabbing using the popular netcat utility:

D:\> nc -nvv 192.168.234.34 80
(UNKNOWN) [192.168.234.34] 80 (?) open
HEAD / HTTP/1.0
[Two carriage returns]

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Fri, 04 Jan 2002 23:55:58 GMT
[etc.]

Note the use of the HEAD method to retrieve the server banner. This is the most straightforward method for grabbing banners.

There are several easier-to-use tools for manipulating HTTP that we use more frequently, already enumerated in Chapter 1; we used netcat here to illustrate the raw input and output more clearly.
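
If you need to grab banners from many hosts, the same HEAD request is easy to script. The following is a minimal Python sketch of the idea (the host name in the example is a placeholder you would supply yourself); it sends a HEAD request over a raw socket and prints the Server header, much as netcat does above:

import socket

def grab_banner(host, port=80, timeout=5):
    """Send a bare HEAD request and return the Server header, if any."""
    request = "HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n" % host
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(request.encode("ascii"))
        response = s.recv(4096).decode("latin-1", "replace")
    for line in response.splitlines():
        if line.lower().startswith("server:"):
            return line.strip()
    return "(no Server header)"

# Example (hypothetical host):
# print(grab_banner("www.site.com"))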

Advanced HTTP Fingerprinting

In the past, knowing the make and model of the web server was usually sufficient to search Google or Bugtraq and identify any related exploits (we'll discuss this process in more depth in Chapter 3). As security awareness has increased, however, new products and techniques have surfaced that either block the server information from being displayed or report back false information to throw attackers off.

Alas, information security is a never-ending arms race, and more sophisticated banner grabbing techniques have emerged that can be used to determine what a web server is really running. We like to call the HTTP-specific version of banner grabbing fingerprinting the web server, since it no longer consists of simply looking at header values, but rather of observing the server's overall behavior and the ways individual responses differ among web server implementations. For instance, an IIS server will likely respond differently to an invalid HTTP request than an Apache web server will. Learning these subtle differences is an excellent way to determine what web server make and model is actually running. There are many ways to fingerprint web servers, so many in fact that fingerprinting is an art form in itself. We'll discuss a few basic fingerprinting techniques next.

Unexpected HTTP Methods

One of the most significant ways web servers differ is in how they respond to different types of HTTP requests, and the more unusual the request, the more likely the web server software is to differ in its response. In the following examples, we send a PUT request with no data instead of the typical GET or HEAD, again using netcat. Notice how, even though we send the same unusual request to each server, each one reacts differently. This allows us to accurately determine what the web server really is even if the administrator has changed the server banner. The key differences appear in the status lines of the examples shown here.

Sun ONE Web Server

$ nc sun.site.com 80
PUT / HTTP/1.0
Host: sun.site.com

HTTP/1.1 401 Unauthorized
Server: Sun-ONE-Web-Server/6.1

IIS 6.0

$ nc iis6.site.com 80
PUT / HTTP/1.0
Host: iis6.site.com

HTTP/1.1 411 Length Required
Server: Microsoft-IIS/6.0
Content-Type: text/html

IIS 5.x

$ nc iis5.site.com 80
PUT / HTTP/1.0
Host: iis5.site.com

HTTP/1.1 403 Forbidden
Server: Microsoft-IIS/5.1

Apache 2.0.x

$ nc apache.site.com 80
PUT / HTTP/1.0
Host: apache.site.com

HTTP/1.1 405 Method Not Allowed
Server: Apache/2.0.54
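
If you want to automate this comparison, a short script can send the same bodyless PUT to each target and record the status line it gets back; matching that against the behaviors shown above (401 for Sun ONE 6.x, 411 for IIS 6.0, 403 for IIS 5.x, 405 for Apache 2.0.x) gives a rough guess at the real server. This is only a sketch of the idea, not a complete signature database, and the example host is hypothetical:

import socket

# Very rough mapping of bodyless-PUT status codes to server families,
# based only on the sample responses shown above.
PUT_SIGNATURES = {
    "401": "Sun ONE 6.x (?)",
    "411": "IIS 6.0 (?)",
    "403": "IIS 5.x (?)",
    "405": "Apache 2.0.x (?)",
}

def fingerprint_by_put(host, port=80):
    """Send a PUT with no body and return the status line plus a guess."""
    req = "PUT / HTTP/1.0\r\nHost: %s\r\n\r\n" % host
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(req.encode("ascii"))
        status_line = s.recv(1024).decode("latin-1", "replace").splitlines()[0]
    parts = status_line.split()
    code = parts[1] if len(parts) > 1 else "?"
    return status_line, PUT_SIGNATURES.get(code, "unknown behavior")

# Example (hypothetical host):
# print(fingerprint_by_put("www.site.com"))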

Server Header Anomalies

By taking a close look at the HTTP headers in different servers' responses, you can spot subtle differences. For instance, sometimes the headers will be ordered differently, or one server will return additional headers that another does not. These differences can indicate the make and model of the web server.

For example, on Apache 2.x, the Date: header comes first, directly above the Server: header, as shown here:

HTTP/1.1 200 OK
Date: Mon, 22 Aug 2005 20:22:16 GMT
Server: Apache/2.0.54
Last-Modified: Wed, 10 Aug 2005 04:05:47 GMT
ETag: "20095-2de2-3fdf365353cc0"
Accept-Ranges: bytes
Content-Length: 11746
Cache-Control: max-age=86400
Expires: Tue, 23 Aug 2005 20:22:16 GMT
Connection: close
Content-Type: text/html; charset=ISO-8859-1

On IIS 5.1, the Server: header is on top and is right above the Date: header, the opposite of Apache 2.0:

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.1
Date: Mon, 22 Aug 2005 20:24:07 GMT
Connection: Keep-Alive
Content-Length: 6278
Content-Type: text/html
Cache-control: private

On Sun ONE, the Server: and Date: header ordering matches IIS 5.1, but notice that in the Content-length: header the word 'length' is not capitalized, and the same goes for Content-type:. On IIS 5.1, both words are capitalized:

HTTP/1.1 200 OK
Server: Sun-ONE-Web-Server/6.1
Date: Mon, 22 Aug 2005 20:23:36 GMT
Content-length: 2628
Content-type: text/html
Last-modified: Tue, 01 Apr 2003 20:47:57 GMT
Accept-ranges: bytes
Connection: close

On IIS 6.0, the Server: and Date: header ordering matches that of Apache 2.0, but there is a Connection: header above them:

HTTP/1.1 200 OK
Connection: close
Date: Mon, 22 Aug 2005 20:39:23 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 23756
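
Header ordering is also easy to capture programmatically. The sketch below (the target host is hypothetical) makes a plain GET over a raw socket, so the headers are seen in exactly the order the server sent them, and returns just the header names:

import socket

def header_order(host, port=80):
    """Return response header names in the exact order the server sent them."""
    req = "GET / HTTP/1.0\r\nHost: %s\r\n\r\n" % host
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(req.encode("ascii"))
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = s.recv(4096)
            if not chunk:
                break
            data += chunk
    head = data.split(b"\r\n\r\n")[0].decode("latin-1", "replace")
    # Skip the status line, keep only header names.
    return [line.split(":")[0] for line in head.splitlines()[1:] if ":" in line]

# Example (hypothetical host):
# print(header_order("www.site.com"))   # e.g. ['Date', 'Server', ...] suggests Apache 2.x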

The httprint Tool

We've covered a number of techniques for fingerprinting HTTP servers. Rather than performing these techniques manually, we recommend the httprint tool from Net-Square (see the "References and Further Reading" section at the end of this chapter for a link). Httprint performs most of these techniques (such as examining the HTTP header ordering) in order to see through most obfuscation techniques. It also comes with a customizable database of web server signatures. Httprint is shown fingerprinting some web servers in Figure 2-1.


Figure 2-1: Httprint tool and results

Infrastructure Intermediaries

One issue that can skew the outcome of profiling is the placement of intermediate infrastructure in front of the web application. This intermediate infrastructure can include load balancers, virtual server configurations, proxies, and web application firewalls. Next, we'll discuss how these interlopers can derail the basic fingerprinting techniques we just discussed and how they can be detected.

Virtual Servers

One other thing to consider is virtual servers. Some web hosting companies save on hardware costs by running multiple web servers on different virtual IP addresses on the same physical machine. Be aware that port scan results indicating a large population of live servers at different IP addresses may actually reflect a single machine with multiple virtual IP addresses.

Detecting Load Balancers

Since load balancers are usually "invisible," many attackers neglect to think about them when doing their assessments. But load balancers have the potential to drastically change the way you do your assessments. Load balancers are deployed to ensure that no single server is ever overloaded with requests; they do this by dividing web traffic among multiple servers. For instance, when you issue a request to a web site, the load balancer may direct your request to any one of, say, four servers. What this type of setup means to you is that while one attack may work on one server, it may not work the next time around if it's sent to a different server. This can cause much frustration and confusion. While in theory all of the target's servers should be replicated identically and no response from any server should differ from any other, this simply isn't the case in the real world. Even though the application may be identical on all servers, its folder structure (this is very common), patch levels, and configurations may differ on each server where it's deployed. For example, there may be a "test" folder left behind on one of the servers, but not on the others. This is why it's important not to undermine your assessments by neglecting to identify load balancers. Here's how to detect whether a load balancer is running at your target's site.

Port Scan Surrounding IP Ranges   One simple way to identify individual load-balanced servers is to first determine the IP address of the canonical server and then script requests to a range of IPs around it. We've seen this technique turn up several other nearly identical responses, probably all load-balanced, identical web servers. Infrequently, however, we encounter one or more servers in the farm that differ from the others, running an out-of-date software build or perhaps alternate services like SSH or FTP. It's usually a good bet that these rogues have security misconfigurations of one kind or another, and they can be attacked individually via their IP addresses.
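
Scripting the surrounding-range check is straightforward once you know one canonical IP. A sketch like the following (the CIDR range in the example is a placeholder) walks the neighboring addresses and records each Server banner so you can spot the odd one out:

import socket
import ipaddress

def sweep_banners(network, port=80):
    """Grab the Server header from every address in a CIDR block."""
    results = {}
    for ip in ipaddress.ip_network(network, strict=False).hosts():
        try:
            with socket.create_connection((str(ip), port), timeout=2) as s:
                s.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
                reply = s.recv(2048).decode("latin-1", "replace")
            server = next((l for l in reply.splitlines()
                           if l.lower().startswith("server:")), "(none)")
            results[str(ip)] = server
        except OSError:
            pass  # host down or port closed
    return results

# Example (hypothetical range around a known server at 192.168.234.34):
# for ip, banner in sweep_banners("192.168.234.32/28").items():
#     print(ip, banner)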

Timestamp Analysis   One method of detecting load balancers is to analyze the response timestamps. Because the servers in a farm often do not have their clocks synchronized, you can detect multiple servers by issuing many requests within one second and examining the Date: headers in the responses. If your requests are distributed to multiple servers, there will likely be variations in the times reported back to you. You will need to repeat this several times to reduce the chances of false positives and to see a true pattern emerge. If you're lucky, each of the servers will be off-sync and you'll be able to deduce how many servers are actually being balanced.
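
Here is a minimal sketch of that technique (the URL is hypothetical): it fires a burst of requests, collects the Date: headers, and counts the distinct clock offsets relative to your own machine. More than one recurring offset suggests more than one back-end server, assuming their clocks are not synchronized:

import email.utils
import time
import urllib.request
from collections import Counter

def date_header_offsets(url, samples=20):
    """Collect rounded clock offsets (server Date minus local time) over many requests."""
    offsets = Counter()
    for _ in range(samples):
        with urllib.request.urlopen(url, timeout=5) as resp:
            date_hdr = resp.headers.get("Date")
        if date_hdr:
            server_ts = email.utils.parsedate_to_datetime(date_hdr).timestamp()
            offsets[round(server_ts - time.time())] += 1
    return offsets

# Example (hypothetical URL); several distinct offsets hint at several servers:
# print(date_header_offsets("http://www.site.com/"))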

ETag and Last-Modified Differences   By comparing the ETag and Last-Modified values in the response headers for the same requested resource, you can determine whether you're getting different copies of the file from multiple servers. For example, here are the values returned for index.html when it is requested multiple times:

 ETag: "20095-2de2-3fdf365353cc0" ETag: "6ac117-2c5e-3eb9ddfaa3a40" Last-Modified: Sun, 19 Dec 2004 20:30:25 GMT Last-Modified: Sun, 19 Dec 2004 20:31:12 GMT 

The difference in Last-Modified timestamps between these responses indicates that the servers did not have immediate replication and that the requested resource was replicated to another server about a minute apart.
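
A quick way to run this check is to request the same resource repeatedly and collect the distinct ETag/Last-Modified pairs; more than one pair for the same URL is a strong hint that a farm is answering. A rough sketch (the URL is hypothetical):

import urllib.request

def collect_validators(url, samples=15):
    """Return the set of (ETag, Last-Modified) pairs seen for one resource."""
    seen = set()
    for _ in range(samples):
        req = urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
        with urllib.request.urlopen(req, timeout=5) as resp:
            seen.add((resp.headers.get("ETag"), resp.headers.get("Last-Modified")))
    return seen

# Example (hypothetical URL); more than one pair suggests multiple servers:
# for pair in collect_validators("http://www.site.com/index.html"):
#     print(pair)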

Load Balancer Cookies   Some proxy servers and load balancers add their own cookie to the HTTP session so that they can keep better state. These are fairly easy to find, so if you see an unusual cookie you'll want to conduct a Google search on it to determine its origin. For example, while browsing a web site we noticed this cookie being passed to the server:

AA002=1131030950-536877024/1132240551

Since the cookie does not give any obvious indication of what application it belongs to, we did a quick Google search for "AA002=" and turned up multiple sites that use this cookie. Further analysis revealed that it was a tracking cookie known as "Avenue A." As a general rule, if you don't know it, Google it!

Enumerating SSL Anomalies   This is a last-ditch effort when it comes to identifying proxies and load balancers. If you're sure that the application is in fact being load balanced but none of the methods listed above works, you might as well check whether the site's SSL certificates differ, or whether each server supports the same cipher strengths. For example, one of the servers may support only 128-bit encryption, just as it should. But suppose the site administrator forgot to apply that policy to the other servers, and they support all ciphers from 96-bit and up. A mismatch like this confirms that the web site is being load balanced.
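
One way to compare SSL behavior across suspected farm members is to connect to each back-end IP directly (assuming you can reach them individually, for example via the surrounding-range scan above) and record a certificate fingerprint plus the cipher the server negotiates; differences across IPs serving the same hostname point to inconsistently configured servers. A sketch under those assumptions, with hypothetical IPs and hostname:

import hashlib
import socket
import ssl

def tls_profile(ip, hostname, port=443):
    """Return (certificate SHA-1 fingerprint, negotiated cipher) for one back-end IP."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False        # connecting by IP on purpose
    ctx.verify_mode = ssl.CERT_NONE   # we only want to inspect, not validate
    with socket.create_connection((ip, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            der = tls.getpeercert(binary_form=True)
            return hashlib.sha1(der).hexdigest()[:16], tls.cipher()

# Example (hypothetical IPs behind www.site.com):
# for ip in ["192.168.234.34", "192.168.234.35"]:
#     print(ip, tls_profile(ip, "www.site.com"))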

Examining HTML Source Code   Although we'll talk about this in more depth in the "Application Profiling" section later in this chapter, it's important to note that HTML source code can also reveal load balancers. For example, multiple requests for the same page might return different comments in the HTML source, as shown next (HTML comments are delimited by <!-- and --> brackets):

<!-- ServerInfo: MPSPPIIS1 B093 2001.10.3.13.34.30 Live1 -->
<!-- Version: 2.1 Build 84 -->

<!-- ServerInfo: MPSPPIIS1 A096 2001.10.3.13.34.30 Live1 -->
<!-- Version: 2.1 Build 84 -->

One of the pages on the site revealed more cryptic HTML comments. After sampling the page five times, we compared the comments, as shown here:

<!-- whfhUAXNByd7ATE56+Fy6BE9I3B0GKXUuZuW -->
<!-- whfh6FHHX2v8MyhPvMcIjUKE69m6OQB2Ftaa -->
<!-- whfhKMcA7HcYHmkmhrUbxWNXLgGblfF3zFnl -->
<!-- whfhuJEVisaFEIHtcMPwEdn4kRiLz6/QHGqz -->
<!-- whfhzsBySWYIwg97KBeJyqEs+K3N8zIM96bE -->

It appears to be a hash, perhaps MD5, with a salt of "whfh" at the beginning, but we're not sure. We'll talk more about how to gather and identify HTML comments in the upcoming section on application profiling.
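
Collecting and diffing HTML comments across repeated requests is simple to script. The following sketch (the URL is hypothetical) pulls a page several times and prints every comment that does not appear in every sample, which is exactly the kind of per-server marker shown above:

import re
import urllib.request
from collections import Counter

COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)

def varying_comments(url, samples=5):
    """Return HTML comments that differ between samples of the same page."""
    counts = Counter()
    for _ in range(samples):
        with urllib.request.urlopen(url, timeout=5) as resp:
            body = resp.read().decode("latin-1", "replace")
        # Count each distinct comment at most once per sample.
        counts.update(set(c.strip() for c in COMMENT_RE.findall(body)))
    return [c for c, n in counts.items() if n < samples]

# Example (hypothetical URL):
# for comment in varying_comments("http://www.site.com/"):
#     print(comment)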

Detecting Proxies

Not surprisingly, you'll find that some of your most interesting targets are supposed to be invisible. Devices like proxies are meant to be transparent to end users, but they're great attack points if you can find them. Listed next are some methods you can use to determine whether your target site is running your requests through a proxy.

TRACE Request   A TRACE request tells the web server to echo back the contents of the request just as it received it; the method was added to HTTP 1.1 as a debugging tool. Fortunately for us, it also reveals whether our requests are traveling through proxy servers before reaching the web server. When we issue a TRACE request through a proxy, the proxy modifies the request and forwards it to the web server, which then echoes back exactly the request it received. By comparing that echo to the request we originally sent, we can identify what changes the proxy made.

Proxy servers will usually add certain headers, so look for headers like Via:, X-Forwarded-For:, and Proxy-Connection:. Here is a TRACE request and response in which a Via: header reveals a proxy in the path:

 "Via:","X-Forwarded-For:","Proxy-Connection:"                  TRACE / HTTP/1.1              Host: www.site.com                      HTTP/1.1 200 OK              Server: Microsoft-IIS/5.1              Date: Tue, 16 Aug 2005 14:27:44 GMT              Content-length: 49                      TRACE / HTTP/1.1              Host: www.site.com              Via: 1.1 192.168.1.5 

When your requests go through a reverse proxy server, you will get different results. A reverse proxy is a front-end proxy that routes incoming requests from the Internet to the back-end servers. Reverse proxies will usually modify the request in two ways. First, they remap the URL to point to the proper URL on the inside server; for example, "TRACE /folder1/index.aspx HTTP/1.1" might turn into "TRACE /site1/folder1/index.asp HTTP/1.1". Second, the Host: header is changed to point to the internal server the request should be forwarded to. Looking at the following example, you'll see that the Host: header was changed to "server1.site.com":

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.1
Content-length: 49

TRACE / HTTP/1.1
Host: server1.site.com
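
A short script can automate the TRACE check: send the request over a raw socket and look for the proxy-added headers (Via:, X-Forwarded-For:, Proxy-Connection:) in the echoed body. A sketch, assuming the target permits TRACE and using a hypothetical host name:

import socket

PROXY_HEADERS = ("via:", "x-forwarded-for:", "proxy-connection:")

def trace_check(host, port=80):
    """Send a TRACE and report any proxy-added headers echoed back in the body."""
    req = "TRACE / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n" % host
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(req.encode("ascii"))
        data = b""
        while True:
            chunk = s.recv(4096)
            if not chunk:
                break
            data += chunk
    body = data.decode("latin-1", "replace")
    return [line for line in body.splitlines()
            if line.lower().startswith(PROXY_HEADERS)]

# Example (hypothetical host); an empty list means no obvious proxy headers:
# print(trace_check("www.site.com"))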

Standard CONNECT Test   The CONNECT command is primarily used by proxy servers to proxy SSL connections; with this command, the proxy makes the SSL connection on behalf of the client. For instance, sending "CONNECT secure.site.com:443 HTTP/1.0" instructs the proxy server to open an SSL connection to secure.site.com on port 443. If the connection is successful, the proxy tunnels the user's connection and the secure connection together. However, this command can be abused when it is used to connect to servers inside the network.

A simple method to check if a proxy is present is to send a CONNECT to a known site like www.google.com and see if it complies.

Note 

Many times a firewall may well protect against this technique, so you might want to try to guess some internal IP addresses and use those as your test.

The following example shows how the CONNECT method can be used to connect to a remote web server.

*Request*
CONNECT remote-webserver:80 HTTP/1.0
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0)
Host: remote-webserver

*Successful Response*
HTTP/1.0 200 Connection established

Standard Proxy Request   Another method you can try is to request the full URL of a public web site and see if the proxy server returns the response from that web site. If it does, you can direct the server to any address of your choice, which would allow the proxy server to act as an open, anonymous proxy to the public or, worse, allow an attacker to access your internal network. This is demonstrated next. At this point, a good technique would be to attempt to identify the target's internal IP address range and then port scan that range.

Tip 

This same method can be successfully applied using the CONNECT command as well.

For example, a standard open proxy test using this mechanism would look something like the following:

 GET http://www.site.com/ HTTP/1.0 

You could also use this technique to scan a network for open web servers:

GET http://192.168.1.1:80/ HTTP/1.0
GET http://192.168.1.2:80/ HTTP/1.0

You can even conduct port scanning in this manner:

GET http://192.168.1.1:80/ HTTP/1.0
GET http://192.168.1.1:25/ HTTP/1.0
GET http://192.168.1.1:443/ HTTP/1.0
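
Using a suspected open proxy for this kind of scanning can also be scripted. The sketch below (the proxy address and target list are placeholders) sends full-URL GET requests through the proxy and reports the status line for each target, which doubles as a crude port scan:

import socket

def scan_through_proxy(proxy_host, proxy_port, targets):
    """Issue full-URL GETs through a suspected open proxy and collect status lines."""
    results = {}
    for host, port in targets:
        req = "GET http://%s:%d/ HTTP/1.0\r\n\r\n" % (host, port)
        try:
            with socket.create_connection((proxy_host, proxy_port), timeout=5) as s:
                s.sendall(req.encode("ascii"))
                first = s.recv(1024).decode("latin-1", "replace").splitlines()
                results[(host, port)] = first[0] if first else "(no response)"
        except OSError as exc:
            results[(host, port)] = "error: %s" % exc
    return results

# Example (hypothetical proxy and internal targets):
# targets = [("192.168.1.1", 80), ("192.168.1.1", 25), ("192.168.1.2", 80)]
# print(scan_through_proxy("proxy.site.com", 8080, targets))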

Detecting Web App Firewalls

Web application firewalls are protective devices placed inline between the user and the web server. The application firewall analyzes HTTP traffic to determine whether it's valid and tries to prevent web attacks. You can think of them as Intrusion Prevention Systems (IPS) for the web application.

Web application firewalls are still relatively rare to encounter when assessing an application, but being able to detect them is very important. The examples explained in the following sections are not a comprehensive list of ways to fingerprint web application firewalls, but they should give you enough information to identify one when you run into this defense.

It's actually quite easy to detect whether an application firewall is running in front of an application. If, throughout your testing, you keep getting kicked out, or the session times out when you issue an attack request, there is likely an application firewall between you and the application. Another indication is when the web server does not respond the way it usually does to unusual requests but instead always returns the same type of error. Listed next are some common web app firewalls and some very simple methods of detecting them.

Teros   The Teros web application firewall technology will respond to a simple TRACE request or any unexpected HTTP method such as PUT with the following error:

TRACE / HTTP/1.0
Host: www.site.com
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

HTTP/1.0 500
Content-Type: text/html

<html><head><title>Error</title></head><body>
<h2>ERROR: 500</h2>
Invalid method code<br>
</body></html>

Another easy way to detect a Teros box is by spotting the cookie that they issue, which looks similar to this:

st8id=1e1bcc1010b6de32734c584317443b31.00.d5134d14e9730581664bf5cb1b610784

The value of the cookie will of course change, but the cookie name "st8id" is the giveaway, and in most cases the value will have a similar character set and length.

F5 TrafficShield   When you send abnormal requests to F5's TrafficShield, you might get responses that contain errors like those listed here. For instance, here we send a PUT method with no data:

PUT / HTTP/1.0
Host: www.site.com
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

HTTP/1.0 400 Bad Request
Content-Type: text/html

<html><head><title>Error</title></head>
<body><h1>HTTP Error 400</h1>
<h2>400 Bad Request</h2>
The server could not understand your request.<br>Your error ID is: 5fa97729</body></html>

TrafficShield also sets a standard cookie of its own. The cookie name is "ASINFO", and here is an example of what the cookie looks like:

ASINFO=1a92a506189f3c75c3acf0e7face6c6a04458961401c4a9edbf52606a4c47b1c3253c468fc0dc8501000ttrj40ebDtxt6dEpCBOpiVzrSQ0000

NetContinuum   Detecting a NetContinuum application firewall deployment is similar to the others: just look for its cookie. In the event that the cookie is not present, we've noticed that these devices respond to every invalid request with a 404 error, which is quite abnormal for any web server to do. The NetContinuum cookie is shown here:

NCI__SessionId=009C5f5AQEwIPUC3/TFm5vMcLX5fjVfachUDSNaSFrmDKZ/LiQEuwC+xLGZ1FAMA+
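
Since the cookies above are the easiest giveaways, a simple helper can send a harmless request and flag any cookie names associated with known web app firewall products. This is a rough sketch with a tiny, incomplete signature list drawn only from the examples in this section, and the example URL is hypothetical:

import urllib.request

# Cookie names mentioned in this section; a real signature list would be much longer.
WAF_COOKIES = {
    "st8id": "Teros",
    "ASINFO": "F5 TrafficShield",
    "NCI__SessionId": "NetContinuum",
}

def waf_cookie_check(url):
    """Return any Set-Cookie names that match known web app firewall products."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        set_cookie_headers = resp.headers.get_all("Set-Cookie") or []
    hits = []
    for header in set_cookie_headers:
        name = header.split("=", 1)[0].strip()
        if name in WAF_COOKIES:
            hits.append((name, WAF_COOKIES[name]))
    return hits

# Example (hypothetical URL):
# print(waf_cookie_check("http://www.site.com/"))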

URLScan

URLScan is a free ISAPI filter that provides great flexibility for controlling HTTP requests, but we don't consider URLScan a true application firewall. Products like these don't provide dynamic protection; instead, they rely on a lengthy configuration file of signatures or allowed lengths to stop attacks. Detecting URLScan can be simple, as long as it is implemented with its default rules.

For example, by default, URLScan has a rule that restricts the path to a length of 260 characters, so if you send a request with a path of more than 260 characters (for example, http://www.site.com/ followed by 261 slashes), URLScan will respond with a 404. URLScan will also reject the request if you add any of the following headers to the request:

  • Translate:

  • If:

  • Lock-Token:

  • Transfer-Encoding:

Adding any of these headers will cause URLScan to return a 404, whereas in any other situation the web server would simply ignore the extra headers and respond normally to the request you sent.
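
Both default-rule checks are easy to combine into one probe: request an overlong path, then a normal path with a Translate: header, and see whether both come back as 404s where a bare web server would answer normally. A sketch under those assumptions (the host is hypothetical):

import http.client

def urlscan_hints(host):
    """Send two requests that default URLScan rules reject and return the status codes."""
    hints = {}

    # 1. Path longer than the default 260-character limit.
    conn = http.client.HTTPConnection(host, timeout=5)
    conn.request("GET", "/" * 261)
    hints["long_path"] = conn.getresponse().status
    conn.close()

    # 2. Normal path, but with a header URLScan rejects by default.
    conn = http.client.HTTPConnection(host, timeout=5)
    conn.request("GET", "/", headers={"Translate": "f"})
    hints["translate_header"] = conn.getresponse().status
    conn.close()

    # Two 404s here suggests default URLScan rules are in place.
    return hints

# Example (hypothetical host):
# print(urlscan_hints("www.site.com"))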

Note 

We cover URLScan's features extensively in Appendix C.

SecureIIS   SecureIIS is like URLScan on steroids: it is a pumped-up commercial version that adds a nice GUI and some nifty features. It's a lot easier to use than editing a big configuration file like URLScan's, but detecting it is pretty similar. Study the default rules that it ships with and break them; this will cause SecureIIS to return a deny response, which by default is a 406 error code (note that the paid version allows this to be changed).

One of the default rules limits the length of any header value to 1024 characters, so just set a header value above that limit and see if the request gets denied. SecureIIS's default deny page is quite obvious: it states that a security violation has occurred and even displays the SecureIIS logo and banner. Of course, most people using this product in production will have changed that page. Observing the HTTP response can be more revealing, as SecureIIS returns an unusual 406 "Not Acceptable" response to requests with oversized headers.
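
The oversized-header probe can be scripted the same way; a 406 where the server would normally answer 200 is the telltale sign described above. A minimal sketch, with a hypothetical host and a made-up header name used purely as padding:

import http.client

def secureiis_hint(host):
    """Send a header value over 1024 characters and return the status code (406 is suspicious)."""
    conn = http.client.HTTPConnection(host, timeout=5)
    conn.request("GET", "/", headers={"X-Padding": "A" * 2048})
    status = conn.getresponse().status
    conn.close()
    return status

# Example (hypothetical host):
# print(secureiis_hint("www.site.com"))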


