Common Windows Canonicalization Mistakes

I've already touched on some of the common canonicalization mistakes made by Windows-based applications, but let's drill into what they are. Windows can represent filenames in many ways, due in part to extensibility capabilities and backward compatibility. If you accept a filename and use it for any security decision, it is crucial that you read this section.

8.3 Representation of Long Filenames

As you are no doubt aware, the legacy FAT file system, which first appeared in MS-DOS, requires that files have names of at most eight characters and a three-character extension. File systems such as FAT32 and NTFS allow long filenames; for example, an NTFS filename can be up to 255 Unicode characters long. For backward compatibility, NTFS and FAT32 by default autogenerate an 8.3-format filename that allows an application based on MS-DOS or 16-bit Windows to access the same file.


	The format of the auto-generated 8.3 filename is the first six characters of the long filename, followed by a tilde (~) and an incrementing digit, followed by the first three characters of the extension. For example, My Secret File.2001.Aug.doc becomes MYSECR~1.DOC. Observe that all illegal characters and spaces are removed from the filename first.
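The naming rule in the note can be sketched in portable C. This is only an illustration of the simplified rule as stated: the function name make_short_name is hypothetical, the sketch always uses the digit 1, strips only spaces and extra dots, and assumes the long name does not already fit the 8.3 format. The real file system handles name collisions and other illegal characters, which this sketch does not.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Builds the first candidate short name ("~1") for long_name, following the
 * simplified rule in the text. out must hold at least 13 bytes
 * (6 + "~1" + '.' + 3 + NUL). Hypothetical helper, not a Windows API. */
void make_short_name(const char *long_name, char out[13])
{
    assert(long_name != NULL && out != NULL);

    /* The last dot starts the extension; everything before it is the base. */
    const char *ext = strrchr(long_name, '.');
    const char *base_end = ext ? ext : long_name + strlen(long_name);
    size_t n = 0;

    /* First six usable characters of the base name, uppercased. */
    for (const char *p = long_name; p < base_end && n < 6; ++p) {
        if (*p == ' ' || *p == '.')
            continue;                      /* spaces and extra dots are dropped */
        out[n++] = (char)toupper((unsigned char)*p);
    }
    out[n++] = '~';
    out[n++] = '1';

    /* First three usable characters of the extension, if there is one. */
    if (ext != NULL && ext[1] != '\0') {
        size_t e = 0;
        out[n++] = '.';
        for (const char *p = ext + 1; *p != '\0' && e < 3; ++p) {
            if (*p == ' ')
                continue;
            out[n++] = (char)toupper((unsigned char)*p);
            ++e;
        }
    }
    out[n] = '\0';
}
```

Passing My Secret File.2001.Aug.doc yields MYSECR~1.DOC, matching the example in the note. A sketch like this is useful for test cases; it is not a substitute for asking the operating system for the real short or long name.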

An attacker might slip through your code if your code makes checks against the long filename and the attacker uses the short filename instead. For example, your application might deny access to Fiscal02Budget.xls to users on the 172.30.x.x subnet, but a user on the subnet using the file's short filename would circumvent your checks because the file system accesses the same file, just through its 8.3 filename. Hence, Fiscal02Budget.xls might be the same file as Fiscal~1.xls.

The following pseudocode highlights the vulnerability:

    String SensitiveFiles[] = {"Fiscal02Budget.xls", "ProductPlans.Doc"};
    IPAddress RestrictedIP[] = {172.30.0.0, 192.168.200.0};

    BOOL AllowAccessToFile(FileName, IPAddress) {
        If (FileName In SensitiveFiles[] && IPAddress In RestrictedIP[])
            Return FALSE;
        Else
            Return TRUE;
    }

    BOOL fAllow = FALSE;

    // This will deny access.
    fAllow = AllowAccessToFile("Fiscal02Budget.xls", "172.30.43.12");

    // This will allow access. Ouch!
    fAllow = AllowAccessToFile("FISCAL~1.XLS", "172.30.43.12");


	Conventional wisdom would dictate that secure systems do not include MS-DOS or 16-bit Windows applications, and hence 8.3 filename support should be disabled. More on this later.

NTFS Alternate Data Streams

I've already discussed this canonicalization mistake when describing the IIS ::$DATA vulnerability: be wary if your code makes decisions based on the filename extension. For example, IIS looked for an .asp extension and routed the request for the file to Asp.dll. When the attacker requested a file with the .asp::$DATA extension, IIS failed to see that the request was a request for the default NTFS data stream, and the ASP source code was returned to the user.


	You can detect streams in your files by using tools such as Streams.exe from Sysinternals (www.sysinternals.com), Crucial ADS from Crucial Security (www.crucialsecurity.com), or Security Expressions from Pedestal Software (www.pedestalsoftware.com).

Also, if your application uses alternate data streams, you need to make sure that the code correctly parses the filename to read or write to the correct stream. More on this later. As an aside, streams do not have a separate access control list (ACL); they use the same ACL as the file in question.
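If your application has no need for streams, one defensive option is to refuse any name that appears to carry a stream specifier before looking at its extension. The following is a minimal sketch in portable C; the function name has_stream_specifier is hypothetical, and the sketch assumes the name has no \\?\ prefix and that a colon in the second position belongs to a drive letter.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if path contains a ':' anywhere other than directly after a
 * leading drive letter, which suggests a stream specifier such as
 * "default.asp::$DATA" or "file.txt:hidden". Hypothetical helper. */
bool has_stream_specifier(const char *path)
{
    assert(path != NULL);

    /* Skip a leading drive designator such as "c:". */
    if (isalpha((unsigned char)path[0]) && path[1] == ':')
        path += 2;

    return strchr(path, ':') != NULL;
}
```

With this check, default.asp::$DATA is rejected before any code inspects the extension, while c:\inetpub\wwwroot\default.asp passes. Note the ambiguity the sketch inherits: a one-letter filename followed by a stream name, such as a:stream, looks like a drive designator and is not flagged.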

Trailing Characters

I've seen a couple of vulnerabilities in which a trailing dot (.) or backslash (\) appended to a filename caused the application parsing the filename to get the name wrong. Adding a dot is very much a Win32 issue because the file system determines that the trailing dot should not be there and strips it from the filename before accessing the file. The trailing backslash is usually a Web issue, which I'll discuss in Chapter 12. Take a look at the following code to see what I mean by the trailing dot:

    char b[20];
    lstrcpy(b, "Hello!");

    // Write the data to somefile.txt.
    HANDLE h = CreateFile("c:\\somefile.txt",
                          GENERIC_WRITE, 0, NULL,
                          CREATE_ALWAYS,
                          FILE_ATTRIBUTE_NORMAL, NULL);
    if (h != INVALID_HANDLE_VALUE) {
        DWORD dwNum = 0;
        WriteFile(h, b, lstrlen(b), &dwNum, NULL);
        CloseHandle(h);
    }

    // Read it back, this time with a trailing dot on the filename.
    h = CreateFile("c:\\somefile.txt.",
                   GENERIC_READ, 0, NULL,
                   OPEN_EXISTING,
                   FILE_ATTRIBUTE_NORMAL, NULL);
    if (h != INVALID_HANDLE_VALUE) {
        char b[20];
        DWORD dwNum = 0;
        ReadFile(h, b, sizeof b, &dwNum, NULL);
        CloseHandle(h);
    }

You can also find this example code on the companion CD in the folder Secureco\Chapter 8\TrailingDot. See the difference in the filenames? The second call to access somefile.txt has a trailing dot, yet somefile.txt is opened and read correctly when you run this code. This is because the file system removes the invalid character for you! As you can see, somefile.txt. is the same as somefile.txt, regardless of the trailing dot.
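Code that compares a user-supplied name against a list of names can mirror this behavior by trimming the name before the comparison. Here is a small portable sketch; strip_trailing_dots is a hypothetical helper, and trimming trailing spaces as well as dots is an addition beyond the dot case shown above.

```c
#include <assert.h>
#include <string.h>

/* Removes trailing dots and spaces from name in place and returns name.
 * The special names "." and ".." are left untouched. Hypothetical helper. */
char *strip_trailing_dots(char *name)
{
    assert(name != NULL);

    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return name;

    size_t len = strlen(name);
    while (len > 0 && (name[len - 1] == '.' || name[len - 1] == ' '))
        name[--len] = '\0';
    return name;
}
```

After trimming, somefile.txt. and somefile.txt compare equal, so a deny list that mentions only somefile.txt also catches the dotted form.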

\\?\ Format

Normally, a filename is limited to MAX_PATH (260) ANSI characters. The Unicode versions of numerous file-manipulation functions allow you to extend this to 32,000 Unicode characters by prepending \\?\ to the filename. The \\?\ tells the function to turn off path parsing. However, each component in the path cannot be more than MAX_PATH characters long. So, in summary, \\?\c:\temp\myfile.txt is the same as c:\temp\myfile.txt.


	No known exploit for the \\?\ filename format exists; I've included the format for completeness.

Directory Traversal and Using Parent Paths (..)

The vulnerabilities in this section are extremely common in Web and FTP servers, but they're potential problems in any system. The first vulnerability lies in allowing attackers to walk out of your tightly controlled directory structure and wander around the entire hard disk. The second issue relates to two or more names for a file.

Walking out of the Current Directory

Let's say your application contains data files in c:\datafiles. In theory, users should not be able to access any other files from anywhere else in the system. The fun starts when attackers attempt to access ..\boot.ini to access the boot configuration file in the root of the boot drive or, better yet, ..\winnt\repair\sam to get a copy of the local SAM database file, which contains the usernames and password hashes for all the local user accounts. Now the attacker can run a password-cracking tool such as L0phtCrack (available at www.atstake.com) to determine the passwords by brute-force means. This is why strong passwords are crucial!


	Note that in Windows 2000 and later, the SAM file is encrypted using SysKey by default, which makes this attack somewhat more complex to achieve. Read Knowledge Base article Q143475, "Windows NT System Key Permits Strong Encryption of the SAM," at support.microsoft.com/support/kb/articles/Q143/4/75.asp for more information regarding SysKey.
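One way to catch requests such as ..\boot.ini is to track how deep each component takes you relative to the data directory and to refuse the name as soon as the depth goes negative. The sketch below is portable C under stated assumptions: escapes_base_dir is a hypothetical helper, the input is meant to be a relative name, and anything rooted or containing a colon is refused outright rather than analyzed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if the relative path rel would climb above the directory it
 * is resolved against. Components may be separated by '\\' or '/'.
 * Hypothetical helper; a sketch, not a complete canonicalizer. */
bool escapes_base_dir(const char *rel)
{
    assert(rel != NULL);

    /* Rooted paths, drive letters, and stream names are refused outright. */
    if (rel[0] == '\\' || rel[0] == '/' || strchr(rel, ':') != NULL)
        return true;

    int depth = 0;
    const char *p = rel;
    while (*p != '\0') {
        size_t len = strcspn(p, "\\/");       /* length of this component */
        if (len == 2 && p[0] == '.' && p[1] == '.') {
            if (--depth < 0)
                return true;                   /* walked above the base */
        } else if (len > 0 && !(len == 1 && p[0] == '.')) {
            ++depth;                           /* ordinary directory or file */
        }
        p += len;
        if (*p != '\0')
            ++p;                               /* skip the separator */
    }
    return false;
}
```

A name such as reports\..\q1.dat stays inside the base directory and is accepted, whereas reports\..\..\boot.ini is rejected. The check looks only at the text of the name, so it should complement, not replace, letting the operating system resolve the final path and comparing that against the allowed directory.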

Multiple File Names

If we assume a directory structure of c:\dir\foo\files\secret, the file c:\dir\foo\myfile.txt is the same as c:\dir\foo\files\secret\..\..\myfile.txt, as is c:\dir\foo\files\..\myfile.txt, as is c:\dir\..\dir\foo\files\..\myfile.txt! Oh my!
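To see that these names really are equivalent, you can collapse the . and .. components textually. The following portable C sketch does that for simple drive-rooted paths; collapse_dots is a hypothetical helper, it assumes backslash separators and a first component such as c:, and it ignores the other aliases discussed in this section. On Windows, the system itself can resolve such paths (for example, through GetFullPathName).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Collapses "." and ".." components of a backslash-separated path whose
 * first component is a drive such as "c:". Returns 0 on success, or -1 if
 * the result does not fit in out or a ".." would climb above the root.
 * Hypothetical helper for illustration only. */
int collapse_dots(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);

    size_t n;                            /* bytes written to out so far */
    size_t root;                         /* length of the leading "c:" part */
    const char *p = in;

    /* Copy the first component (the drive) unchanged. */
    size_t len = strcspn(p, "\\");
    if (len + 1 > out_size)
        return -1;
    memcpy(out, p, len);
    n = root = len;
    p += len;

    while (*p != '\0') {
        ++p;                             /* skip the separator */
        len = strcspn(p, "\\");
        if (len == 2 && p[0] == '.' && p[1] == '.') {
            if (n == root)
                return -1;               /* nothing left to remove */
            while (out[n - 1] != '\\')   /* drop the last component... */
                --n;
            --n;                         /* ...and its leading separator */
        } else if (len > 0 && !(len == 1 && p[0] == '.')) {
            if (n + 1 + len + 1 > out_size)
                return -1;
            out[n++] = '\\';
            memcpy(out + n, p, len);
            n += len;
        }
        p += len;
    }
    out[n] = '\0';
    return 0;
}
```

All three variants of the path given above collapse to c:\dir\foo\myfile.txt, which is the form your security checks should be made against.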

Absolute vs. Relative Filenames

If the user gives you a filename to open with no directory name, where do you look for the file? In the current directory? In a folder specified in the PATH environment variable? Your application might not know and might load the wrong file. For example, if a user requests that your application open File.exe, does your application load File.exe from the current directory or from a folder specified in PATH?

Case-Insensitive Filenames

There have been no vulnerabilities that I know of in Windows concerning the case of a filename. The NTFS file system is case-preserving but case-insensitive. Opening MyFile.txt is the same as opening myfile.txt. The only time this is not the case is when your application is running in the Portable Operating System Interface for UNIX (POSIX) subsystem. However, if your application does perform case-sensitive filename comparisons, you might be vulnerable in the same way as the Apple Mac OS X and Apache Web server, as described earlier in this chapter.

Device Names and Reserved Names

Many operating systems, including Windows, have support for naming devices and access to the devices from the console. For example, COM1 is the first serial port, AUX is the default serial port, LPT2 is the second printer port, and so on. The following reserved words cannot be used as the name of a file: CON, PRN, AUX, CLOCK$, NUL, COM1 through COM9, and LPT1 through LPT9. Also, reserved words followed by an extension (for example, NUL.txt) are invalid filenames. But wait, there's more: each of these devices exists in every directory. For example, c:\Program Files\COM1 is the first serial port, as is d:\NorthWindTraders\COM1.

If a user passes a filename to you and you blindly open the file, you will have problems if the file is a device and not a real file. For example, imagine you have one worker thread that accepts a user request containing a filename. Now an attacker requests \document.txt\com1, and your application opens the file for read access. The thread is blocked until the serial port times out! Luckily, there's a way to determine what the file type is, and I'll cover that shortly.
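As a first line of defense before opening anything, you can also screen the name itself against the reserved words listed above. The sketch below is portable C and purely name-based; is_reserved_device_name is a hypothetical helper, and it checks only the last path component, up to the first dot or colon, against that list.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if the last component of path, ignoring any extension, is one
 * of the reserved device names from the text (case-insensitive comparison).
 * Hypothetical helper; a name check only, it does not query the system. */
bool is_reserved_device_name(const char *path)
{
    static const char *const reserved[] = {
        "CON", "PRN", "AUX", "CLOCK$", "NUL",
        "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
    };
    assert(path != NULL);

    /* Find the start of the last path component. */
    const char *name = path;
    for (const char *p = path; *p != '\0'; ++p)
        if (*p == '\\' || *p == '/')
            name = p + 1;

    /* Compare only the part before the first dot or colon,
     * so NUL.txt matches NUL and COM1: matches COM1. */
    size_t len = strcspn(name, ".:");

    for (size_t i = 0; i < sizeof reserved / sizeof reserved[0]; ++i) {
        const char *r = reserved[i];
        if (strlen(r) != len)
            continue;
        size_t k = 0;
        while (k < len && toupper((unsigned char)name[k]) == r[k])
            ++k;
        if (k == len)
            return true;
    }
    return false;
}
```

This flags \document.txt\com1 and NUL.txt while leaving a name such as common.txt alone. Because it inspects only the text of the name, treat it as a quick filter rather than proof that a name refers to an ordinary file.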

Device Name Issues on Other Operating Systems

Canonicalization issues are not, of course, unique to Windows. For example, on Linux it is possible to lock certain applications by attempting to open devices rather than files. Examples include /dev/mouse, /dev/console, /dev/tty0, /dev/zero, and many others.

A test using Mandrake Linux 7.1 and Netscape 4.73 showed that attempting to open file:///dev/mouse locked the mouse and necessitated a reboot of the computer to get control of the mouse. Opening file:///dev/zero froze the browser. These vulnerabilities are quite serious because an attacker can create a Web site that has image tags such as <IMG SRC=file:///dev/mouse>, which would lock the user's mouse.

You should become familiar with device names if you plan to build applications on many operating systems.

UNC Shares

Files can be accessed through Universal Naming Convention (UNC) shares. A UNC share is used to access file and printer resources in Windows and is treated as a file system by the operating system. Using UNC, you can map a new disk drive letter that points to a local or remote server. For example, let's assume you have a computer named BlakeLaptop, which has a share named Files that shares documents held in the c:\My Documents\Files directory. You can map z: onto this share by using net use z: \\BlakeLaptop\Files, and then z:\myfile.txt and c:\My Documents\Files\myfile.txt will point to the same file.

You can access a file directly by using its UNC name rather than by mapping to a drive first. For example, \\BlakeLaptop\Files\myfile.txt is the same as z:\myfile.txt. Also, you can combine SMB with a variation of the \\?\ format; for example, \\?\UNC\BlakeLaptop\Files is the same as \\BlakeLaptop\Files.
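Because the \\?\ and \\?\UNC\ forms are just other spellings of the same name, code that compares names may want to fold them into the plain form first. Here is a minimal portable C sketch; strip_long_path_prefix is a hypothetical helper, and it assumes the prefixes are written exactly as shown, with UNC in uppercase.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Rewrites a name that starts with the \\?\UNC\ prefix as a plain UNC name,
 * and removes a bare \\?\ prefix. Names without either prefix are copied
 * unchanged. Returns 0 on success, -1 if out is too small.
 * Hypothetical helper for illustration only. */
int strip_long_path_prefix(const char *path, char *out, size_t out_size)
{
    static const char unc_prefix[] = "\\\\?\\UNC\\";
    static const char raw_prefix[] = "\\\\?\\";
    const char *head = "";               /* text to put in front of the rest */
    const char *rest;

    assert(path != NULL && out != NULL);
    rest = path;

    if (strncmp(path, unc_prefix, sizeof unc_prefix - 1) == 0) {
        head = "\\\\";                   /* plain UNC names start with two backslashes */
        rest = path + (sizeof unc_prefix - 1);
    } else if (strncmp(path, raw_prefix, sizeof raw_prefix - 1) == 0) {
        rest = path + (sizeof raw_prefix - 1);
    }

    if (strlen(head) + strlen(rest) + 1 > out_size)
        return -1;
    strcpy(out, head);
    strcat(out, rest);
    return 0;
}
```

With this folding, \\?\UNC\BlakeLaptop\Files becomes \\BlakeLaptop\Files and \\?\c:\temp\myfile.txt becomes c:\temp\myfile.txt. It does not tell you that z:\myfile.txt and \\BlakeLaptop\Files\myfile.txt are the same file; only the operating system knows how drive letters are mapped.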

Be aware that Windows XP includes a Web-based Distributed Authoring and Versioning (WebDAV) redirector, which allows the user to map a Web-based virtual directory to a local drive by using the Add Network Place Wizard. This means that redirected network drives can reside on a Web server, not just on a file server.