UNIX applications have to be very careful when interacting with the file system, because of the danger of race conditions. Race conditions, in general, are situations in which two different parties simultaneously try to operate on the same resource with deleterious consequences. In the context of security flaws, attackers try to manipulate the resource out from underneath the victim. For UNIX file system code, these issues usually occur when you have a process that gets preempted or enters a blocking system call at an inopportune moment. This inopportune moment is typically somewhere in the middle of a sensitive multiple-step operation involving file and directory manipulation. If another process wins the race and gets scheduled at the right time in the middle of this "window of inopportunity," it can often subvert a vulnerable nonatomic sequence of file operations and wrest privileges from the application. Listing 9-3 shows an example. Listing 9-3. Race Condition in access() and open()
This code represents a setuid root program opening the /tmp/userfile file, which can be controlled by users. It uses the access() function to make sure users running the program have permission to read from the /tmp/userfile file. access() is specially designed for setuid programs; it performs the privilege check by using the process's real user ID rather than the effective user ID. For a setuid root program, this is typically the user that ran the executable. If users don't have permission to read /tmp/userfile, the program exits. This call to access() protects the program from following a symbolic link at /tmp/userfile and opening a sensitive file or from opening a hard link to a sensitive file. The problem is that attackers can alter /tmp/userfile after the access() check but before opening the file. Figure 9-7 outlines this attack. Say attackers create an innocuous regular file named /tmp/userfile. They let the preceding code do its access check and come back with a clean result. Then the process gets swapped out, and a process controlled by attackers runs. This evil process can unlink /tmp/userfile and replace it with a symbolic link to /etc/shadow. When the privileged program resumes, it does open("/tmp/userfile", O_RDONLY), which causes it to follow the symbolic link to /etc/shadow. The privileged program then reads in the shadow password file, which likely leads to an exposure of sensitive information later on. Figure 9-7. Program flow for Listing 9-3
Auditing Tip The access() function usually indicates a race condition because the file it checks can often be altered before it's actually used. The stat() function has a similar problem. TOCTOUThe concept of exploiting the discrepancy between a security check on a resource and the use of a resource is known as a time of check to time of use (TOCTOU or TOCTTOU) issue. This concept doesn't apply to just file manipulation. Any time that the state of a resource can change in between when an access check is done and when an action is performed on it creates an opportunity for TOCTOU attacks. If you refer to Figure 9-7, you can see the time of check and time of use labeled for clarity. It might seem unrealistic that a program could get swapped out at the exact moment for attackers to take advantage of this "window of inopportunity." Remember that attackers are determined and resourceful, and it's usually safe to bet they can find some way to exploit even an improbable vulnerability. In the scenario depicted in Figure 9-7, attackers could take action in the background to try to slow down the system, such as a network-intensive flood of data or heavy use of the file system. They could also send job control signals to the setuid root program that is performing the potentially dangerous file operations to stop and start it constantly in a tight loop. Depending on the file system, they might be able to watch for access times on files that are being updated or even watch the progress of the setuid program through system-specific interfaces. There's plenty of system-specific functionality that can be leveraged with some creativity. For example, Linux 2.4 and later has a flag that can be used with the fcntl() function, F_NOTIFY, that causes a signal to be delivered to your program when certain actions occur in a directory. Several advanced race condition exploits for Linux make use of this flag. The stat() Family of FunctionsMany of the TOCTOU examples you encounter feature the use of stat() or one of its variations. These functions are designed to give the caller extensive information about a file. The three primary functions that return this information are stat(), lstat(), and fstat(). The stat() function has the following prototype: int stat(const char *pathname, struct stat *buf); The pathname parameter specifies the file to be checked and the buf parameter points to a structure that's filled in with file information. lstat() works similarly, except, as noted in "Symbolic Links," if pathname is a symbolic link, information is returned about the link rather than the link's target. Finally, there is fstat(), which takes a file descriptor rather than a pathname. Of these functions, fstat() is the most resilient function in terms of race conditions, as it's operating on an previously opened file. The information returned in the stat structure includes most of the statistics about a file that might be useful to developers. Information returned includes, but is not limited to, the owner of the file, the owning group of the file, the number of hard links to the file, and the type of the file. By examining the type of the file, it is possible to use these functions to determine whether a file is really a regular file, a link file, a device file, and so on. The following macros are defined for testing the file type:
As you have probably guessed, a standard method for protecting against link-based attacks is to use lstat() on a requested filename and either explicitly check if it's a link, or check if it's a regular file and fail if it is not. Say a privileged program wants to work with a file but wants to make sure it isn't going to be tricked into following a symbolic link. Listing 9-4 shows some code from the Kerberos 4 library that's used by a kerberized login daemon. Listing 9-4. Race Condition from Kerberos 4 in lstat() and open()
This code uses lstat() to check whether the file is a symbolic link. If it isn't, the program knows it's safe to open the file. However, what happens if attackers replace the file with a symbolic link after the lstat() call but before the open() call? It causes a TOCTOU situation. The potential attack is shown in Figure 9-8. In this vulnerability, attackers are able to overwrite arbitrary files as root when the kerberized login daemon creates new tickets. (Note that this code is also vulnerable to a hard link attack because it doesn't check the link count lstat() returns.) Figure 9-8. Program flow for Listing 9-4
Note that it's possible to have a race condition if you do things in the opposite order, with the check coming after the use, as shown in Listing 9-5. Listing 9-5. Race Condition in open() and lstat()
It might seem as though this program isn't susceptible to a race condition because it opens the file first, and then checks whether it's valid. However, it suffers from a similar problem. Attackers can create the malicious symbolic link the program opens, and then delete or rename that symbolic link and create a normal file with the same name. If they get the timing right, lstat() operates on the normal file, and the security check is passed. The kernel doesn't care if the file that fd indexes has been deleted or renamed. As long as the file descriptor is kept open, the file and its corresponding inode in the file system stay available. This process is shown in Figure 9-9. Figure 9-9. Program flow for Listing 9-5
Here's another example of a race condition from an old version of the SunOS binmail program, discovered by a rather clever hacker group known as "8 Little Green Men," or 8lgm for short. Binmail runs as root and is used to deliver mail to local users on the system. This local mail delivery is performed by opening the user's mail spool file in a public sticky directory and appending the new mail to that file. The following code is used to open the mail spool file: if (!(created = lstat(path, &sb)) && (sb.st_nlink != 1 || S_ISLNK(sb.st_mode))) { err(NOTFATAL, "%s: linked file", path); return(1); } if ((mbfd = open(path, O_APPEND|O_WRONLY|O_EXLOCK, S_IRUSR|S_IWUSR)) < 0) { if ((mbfd = open(path, O_APPEND|O_CREAT|O_WRONLY|O_EXLOCK, S_IRUSR|S_IWUSR)) < 0) { err(NOTFATAL, "%s: %s", path, strerror(errno)); return(1); } } This program first checks to see whether the mail spool is a symbolic link or a hard link by performing an lstat(). If the file doesn't exist or looks like a normal file, binmail attempts to open the file for appending. If the open fails, binmail attempts to open the file again, but it tells the OS to create the file if it doesn't exist. The problem is the race condition between the lstat() call and the open() call. Attackers can place an innocuous file there or delete the mail spool, wait for the lstat() to occur, and then place a symbolic link or hard link pointing to a sensitive file. The mail sent to that user is appended to the sensitive file, if it exists; if it doesn't, it's created as root and written to. Furthermore, a symbolic link pointing to a target file that isn't present can be used to have binmail create an arbitrary file as root. (This bug is documented in a bugtraq post by 8lgm, archived at http://seclists.org/bugtraq/1994/Mar/0025.html.) File Race ReduxMost file system race conditions can be traced back to using system calls that work with pathnames. As discussed, every time a system call takes a pathname argument, the kernel resolves that pathname to an inode by traversing through the relevant directory entries. So if you have this code: stat("/tmp/bob", &sb); stat("/tmp/bob", &sb); The first call to stat() causes the kernel to look up the inode for the /tmp/bob pathname, open that inode, and collect the relevant information. The second time stat() is called, the same thing happens all over again. If someone changes /, /tmp, or /tmp/bob between the two stat() calls, the system could easily end up looking at two different files. Now take a look at this code: fd=open("/tmp/bob", O_RDWR); fstat(fd, &sb); fstat(fd, &sb); The call to open() resolves the /tmp/bob pathname to an inode. It then loads this inode into kernel memory, creates the required data structures to track an open file, and places a pointer to them in the process's file descriptor table. The call to fstat() simply takes the file descriptor index fd, looks in the table and pulls out the pointer, and ends up looking directly at the data structure encapsulating the inode. The second fstat() does the same thing as the first one. If someone unlinked /tmp/bob in the middle of the fstat() calls, it wouldn't matter because the file descriptor would still reference the inode on the disk that was /tmp/bob when open() was called. That inode isn't deallocated until its reference count goes away, which doesn't happen until the process uses close(fd). Renaming and moving the file doesn't change the target of fstat(), either. The permissions are established by how the file is opened and the security checks occurring at the time it's opened, so even if the file is marked with permission bits 0000, it doesn't matter to the process after it has successfully opened the file for reading. Pathnames Versus File DescriptorsThe basic difference between pathnames and file descriptors is in how they're used by functions. Functions that take pathnames are looking up which file to work with each time they're called. Functions that work with file descriptors are going straight to the same inode that was opened initially. Any time you see multiple system calls that use a file path, it's worth considering what would happen if the file was changed in between those calls. Remember that changing any directory component between the starting directory and the target file can potentially disrupt a process's intended file actions. In general, if you see anything besides a single filename-based system call to open a resource followed by multiple file-descriptor-based calls, there's a reasonable chance of a race condition occurring. Evading File Access ChecksOne basic pattern to look for is a security check function that uses a filename followed by a usage function that uses a filename. The basic vulnerability pattern is the file being checked using something like stat(), lstat(), or access(), and, providing that the check succeeds using something like open(), fopen(), chmod(), chgrp(), chown(), unlink(), rename(), link(), or symlink(). In general, the safe form of a security check involves checks and usage on a file descriptor. It's guaranteed that a file descriptor, after the kernel creates it, refers to the same file system object for the duration of its lifetime. Therefore, functions that work with a file descriptor can often be used in a safe fashion when their filename counterparts can't. For example, fstat(), fchmod(), and fchown() can be used to query or modify a file that has already been opened safely, but the corresponding stat(), chmod(), and chown() functions might be susceptible to race conditions if the file is tampered with right after it has been opened. Permission RacesSometimes an application will temporarily expose a file to potential modification for a short window of time by creating it with insufficient permissions. If attackers can open that file during this window, they get an open file handle to the file that locks in the insufficient permissions, and lets them retain access to the file after the permissions have been corrected, as shown in this example: FILE *fp; int fd; if (!(fp=fopen(myfile, "w+"))) die("fopen"); /* we'll use fchmod() to prevent a race condition */ fd=fileno(fp); /* lets modify the permissions */ if (fchmod(fd, 0600)==-1) die("fchmod"); This code excerpt opens a file for reading and writing by using the fopen() function. If the file doesn't already exist, it's created by the call to fopen(), and the umask value of the process determines its initial file permissions. This will be discussed in more detail in "The Stdio File Interface," but the important detail that need to know for now is that fopen() calls open() with a permission argument of octal 0666. Therefore, if the process's umask doesn't take away world write permissions, any user on the file system is able to write to the file. The program immediately changes its file to mode 0600, but it's too latea race condition has already occurred. If another process can use open() on the file requesting read and write access, immediately after it's created but before its permission bits are changed, that process has a file descriptor open to the file with read and write permissions. Ownership RacesIf a file is created with the effective privileges of a nonprivileged user, and the file owner is later changed to that of a privileged user, a potential race condition exists, as shown in this example: drop_privs(); if ((fd=open(myfile, O_RDWR | O_CREAT | O_EXCL, 0600))<0) die("open"); regain_privs(); /* take ownership of the file */ if (fchown(fd, geteuid(), getegid())==-1) die("fchown"); This code is similar to the permission race code you examined previously. A privileged application temporarily drops its privileges to create a file safely. After the file is created, it wants to set file ownership to root. To do this, the program regains its root privileges and then changes the file's ownership with the fchown() system call. The vulnerability is that if unprivileged users manage to open the file between the call to open() and the call to fchown(), they get a file descriptor with a file access mask permitting read and write access to the file. Directory RacesPrograms that traverse through directories in the file system have to be careful about trusting the integrity of the directory hierarchy. If a program descends into user-controllable directories, users can often move directories around in devious ways from under the program and cause it to operate on sensitive files inadvertently. CaveatsIf a program attempts to recurse through directories, it needs to account for infinitely recursive symbolic links. The kernel notices infinite symbolic links as it resolves a pathname, and it returns an error in the case of too much recursion. If a program attempts to traverse a path itself, it might need to replicate the logic the kernel uses to avoid ending up in an infinite loop. Another possible point of confusion that you need to be aware of is that symbolically linked directories are not reflected in pathnames returned by system calls that retrieve a current path. If you're using a command shell and issue cd to change to a directory that's a symbolic link, typing pwd reflects that symbolic link. However, from the kernel's perspective, you're in the actual target directory, and any system call to return your current path doesn't include the symbolic link. If a symbolic link named /bob points to the /tmp/bobshouse directory, and you change your current directory to /bob, the getcwd() function reports your current directory to you as /tmp/bobshouse, not /bob. Directory Symlinks for Exploiting unlink()It's important to consider the effects of malicious users manipulating directories that are one or two levels higher than a process's working space. Wojciech Purczynski discovered a vulnerability in the Solaris implementation of the UNIX job-scheduling at command. The -r argument to at tells the program to delete a particular job ID. According to Wojciech, at had roughly the following logic: logic for /usr/bin/at -r JOBNAME /* chdir into at spool directory */ chdir("/var/spool/cron/atjobs") /* check to make sure that the file is owned by the user */ stat64(JOBNAME, &statbuf) if (statbuf.st_uid != getuid()) exit(1); /* unlink the file */ unlink("JOBNAME") The at command changes to the atjobs spool directory, and if users own the file corresponding to the job they specify, the job file is deleted. The first vulnerability in at is that the job name can contain ../ path components. So attackers could use the following command: at -r ../../../../../../tmp/somefile The at command would delete /tmp/somefile, but only if somefile is owned by the user. So you can use it to delete files you own, which isn't all that interesting. However, there's a race condition between the call to stat() and the call to unlink() in the code. Keep in mind that unlink() doesn't follow symbolic links on the last directory component. So if you use the normal attack of putting a normal file for stat() to see, deleting it, and placing a symlink to the sensitive file, the unlink() call would just delete the symbolic link and not care what it pointed to. The trick to exploiting this code is to remember that unlink() follows symbolic links in directory components other than the last component. This attack is shown in Figure 9-10. Figure 9-10. Attacking the Solaris at command
First, attackers create a /tmp/bob directory, and in that directory create a normal file called shadow. The attackers let at run and perform the stat() check on the /tmp/bob/shadow file. The stat() check succeeds because it sees a normal file owned by the correct user. Then attackers delete the /tmp/bob/shadow file and the /tmp/bob directory. Next, they create a symbolic link so that /tmp/bob points to /etc. The at command proceeds to unlink /tmp/bob/shadow, which ends up unlinking /etc/shadow and potentially bringing down the machine. Moving Directories Underneath a ProgramWojciech Purczynski also discovered an interesting vulnerability in the GNU file utils package. The code is a bit complicated, so the easiest way to show the issue is show the program's behavior at a system call trace level. The following code is based on his advisory (archived at http://seclists.org/bugtraq/2002/Mar/0160.html): Example of 'rm -fr /tmp/a' removing '/tmp/a/b/c' directory tree: (strace output simplified for better readability) chdir("/tmp/a") = 0 chdir("b") = 0 chdir("c") = 0 chdir("..") = 0 rmdir("c") = 0 chdir("..") = 0 rmdir("b") = 0 fchdir(3) = 0 rmdir("/tmp/a") = 0 If you have a directory tree of /tmp/a/b/c, and you tell rm to recursively delete /tmp/a, it basically recurses into the deepest directory /tmp/a/b/c, and then uses chdir("..") and removes c. The rm program then uses chdir("..") to back up one more directory and delete b. Next, it uses fchdir() to go back to the original starting directory and delete /tmp/a. Wojciech's attack is quite clever. Say you let the program get all the way into the c directory, so it has a current working directory of /tmp/a/b/c. You can modify the directory structure before rm uses chdir(".."). If you move the c directory so that it's underneath /tmp, the rm program is suddenly in the /tmp/c directory instead of /tmp/a/b/c. From this point, it recurses upward too far and starts recursively removing every file on the system. Note Nick Cleaton discovered similar race conditions in the fts library (documented at http://security.freebsd.org/advisories/FreeBSD-SA-01:40.fts.asc), which is used to traverse through file systems on BSD UNIX derivatives. He's quite clever, too, even though he's not Polish. |