The Sleuth Kit (TSK) and the Autopsy Forensic Browser are open source Unix-based tools that I first released (in some form) in early 2001. TSK is a collection of over 20 command line tools that can analyze disk and file system images for evidence. To make the analysis easier, the Autopsy Forensic Browser can be used. Autopsy is a front end to the TSK tools and provides a point-and-click type of interface.
This appendix gives more details about TSK and Autopsy. TSK is used throughout this book in the examples, but this is the only place that describes how you can use it. Both Autopsy and TSK can be downloaded for free from http://www.sleuthkit.org.
The Web site also contains information about e-mail lists for tool users and developers and the bi-monthly "Sleuth Kit Informer" newsletter, which contains articles on using TSK, Autopsy, and other open source investigation tools.
The Sleuth Kit
TSK contains over 20 command line tools, which are organized into groups. The groups include disk tools, volume tools, file system tools, and searching tools. The file system tools are further organized into the data categories that we discussed in Chapter 8, "File System Analysis." Each tool name has two parts, where the first part identifies its group and the second part identifies its function. For example, fls is a file name category tool (the f) that lists (the ls), and the istat tool is in the metadata category (the i) that displays statistics (the stat).
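The naming convention can be expressed as a small lookup table. The following Python sketch is only an illustration: the describe helper and both tables are hypothetical, they are not part of TSK, and they cover only the prefixes and suffixes that appear in this appendix.

```python
# Hypothetical lookup tables for the naming convention described above.
# Only prefixes and suffixes mentioned in this appendix are included.
PREFIXES = {
    "fs": "file system category",
    "d": "content category",
    "i": "metadata category",
    "f": "file name category",
    "j": "application category (journal)",
}
SUFFIXES = {
    "ls": "lists",
    "stat": "displays statistics",
    "cat": "displays contents",
    "find": "searches",
    "calc": "calculates addresses",
}

def describe(tool):
    """Split a tool name into (group, function) descriptions."""
    # Try longer prefixes first so that "fsstat" matches "fs" before "f".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        suffix = tool[len(prefix):]
        if tool.startswith(prefix) and suffix in SUFFIXES:
            return PREFIXES[prefix], SUFFIXES[suffix]
    raise ValueError("unrecognized tool name: " + tool)

print(describe("fls"))
print(describe("istat"))
```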
This section gives an overview of each of the tools in TSK. At the time of this writing, the current version is 1.73, but there are plans for big changes in a 2.00 release. Those changes are not included in this description, but 2.00 could be available by the time you read this. We will start from the bottom and work our way up. Not all option flags are listed here. Refer to the man pages or the Web site for more details.
There is only one disk tool in TSK, which is the diskstat tool. diskstat currently runs only on Linux, and it gives the statistics about an ATA hard disk. diskstat was used in Chapter 3, "Hard Disk Acquisition," when we looked for Host Protected Areas (HPA) before acquiring a disk. The tool displays the total number of sectors and the user-accessible sectors, which show if an HPA exists. Refer to "A Case Study Using dd" in Chapter 3 for a specific example.
Volume System Tools
The contents of a disk are organized into volumes, and TSK includes one tool, mmls, that lists the partition layout of a volume. The mmls tool was used in Chapters 5, "PC-based Partitions," and 6, "Server-based Partitions," of this book, and it supports DOS (dos), Apple (mac), BSD (bsd), Sun (sun), and GPT (gpt) partitions. The type of partition table can be specified on the command line by using the -t argument followed by one of the type names given in parentheses in this paragraph.
The output of mmls is sorted by the starting address of the partition, regardless of where it is located in the table. It also shows you which sectors in the volume are not allocated to a partition. Refer to any of the specific partition types in Chapters 5 and 6 for examples.
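The idea of sorting the entries by starting address and reporting the gaps can be sketched in a few lines of Python. This is not how mmls is implemented; the layout function, the tuple format, and the sample table are all made up for illustration, and the sketch assumes that partitions do not overlap and ignores the sectors used by the partition tables themselves.

```python
def layout(partitions, total_sectors):
    """Return (start, end, label) rows sorted by start, with gaps filled in.

    partitions: list of (start_sector, length_in_sectors, label) tuples.
    Assumes the partitions do not overlap.
    """
    rows = []
    next_free = 0
    for start, length, label in sorted(partitions):
        if start > next_free:
            # Sectors between the previous partition and this one.
            rows.append((next_free, start - 1, "Unallocated"))
        rows.append((start, start + length - 1, label))
        next_free = start + length
    if next_free < total_sectors:
        # Sectors after the last partition.
        rows.append((next_free, total_sectors - 1, "Unallocated"))
    return rows

# Made-up table entries, deliberately listed out of order.
table = [(2056320, 1000000, "Linux"), (63, 2056257, "NTFS")]
for row in layout(table, 4000000):
    print("%010d  %010d  %s" % row)
```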
File System Tools
Inside most volumes is a file system, and the bulk of TSK is in the file system layer. The file system tools in TSK are based on the tools from The Coroner's Toolkit (TCT) (http://www.porcupine.org), which is by Dan Farmer and Wietse Venema. There are currently 13 tools in the file system layer, and they are organized into five categories. The tools currently require a raw partition image as input, but version 2.00 will support disk images.
The file system tools support Ext2/3 (linux-ext2, linux-ext3), FAT (fat, fat12, fat16, fat32), NTFS (ntfs), and UFS1/2 (freebsd, netbsd, openbsd, solaris) file system formats. They also support raw and swap images to view individual pages. The file system type must be specified with the -f flag and one of the types given previously in parentheses.
File System Category
The file system category of data includes the data that describes the layout and general information about a file system. This data can be displayed by using the fsstat tool, which will read the boot sector or superblock and other data structures that are specific to the different types of file systems. The type of data in the output of fsstat is different for each file system because different types of data are available. Refer to the "File System Category" sections of Chapters 9, "FAT Concepts and Analysis," 12, "NTFS Analysis," 14, "Ext2 and Ext3 Concepts and Analysis," and 16, "UFS1 and UFS2 Concepts and Analysis," for specific outputs.
The content category of data includes the file and directory content. Typically, the content category includes equal-sized data units that are allocated for files and directories. All TSK tools in this category start with the letter d.
The dls tool lists the contents of data units, and by default it outputs the contents of all unallocated data units. The -e flag can be used to output all data units, which is the same as using dd on the image. You also can use the -l flag to list the allocation status instead of outputting the actual contents. The next example lists the allocation status of each data unit in an NTFS image:
# dls -f ntfs -e -l ntfs-10.dd
addr|alloc
0|a
1|a
[REMOVED]
13423|a
13424|f
The 'a' after each address signals that the data unit is allocated, and an 'f' signals that it is unallocated. The next example will extract all unallocated space of the NTFS image:
# dls -f ntfs ntfs-10.dd > ntfs-10.dls
The resulting file will have no structure to it because it simply contains random data units from the file system. If you search the file and find evidence, you can determine from where it originally came by using the dcalc tool. dcalc will calculate the original data unit address by using the data unit address from the unallocated data. For example, if our NTFS file system had 4,096-byte clusters and we found evidence in the 123rd cluster in the unallocated data file, we would supply 123 with the -u flag:
# dcalc -f ntfs -u 123 ntfs-10.dd
15945
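Conceptually, this mapping can be found by walking the allocation list and counting the unallocated data units. The following Python sketch illustrates the idea on a toy file system. It is not the dcalc implementation; both helper functions are hypothetical, and the sketch assumes that unit addresses in the unallocated data are counted from 0.

```python
def unit_in_dls_file(byte_offset, unit_size=4096):
    """Which data unit of the unallocated-data file holds this byte offset."""
    return byte_offset // unit_size

def original_address(alloc_listing, dls_unit):
    """Map a unit address in the unallocated data back to the image.

    alloc_listing: iterable of (address, status) pairs in address order,
    where status is 'a' (allocated) or 'f' (unallocated).
    dls_unit: position of the unit in the unallocated data, counted from 0
    (an assumption made for this sketch).
    """
    seen = 0
    for address, status in alloc_listing:
        if status == "f":
            if seen == dls_unit:
                return address
            seen += 1
    raise ValueError("fewer unallocated units than requested")

# Toy file system: data units 2, 5, and 6 are unallocated.
listing = [(0, "a"), (1, "a"), (2, "f"), (3, "a"),
           (4, "a"), (5, "f"), (6, "f")]
print(unit_in_dls_file(503808))      # 503808 // 4096 = 123
print(original_address(listing, 1))  # second unallocated unit is address 5
```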
We also can determine the allocation status of a specific data unit by using the dstat tool. dstat also will display the block or cylinder group information for UFS and Ext2/3 file systems.
# dstat -f linux-ext3 ext3-5.dd 23456
Block: 23456
Not Allocated
Group: 2
Lastly, we can view the contents of any data unit using the dcat tool. For example, we can view the contents of data unit 23,456 in our Ext3 image by using the following:
# dcat -f linux-ext3 ext3-5.dd 23456
The metadata category includes the data that describe a file. Here you will find the data unit addresses that a file has allocated, the size of the file, and temporal information. The types of data in this category vary depending on the file system type. There are four TSK tools in this category, and the names all start with i.
We can get the details about a specific metadata entry by using the istat tool. The output will show the size and temporal data as well as any permissions fields. The addresses of all allocated data units also will be shown. When run on an NTFS image, it will show all the file's attributes. Example output of this tool was given in Chapters 9, 12, 14, and 16.
We also can list the details of several metadata structures by using the ils tool. By default, ils will show only unallocated metadata entries, but all of them can be shown with -e. Listing the unallocated entries is useful to find the entries from deleted files where the file name has been reallocated.
# ils -f ntfs -e ntfs10.dd
0|a|0|0|1089795287|1089795287|1089795287|100555|1|24755200|0|0
1|a|0|0|1089795287|1089795287|1089795287|100555|1|4096|0|0
[REMOVED]
255|a|256|0|998568000|1100132856|1089795731|100777|1|15360|0|0
256|f|256|0|1100132871|1100132871|1100132871|100777|1|256|0|0
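Each record in the ils listing is a single pipe-delimited line, which makes it easy to read with a script. The following Python sketch parses one record. The field names and their order are an assumption made for this example (in particular the order of the three time values), so verify them against the ils man page for your version. The parse_ils_line helper is hypothetical.

```python
from datetime import datetime, timezone

# Assumed field order; verify against the ils man page for your version.
FIELDS = ["inode", "alloc", "uid", "gid", "mtime", "atime", "ctime",
          "mode", "nlink", "size", "block0", "block1"]

def parse_ils_line(line):
    """Turn one pipe-delimited ils record into a dictionary."""
    values = line.strip().split("|")
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of fields")
    record = dict(zip(FIELDS, values))
    for name in ("mtime", "atime", "ctime"):
        # Times are given as seconds since the Unix epoch.
        record[name] = datetime.fromtimestamp(int(record[name]),
                                              tz=timezone.utc)
    record["size"] = int(record["size"])
    record["allocated"] = record["alloc"] == "a"
    return record

entry = parse_ils_line(
    "256|f|256|0|1100132871|1100132871|1100132871|100777|1|256|0|0")
print(entry["inode"], entry["allocated"], entry["size"],
      entry["mtime"].isoformat())
```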
The output was designed so that it can be processed by another tool, and it is frequently used with the mactime tool to make timelines of file activity. If we find a data unit with interesting evidence, we can search all the metadata entries using the ifind tool with the -d flag. Similarly, if we want to find the metadata entry that a specific file name points to, we can use ifind with the -n flag. In the following example, we find that NTFS cluster 3,456 has been allocated by the $DATA attribute of MFT entry 18,080.
# ifind -f ntfs -d 3456 ntfs10.dd
18080-128-3
Lastly, we can view the contents of any file based on its metadata address instead of its file name using the icat tool. This is useful for unallocated files that no longer have a name pointing to their metadata entry. We used this command in the NTFS chapters because NTFS stores all data in files.
# icat -f ntfs ntfs10.dd 18080
File Name Category
The file name category of data includes the data that associates a name with a metadata entry. Most file systems separate the name and metadata, and the name is located inside of the data units allocated to a directory. There are two TSK tools that operate at the file name layer, and their names start with f.
fls will list the file names in a given directory. It takes the metadata address of the directory as an argument and will list both allocated and unallocated names. The -r flag will cause the tool to recursively analyze directories, and the -l flag will look up the metadata and list the temporal data along with the file name. Examples of this were given in each of the previous file system chapters. Here is an Ext3 image with a directory in inode 69457, which contains a deleted file named file two.dat.
# fls -f linux-ext3 ext3.dd 69457
r/r 69458:   abcdefg.txt
r/r * 69459: file two.dat
d/d 69460:   subdir1
r/r 69461:   RSTUVWXY
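The asterisk in the listing marks the deleted name. If you want to pull the deleted names out of a long listing with a script, a sketch such as the following Python example could be used. The parse_fls helper and its regular expression are hypothetical, and they handle only lines in the form shown in the listing above. Other variations of fls output are not covered.

```python
import re

# Pattern for lines of the form "<type> [*] <address>: <name>".
# Other fls output variations are not handled in this sketch.
FLS_LINE = re.compile(r"^(\S+)\s+(\*\s+)?([\d-]+):\s*(.*)$")

def parse_fls(output):
    """Return a list of (type, deleted, address, name) tuples."""
    entries = []
    for line in output.splitlines():
        match = FLS_LINE.match(line.strip())
        if match:
            kind, star, address, name = match.groups()
            entries.append((kind, star is not None, address, name))
    return entries

listing = """r/r 69458: abcdefg.txt
r/r * 69459: file two.dat
d/d 69460: subdir1
r/r 69461: RSTUVWXY"""

# Keep only the entries that carry the deleted marker.
deleted = [e for e in parse_fls(listing) if e[1]]
print(deleted)
```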
If we want to know which file name corresponds to a given metadata address, the ffind tool can be used. For example:
# ffind -f linux-ext3 ext3.dd 69458
/dir1/abcdefg.txt
The application category of data includes the data that are included in a file system because it is more efficient than using normal system files. In TSK, this includes only two tools, which are for the journal in Ext3. The journal records what updates are going to be made to the file system metadata so that a crash can be more quickly recovered from. This was discussed in Chapters 8 and 14.
The jls tool will list the contents of the journal and show which file system blocks are saved in the journal blocks. The contents of a specific journal block can be viewed by using the jcat tool. Here is an example:
# jls -f linux-ext3 ext3-6.dd
JBlk    Descriptrion
0:      Superblock (seq: 0)
1:      Unallocated Descriptor Block (seq: 41012)
2:      Unallocated FS Block 98313
3:      Unallocated FS Block 1376258
[REMOVED]
If we are interested in file system block 98,313, we can view the contents of journal block 2 using jcat.
# jcat -f linux-ext3 ext3-6.dd 2
There are a few tools that combine the data from the various categories to produce the data sorted in a different order. The first tool is mactime, and it takes temporal data from fls and ils to produce a timeline of file activity. Each line in the output corresponds to a file being accessed or changed somehow, which we discussed in Chapter 8. Here is an example output (which has been reduced so that it will fit the width of the book):
Wed Aug 11 2004 19:31:58    34528 .a. /system32/ntio804.sys
                            35392 .a. /system32/ntio412.sys
[REMOVED]
Wed Aug 11 2004 19:33:27     2048 mac /bootstat.dat
                             1024 mac /system32/config/default.LOG
                             1024 mac /system32/config/software.LOG
Wed Aug 11 2004 19:33:28   262144 ma. /system32/config/SECURITY
                           262144 ma. /system32/config/default
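The basic idea behind such a timeline can be sketched in Python: every time value of every file becomes an event, and the events are sorted by time. This is only a simplified illustration and not the mactime implementation. The build_timeline function, the input format, and the sample times are made up for the example.

```python
from collections import defaultdict

def build_timeline(files):
    """Group file activity by timestamp.

    files: list of (name, mtime, atime, ctime) with times as integers.
    Returns a list of (time, flags, name) sorted by time, where flags is
    a three-character string such as 'm.c' that shows which of the
    file's times are equal to that timestamp.
    """
    events = defaultdict(lambda: [".", ".", "."])
    for name, mtime, atime, ctime in files:
        for position, (letter, stamp) in enumerate(
                (("m", mtime), ("a", atime), ("c", ctime))):
            events[(stamp, name)][position] = letter
    return [(stamp, "".join(flags), name)
            for (stamp, name), flags in sorted(events.items())]

# Made-up times, given as small integers to keep the example readable.
files = [("/bootstat.dat", 200, 200, 200),
         ("/system32/ntio804.sys", 50, 100, 50)]
for stamp, flags, name in build_timeline(files):
    print(stamp, flags, name)
```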
Another tool that reorders data is the sorter tool, which sorts files based on their content type. The tool runs the file command on each file and saves the file to a category based on a set of rules. The fls, ils, and icat tools are used to extract the files from the image.
Lastly, there is a hash database tool named hfind that allows you to quickly look up an MD5 or SHA-1 hash value from the NIST NSRL or from a database that you made using md5sum.
# hfind NSRLFile.txt FBF4C1B7ECC0DB33515B00DB987C0474EC3F4B62
FBF4C1B7ECC0DB33515B00DB987C0474EC3F4B62 MOVELIT.GIF
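One common way to make such lookups fast is to keep the hash values sorted and to use a binary search. The following Python sketch shows that general technique with a tiny in-memory index. It is not necessarily how hfind works internally, and the sha1_hex and lookup helpers and the sample file names are hypothetical.

```python
import hashlib
from bisect import bisect_left

def sha1_hex(data):
    """Return the SHA-1 of the data as an uppercase hexadecimal string."""
    return hashlib.sha1(data).hexdigest().upper()

def lookup(sorted_index, digest):
    """Binary-search a sorted list of (digest, name) pairs."""
    # A one-element tuple sorts before any (digest, name) pair that has
    # the same digest, so this finds the first matching entry.
    pos = bisect_left(sorted_index, (digest,))
    if pos < len(sorted_index) and sorted_index[pos][0] == digest:
        return sorted_index[pos][1]
    return None

# Toy index built from two made-up "files".
index = sorted([
    (sha1_hex(b"abc"), "abc.txt"),
    (sha1_hex(b"hello"), "hello.txt"),
])
print(lookup(index, sha1_hex(b"abc")))
print(lookup(index, sha1_hex(b"not in the index")))
```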
The last major category of tools in TSK is searching tools. This area will be expanded in the 2.00 release. The current version has the sigfind tool, which searches for binary values. This was used in several of the scenarios in Part 3, "File System Analysis," of the book.
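A search for a binary value at a fixed location in each block can be sketched as follows. This Python example is a simplified illustration and not the sigfind implementation. The find_signature function and its parameters are hypothetical, and the toy image uses the 0x55AA signature at byte offset 510 of a 512-byte sector only as a familiar example.

```python
def find_signature(data, signature, offset=0, block_size=512):
    """Return the block numbers where `signature` appears at `offset`."""
    hits = []
    for block in range(len(data) // block_size):
        start = block * block_size + offset
        if data[start:start + len(signature)] == signature:
            hits.append(block)
    return hits

# Build a three-sector toy image; sector 1 ends with the bytes 0x55 0xAA.
image = bytearray(3 * 512)
image[512 + 510:512 + 512] = b"\x55\xAA"
print(find_signature(bytes(image), b"\x55\xAA", offset=510))
```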
Paul Bakker has been working on adding indexed searches to TSK and Autopsy, and that feature will be part of the 2.00 release (http://www.brainspark.nl/). The indexing process makes a tree of the strings in an image so that you can more quickly find the occurrences of specific strings. A more detailed description can be found in "The Sleuth Kit Informer, Issue 16" [Bakker 2004].
Theoretically, you could do an entire investigation using the command line, but it would not be fun. Autopsy was developed to automate the investigation process when TSK is being used, but it does not limit what an investigator can do. You can still use the command line when you need to do something that the interface does not allow.
Autopsy is HTML-based and is basically a Web server that knows how to run tools from TSK and how to parse the output. It does not know anything about file systems or disks; only TSK does. Autopsy can be used for both dead and live analysis.
For a dead analysis, Autopsy provides case management so that you can have multiple hosts per case and each host can have its own time zone and hash database. All actions are logged so that you can keep track of what you analyzed. You also can make notes about evidence that is found. Because Autopsy uses HTTP, you can connect to it from any computer. When you run Autopsy, you provide the IP address of your computer, and it allows you to remotely connect to it. This allows a central repository of images to exist.
For a live analysis, you will need to compile Autopsy and TSK and then burn them to a CD. The CD can be placed in a Unix system that is suspected of being in an incident, and you can analyze the file system contents by connecting to Autopsy from your laptop or other computer. There are several benefits of running Autopsy and TSK during a live analysis, including that they will show files that are hidden by most rootkits and will not modify the A-times of files and directories when you are looking at their contents. As with all types of live analysis, this process relies on the OS for data, which can lie if it has been modified by an attacker.
Autopsy is organized into analysis modes, which are similar to the organization of the TSK tools. The File Analysis mode allows you to list the files and directories in the image and view file contents. The Metadata mode shows all the metadata associated with a specific entry and allows you to view any data unit allocated to the file. The Data Unit mode allows you to view any data unit, similar to a hex editor, and the Keyword Search mode allows you to search for ASCII or Unicode strings. The keyword searching is done as a logical volume search and not a logical file search. Refer to Chapter 8 for more details on the search type differences.
Autopsy also allows you to sort all files based on type and make HTML pages of thumbnails of all pictures. Timelines of file activity can be created, and notes can be added when evidence is found. The notes allow you to more easily return to where the evidence exists. Lastly, there is an event sequencer that allows you to make notes based on temporal data from the evidence, and it sorts the data. For example, you can make event notes for the creation times of evidence files and Intrusion Detection System (IDS) alerts. The notes will be sorted and will help during the Event Reconstruction Phase of the investigation.
An example screen shot of the File Analysis mode is given in Figure A.1.
Figure A.1. Screen shot of Autopsy in File Analysis mode.