Reading the Source Data | Hard Disk Data Acquisition

Using the general acquisition theory that was previously described, there are two major parts of the process. First, we need to read data from a source, and then we need to write it to the destination. Because this book focuses on the analysis of volume and file system data, we are going to cover the process of acquiring at the disk level (because that is where the volume data structures are located). This section examines the issues associated with reading a disk, and the next major section examines the issues associated with writing to a destination. For this section, we assume that a typical IA32 system (such as x86/i386) is being used for the acquisition, and we will discuss how to access the data, handle errors, and reduce the risk of writing data to the suspect drive.

Direct versus BIOS Access

As we saw in Chapter 2, "Computer Foundations," there are two methods by which the data on a disk can be accessed. In one method, the operating system or acquisition software accesses the hard disk directly, which requires that the software know the hardware details. In the second method, the operating system or acquisition software accesses the hard disk through the Basic Input/Output System (BIOS), which should know all the hardware details. At a casual glance, there do not seem to be many differences between these methods, and using the BIOS seems easier because it takes care of the hardware details. Unfortunately, it is not that straightforward when it comes to doing an investigation.

When the BIOS is used, there is a risk that it may return incorrect information about the disk. If the BIOS thinks that a disk is 8GB, but the disk is really 12GB, the INT13h functions will give you access to only the first 8GB. Therefore, if you are doing an acquisition of the disk, you will not copy the final 4GB. We can see this in Figure 3.1, where two applications are trying to identify the size of a disk using different methods.

Figure 3.1. Two applications are trying to determine the size of a disk. The BIOS is not properly configured and says that the 12GB disk is only 8GB.

This scenario can happen in a couple of different ways. One case is when the BIOS is configured for a specific hard disk geometry that is different from the one installed. In another case, an acquisition tool uses a legacy method of requesting the size of the disk. There are two ways that an application can ask the BIOS for a disk size. One is through the original INT13h function that suffers from the 8GB limit and returns the size using the disk's geometry in CHS format. The second method is to use an extended INT13h function that returns the size in LBA format. The CFTT group at NIST had a 2GB disk and a computer where two different sizes were returned from the INT13h and the extended INT13h functions. The extended INT13h result was correct, but the legacy INT13h result was too small [U.S. Department of Justice 2003].
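The 8GB limit of the legacy interface can be checked with a little arithmetic. The following Python sketch is only an illustration: it assumes the commonly quoted legacy INT13h geometry limits of 1,024 cylinders, 255 heads, and 63 sectors per track with 512-byte sectors, and the function names are hypothetical. It does not model a misconfigured geometry, which can produce a wrong size on a much smaller disk as well.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)

# Assumed upper bounds of the legacy INT13h CHS interface.
MAX_CYLINDERS = 1024
MAX_HEADS = 255
MAX_SECTORS_PER_TRACK = 63


def chs_capacity_bytes(cylinders, heads, sectors_per_track):
    """Size of a disk described by a CHS geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE


def lba_capacity_bytes(total_sectors):
    """Size of a disk described by an LBA sector count."""
    return total_sectors * SECTOR_SIZE


def legacy_visible_bytes(total_sectors):
    """Bytes reachable through the legacy CHS interface on a given disk."""
    limit = chs_capacity_bytes(MAX_CYLINDERS, MAX_HEADS, MAX_SECTORS_PER_TRACK)
    return min(limit, lba_capacity_bytes(total_sectors))


if __name__ == "__main__":
    disk_sectors = 12 * 10**9 // SECTOR_SIZE  # a 12GB disk
    visible = legacy_visible_bytes(disk_sectors)
    missed = lba_capacity_bytes(disk_sectors) - visible
    print(visible)  # 8422686720
    print(missed)   # 3577313280
```

An acquisition that trusts the legacy size on this disk would stop about 3.5GB short, which is why the size reported by a BIOS-based tool should be compared against the size from a second source.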

Occasionally, an e-mail is sent to one of the digital forensic e-mail lists from someone who acquired a disk using two different tools and got different sized images. The reason is usually that one of the tools used the BIOS and the other did not. Make sure that you know how your acquisition tools access the disk, and if the tool uses the BIOS, make sure it reports the full disk before you acquire the disk. The BIOS adds one more location where an error can be introduced into the final image, and it should be avoided if better alternatives exist.

Dead Versus Live Acquisition

An investigator has the choice of performing a dead or a live acquisition of data. A dead acquisition occurs when the data from a suspect system is being copied without the assistance of the suspect operating system. Historically, the term dead refers to the state of only the operating system, so a dead acquisition can use the hardware from the suspect system as long as it is booted from a trusted CD or floppy. A live acquisition is one where the suspect operating system is still running and being used to copy data.

The risk of conducting a live acquisition is that the attacker may have modified the operating system or other software to provide false data during the acquisition. To provide an analogy to the physical world, imagine the police arriving at a crime scene where there are several people and it is unknown whether any were involved in the crime. A little while later, the police are looking for a certain object, and they ask one of these unknown people to go into one of the rooms and look for the object. The person comes back to the officer and says that he could not find the object, but should the officer trust him? Maybe this person was involved in the crime, and the object was in the room, but he destroyed it when he was sent in to look for it.

Attackers frequently install tools called rootkits into systems that they compromise, and these tools return false information to a user [Skoudis and Zeltser 2004]. The rootkits hide certain files in a directory or hide running processes. Typically, the attackers hide the files that they installed after compromising the system. An attacker could also modify the operating system so that it replaces data in certain sectors of the disk while it is being acquired. The resulting image might not have any evidence of the incident because it was replaced. When possible, live acquisition should be avoided so that all evidence can be reliably collected.

It is common for an investigator to boot a suspect system using a trusted DOS floppy or Linux CD that has been configured to not mount drives or modify any data. Technically, it is possible for the suspect to have modified their hardware so that it returns false data even with a trusted operating system, but that is much less likely than the operating system being tampered with.

Error Handling

When an acquisition tool is reading data from a disk, it needs to be capable of handling errors. The errors could be caused by a physical problem where the entire drive no longer works, or the errors could be in a limited number of sectors. If only a limited number of sectors is damaged, a normal acquisition can occur, provided that the acquisition tool properly handles the errors.

The generally accepted behavior for dealing with a bad sector is to log its address and write 0s for the data that could not be read. Writing 0s keeps the other data in its correct location. If the sector were ignored instead of writing 0s, the resulting copy would be too small, and most analysis tools would not work. Figure 3.2 shows a series of values that are being acquired. Three of the values have errors and cannot be read, so 0s are written to the copy.

Figure 3.2. The original has three errors in it that have been replaced by 0s.
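The zero-fill behavior can be sketched in a few lines of Python. This is a minimal illustration and not the code of any particular acquisition tool: it assumes a source object with seek and read methods that raises OSError for an unreadable sector, and the FlakySource class exists only to simulate such a disk.

```python
import io

SECTOR_SIZE = 512  # bytes per sector (assumed)


class FlakySource:
    """Hypothetical stand-in for a disk that has unreadable sectors."""

    def __init__(self, data, bad_sectors):
        self._buf = io.BytesIO(data)
        self._bad = set(bad_sectors)

    def seek(self, offset):
        self._buf.seek(offset)

    def read(self, size):
        if self._buf.tell() // SECTOR_SIZE in self._bad:
            raise OSError("unreadable sector")
        return self._buf.read(size)


def acquire(source, dest, total_sectors):
    """Copy sectors to dest, logging and zero-filling any that cannot be read."""
    bad_sector_log = []
    for lba in range(total_sectors):
        try:
            source.seek(lba * SECTOR_SIZE)
            data = source.read(SECTOR_SIZE)
            if len(data) != SECTOR_SIZE:
                raise OSError("short read")
        except OSError:
            bad_sector_log.append(lba)   # record the address of the bad sector
            data = bytes(SECTOR_SIZE)    # keep later sectors at the right offset
        dest.write(data)
    return bad_sector_log


if __name__ == "__main__":
    original = bytes([0xAA]) * (SECTOR_SIZE * 6)
    image = io.BytesIO()
    log = acquire(FlakySource(original, bad_sectors=[1, 3, 4]), image, 6)
    print(log)                    # [1, 3, 4]
    print(len(image.getvalue()))  # 3072
```

The copy is the same size as the original, and every readable sector sits at the same offset in both. Reading one sector at a time is slow, so a real tool would more likely read larger blocks and fall back to single sectors only after an error.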

Host Protected Area

When acquiring data from an ATA disk, you should pay attention to the Host Protected Area (HPA) of the disk because it could contain hidden data. Unless an acquisition tool looks for an HPA, it will not be acquired. Refer to Chapter 2 for more information about HPAs.

A tool can detect an HPA by comparing the output of two ATA commands. The READ_NATIVE_MAX_ADDRESS command gives the total number of sectors on the disk, and the IDENTIFY_DEVICE command returns the total number of sectors that a user can access. If an HPA exists, these two values will be different.
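The comparison itself is simple once the two values are in hand. The Python sketch below does not issue any ATA commands; it assumes that some other tool has already supplied the two numbers, and it assumes that the native value arrives as the address of the last sector (so one is added to turn it into a count) while the IDENTIFY_DEVICE value arrives as a count. The function name is hypothetical.

```python
def hpa_sectors(native_max_address, identify_user_sectors):
    """Return the number of sectors hidden by an HPA (0 if there is none).

    native_max_address: address of the last sector, as assumed to be
        reported by READ_NATIVE_MAX_ADDRESS.
    identify_user_sectors: count of user-addressable sectors, as assumed
        to be reported by IDENTIFY_DEVICE.
    """
    native_total = native_max_address + 1  # convert last address to a count
    if identify_user_sectors > native_total:
        raise ValueError("user-addressable size exceeds native size")
    return native_total - identify_user_sectors


if __name__ == "__main__":
    print(hpa_sectors(native_max_address=999, identify_user_sectors=1000))  # 0
    print(hpa_sectors(native_max_address=999, identify_user_sectors=900))   # 100
```

Mixing up an address and a count is an easy way to report a one-sector HPA that does not exist, so it is worth confirming which form a given tool prints.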

If you do not have access to a tool that will execute the necessary ATA commands, you may have to compare the number of sectors that are copied during an acquisition with the number of sectors that is documented on the label of the disk. Many of the current acquisition tools on the market will detect an HPA, and there are also specialized tools such as BXDR (http://www.sandersonforensics.co.uk/BXDR.htm) by Paul Sanderson, diskstat in The Sleuth Kit, DRIVEID by MyKey Technology (http://www.mykeytech.com), and hpa by Dan Mares (http://www.dmares.com/maresware/gk.htm#HPA).

If you encounter a disk with an HPA and you want to gain access to the hidden data, you will need to change the disk configuration. An HPA is removed by setting the maximum user addressable sector to be the maximum sector on the disk. This can be done using the volatility bit such that the configuration change will be lost when the hard disk is powered off. This command may be blocked by some hardware write blockers, which will be discussed later in this chapter.

The process of removing an HPA involves changing the disk configuration. There is an extremely rare possibility that the disk controller or acquisition tool has not properly implemented HPA changes, and data could be lost. Therefore, you might consider imaging the disk with the HPA before you remove it. If the removal process causes any damage, you still have the original image to analyze. We will see an example of a disk with an HPA in the dd case study later in this chapter. If you need to remove an HPA, it should be documented in your notes.

Device Configuration Overlay

When acquiring data from a newer ATA disk, you should look for a Device Configuration Overlay (DCO), which could cause the disk to look smaller than it really is. A DCO is similar to an HPA, and they can both exist at the same time. DCOs were discussed in Chapter 2.

A DCO is detected by comparing the output of two ATA commands. The READ_NATIVE_MAX_ADDRESS command returns the maximum sector of the disk that normal ATA commands have access to, and the DEVICE_CONFIGURATION_IDENTIFY command returns the actual physical number of sectors. If these are different, a DCO exists and needs to be removed if all data are going to be acquired.
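Because an HPA and a DCO can exist at the same time, it can help to think of the disk as three regions. The following Python sketch takes the three sizes as plain sector counts that are assumed to have been collected already; it does not talk to a disk, and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiskLayout:
    user_sectors: int  # visible to normal reads
    hpa_sectors: int   # hidden by a Host Protected Area
    dco_sectors: int   # hidden by a Device Configuration Overlay


def layout(user_total, native_total, physical_total):
    """Split a disk into user, HPA, and DCO regions from three sector counts.

    user_total: user-addressable sectors (from IDENTIFY_DEVICE)
    native_total: sectors reachable by normal ATA commands
        (from READ_NATIVE_MAX_ADDRESS)
    physical_total: physical sectors (from DEVICE_CONFIGURATION_IDENTIFY)
    """
    if not user_total <= native_total <= physical_total:
        raise ValueError("expected user <= native <= physical")
    return DiskLayout(
        user_sectors=user_total,
        hpa_sectors=native_total - user_total,
        dco_sectors=physical_total - native_total,
    )


if __name__ == "__main__":
    print(layout(user_total=900, native_total=950, physical_total=1000))
    # DiskLayout(user_sectors=900, hpa_sectors=50, dco_sectors=50)
```

An acquisition that copies only the user sectors of this example disk would miss 100 sectors, half hidden by each mechanism.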

To remove a DCO, the disk configuration must be changed using the DEVICE_CONFIGURATION_SET or DEVICE_CONFIGURATION_RESET commands. Both of these changes are permanent and will not be revoked at the next reset as is possible with HPA. Currently, there are few tools that detect and remove DCO. The Image MASSter Solo 2 from ICS (http://www.icsforensic.com) will copy the sectors hidden by a DCO. As with HPA, it is safest to make a copy of the drive with the DCO in place and then remove it and make a second copy. When you remove a DCO, be sure to document the process. Also test whether your hardware write blockers allow the DCO to be removed.

Hardware Write Blockers

One of the investigation guidelines that we discussed in Chapter 1 was to modify the original data as little as possible. There are many acquisition techniques that do not modify any of the original data, but mistakes can happen. Further, there are also some acquisition techniques that can modify the original data, and we may want to prevent that.

A hardware write blocker is a device that sits in the connection between a computer and a storage device. It monitors the commands that are being issued and prevents the computer from writing data to the storage device. Write blockers support many storage interfaces, such as ATA, SCSI, FireWire (IEEE 1394), USB, or Serial ATA. These devices are especially important when using an operating system that could mount the original disk, such as Microsoft Windows.

We discussed ATA commands in Chapter 2 and saw that a disk should not perform any actions until its command register is written to. So, in theory, the most basic type of ATA hardware write blocker is a device that prevents the controller from writing any values to the command register that could cause data to be written to or erased from the disk. However, such a device might allow the controller to write data into other registers. This is analogous to being able to load a gun, but not being able to pull the trigger. We can see in Figure 3.3 that read commands are passed to the disk, but write commands are not.

Figure 3.3. The read request for sector 5 is passed through the write blocker, but the write command for the same sector is blocked before it reaches the disk.
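The basic design can be modeled in software to make the idea concrete. The Python sketch below is a simulation and not firmware for any product: the FakeDisk and BasicWriteBlocker classes and the register names are hypothetical, and the command codes are the standard ATA opcodes for a handful of commands, not a complete list of everything that can modify a disk.

```python
# A few ATA commands that modify the disk (opcode: name). Not exhaustive.
MODIFYING_COMMANDS = {
    0x30: "WRITE SECTORS",
    0x34: "WRITE SECTORS EXT",
    0xCA: "WRITE DMA",
    0x35: "WRITE DMA EXT",
    0xC5: "WRITE MULTIPLE",
    0x50: "FORMAT TRACK",
    0xF4: "SECURITY ERASE UNIT",
}
READ_SECTORS = 0x20


class FakeDisk:
    """Hypothetical disk that acts only when its command register is written."""

    def __init__(self):
        self.registers = {}
        self.executed = []

    def write_register(self, name, value):
        self.registers[name] = value
        if name == "command":
            self.executed.append(value)


class BasicWriteBlocker:
    """Passes every register write except a modifying command code."""

    def __init__(self, disk):
        self.disk = disk
        self.blocked = []

    def write_register(self, name, value):
        if name == "command" and value in MODIFYING_COMMANDS:
            self.blocked.append(value)  # the trigger is never pulled
            return False
        self.disk.write_register(name, value)
        return True


if __name__ == "__main__":
    disk = FakeDisk()
    blocker = BasicWriteBlocker(disk)
    blocker.write_register("lba_low", 5)
    blocker.write_register("command", READ_SECTORS)  # passed to the disk
    blocker.write_register("lba_low", 5)             # argument still reaches the disk
    blocker.write_register("command", 0x30)          # WRITE SECTORS is dropped
    print(disk.executed)    # [32]
    print(blocker.blocked)  # [48]
```

The sector address for the blocked write still lands in the disk's registers, which is the loaded gun of the analogy. A list of blocked commands also lets through any command it does not know about, so a device built this way is only as good as its list.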

The NoWrite device by MyKey Technology has a more advanced design and works as a state-based proxy between the controller and hard disk [MyKey Technology 2003]. It does not send any data or command to the hard disk until it knows that it is a safe command. Therefore, the command arguments are not written to the registers until the NoWrite device knows what command they are for. This makes the data transfers slower, but it is easier to show that no dangerous commands were written. Using the previous gun analogy, this process checks each bullet and allows only blanks to be loaded.

I mentioned hardware write blockers in the previous HPA and DCO sections and want to readdress those points. To remove an HPA or DCO, commands are sent to the disk. These commands modify the device and should be stopped by hardware write blockers. The NoWrite device makes an exception and allows the SET_MAX command to be executed if the volatility bit is set such that the change is not permanent. All other SET_MAX and DEVICE_CONFIGURATION commands are blocked. Other write blockers may choose to allow all these commands to pass, and others may block them all. At the time of this writing, there is little documentation on which commands are being blocked, so you should check with your vendor and conduct your own tests.
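A state-based proxy with this kind of policy can be sketched as follows. This Python model is an illustration of the idea only and should not be read as the NoWrite implementation: the class names are hypothetical, the list of safe commands is a small sample of standard ATA read and identify opcodes, and the test for whether a SET_MAX change is temporary is passed in as a function because the register encoding of that bit is not modeled here.

```python
# Sample of non-modifying ATA opcodes: READ SECTORS, READ SECTORS EXT,
# READ DMA, READ DMA EXT, READ MULTIPLE, IDENTIFY DEVICE,
# READ NATIVE MAX ADDRESS.
SAFE_COMMANDS = {0x20, 0x24, 0xC8, 0x25, 0xC4, 0xEC, 0xF8}
SET_MAX_ADDRESS = 0xF9
DEVICE_CONFIGURATION = 0xB1


def is_allowed(command, change_is_volatile=False):
    """Allow known-safe commands, plus SET_MAX only when the change is temporary."""
    if command in SAFE_COMMANDS:
        return True
    return command == SET_MAX_ADDRESS and change_is_volatile


class RecordingDisk:
    """Hypothetical disk that records every register write it receives."""

    def __init__(self):
        self.writes = []

    def write_register(self, name, value):
        self.writes.append((name, value))


class StateBasedProxy:
    """Holds command arguments until the command itself is known to be safe."""

    def __init__(self, disk, is_volatile=lambda args: False):
        self.disk = disk
        self.is_volatile = is_volatile  # hypothetical check of the held arguments
        self.pending = {}

    def write_register(self, name, value):
        if name != "command":
            self.pending[name] = value  # nothing reaches the disk yet
            return True
        args, self.pending = self.pending, {}
        if not is_allowed(value, self.is_volatile(args)):
            return False                # arguments and command are discarded
        for register, held_value in args.items():
            self.disk.write_register(register, held_value)
        self.disk.write_register("command", value)
        return True


if __name__ == "__main__":
    disk = RecordingDisk()
    proxy = StateBasedProxy(disk)
    proxy.write_register("lba_low", 5)
    print(proxy.write_register("command", 0x30), disk.writes)  # False []
    proxy.write_register("lba_low", 5)
    print(proxy.write_register("command", 0x20), disk.writes)
    # True [('lba_low', 5), ('command', 32)]
```

Unlike the basic design, nothing reaches the disk for a rejected command, and a command that is not on the safe list is rejected by default. A blocker that passes or blocks all SET_MAX and DEVICE_CONFIGURATION commands would differ only in the is_allowed function.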

As with all investigation tools, it is important to test hardware write blockers, and the CFTT group at NIST has published a specification for hardware write blockers (http://www.cftt.nist.gov/hardware_write_block.htm). The specification classifies the ATA commands as non-modifying, modifying, and configuration. The specification states that modifying commands must be blocked and optionally return success or failure.

Software Write Blockers

In addition to hardware write blockers, there are also software write blockers. At one point, most digital forensic tools were DOS-based and used the INT13h method to access a disk. Software write blockers were frequently used to prevent the disk from being modified during the acquisition and examination. In this section, we will describe how they work and what their limitations are.

The software write blockers work by modifying the interrupt table, which is used to locate the code for a given BIOS service. The interrupt table has an entry for every service that the BIOS provides, and each entry contains the address where the service code can be found. For example, the entry for INT13h will point to the code that will write or read data to or from the disk.

A software write blocker modifies the interrupt table so that the table entry for interrupt 0x13 contains the address of the write blocker code instead of the BIOS code. When the operating system calls INT13h, the write blocker code is executed and examines which function is being requested. Figure 3.4 shows an example where the software write block has been installed and blocks a write command. A write blocker allows a non-write function to execute by passing the request directly to the original INT13h BIOS code.

Figure 3.4. A BIOS interrupt table without a write block installed and with a software write block installed that prevents writes from being executed.
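The table swap in Figure 3.4 can be imitated with a dictionary of functions. The Python sketch below is a simulation, not DOS code: the interrupt table is modeled as a dictionary, the handler names are hypothetical, and the function numbers are the standard INT13h values for reading (02h, 42h), writing (03h, 43h), and formatting (05h), which is only a sample of the functions a real blocker would have to consider.

```python
READ_SECTORS = 0x02
WRITE_SECTORS = 0x03
FORMAT_TRACK = 0x05
EXT_READ = 0x42
EXT_WRITE = 0x43

BLOCKED_FUNCTIONS = {WRITE_SECTORS, FORMAT_TRACK, EXT_WRITE}


def bios_int13(function, disk_log):
    """Stand-in for the BIOS disk service: records what it was asked to do."""
    disk_log.append(function)
    return "ok"


def install_write_blocker(table):
    """Point entry 0x13 at a filter that forwards only non-write functions."""
    original = table[0x13]

    def blocker(function, disk_log):
        if function in BLOCKED_FUNCTIONS:
            return "blocked"                # the BIOS code is never called
        return original(function, disk_log)  # pass through to the BIOS

    table[0x13] = blocker
    return original


if __name__ == "__main__":
    interrupt_table = {0x13: bios_int13}
    install_write_blocker(interrupt_table)
    disk_log = []
    print(interrupt_table[0x13](READ_SECTORS, disk_log))   # ok
    print(interrupt_table[0x13](WRITE_SECTORS, disk_log))  # blocked
    print(disk_log)                                        # [2]
```

The filter protects only callers that go through the table. Code that calls bios_int13 directly, or that never uses the BIOS at all, is not affected by it.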

Software write blockers are not as effective as hardware blockers because software can still bypass the BIOS and write data directly to the controller, and the BIOS can still write data to the disk because it has direct access to the controller. In general, if you want to control access to a device, you should place the controls as close to the device as possible. The hardware write blockers are as close to the hard disk as possible, on the ribbon cable.

The CFTT group at NIST has developed requirements and has tested software write block devices. The details can be found on their Web site (http://www.cftt.nist.gov/software_write_block.htm).
