Forensic Techniques: An Example Forensic Recovery and Investigation Procedure

This section will take you through an abbreviated mock recovery and investigation using some of the techniques described earlier.

Identifying Our Target

For this process, we will assume that an operator who was logged into the system noticed a hung Secure Shell connection from the system to another host with an unusual IP address. In general, users don't create Secure Shell connections from the system to other locations (normally only from their workstations to the system itself). Despite this standard policy, the connection isn't well monitored and the configuration of the system does not prevent it, for example, by not containing an SSH client and by not allowing the port forwarding (tunneling) options within the SSH server. A host-based firewall may also have prevented this. Regardless, this is an example method of creating a target and a basic scope for the forensic process.

Passive Network Monitoring Using a Network Tap

In this process, we will discuss how to set up a passive Ethernet tap to log and monitor the traffic to and from a target system we wish to investigate. The basic architecture is to configure the LAN switch that the target host is connected to so that the port(s) you are interested in monitoring are mirrored or spanned (different products use different terminology) onto a free port on the switch, which is then connected to an interface on a physically separate packet capture host. The hardware requirements for the packet capture host may be substantial if the target system is very active on the network and has high-speed network interfaces, such as Gigabit Ethernet. The processor on the capture host must be fast enough to process the incoming packets, and the host must have enough memory and disk space to store the data. Having two Ethernet interfaces on the capture host is critical as well. One interface must be used as the packet capture interface and the other for administrative purposes only. You don't want to risk talking on the same interface on which you are capturing traffic. Many switch vendors won't allow this to happen anyway, but the operating system may still see the traffic you generate from the packet capture host, and it could show up in your packet dumps. A basic diagram of this setup is provided below in Figure 16-2.

The next step is to install the operating system and configure the network interfaces. Many operating systems won't allow the network interface to be brought up if an IP address is not assigned to it. For the packet capture interface (for example, eth1), you can use an IP from the loopback network (for example, 127.0.0.2). No traffic will be sent on this interface, so the IP address doesn't matter.

Tip 

Another popular technique is to create a special cable for the packet capture interface that does not contain the wiring for the transmit pairs, only the receive pairs, of a typical Ethernet cable. Having a cable for this purpose guarantees that no matter what happens or what port the system is plugged into, you can prove that the traffic was not generated by the packet capture host itself on that interface.

Figure 16-2: Passive network tap configuration example

Configure the interface and then start the packet capture program to listen to traffic on this interface (we will use eth1, the second Ethernet interface in the system, for the following examples). There are several good programs available for this purpose. Examples include the trusty tcpdump and argus, to name a few. Either way, it is important to configure these programs to rotate the capture files they write every so often so the files are more manageable. Rotation based upon file size is the easiest method to configure. The command listing below uses tcpdump to capture traffic, write to a file, and rotate this file when it reaches roughly 2GB in size (the -C value is given in millions of bytes).

 # tcpdump -i eth1 -w /data/captures/target1.pcap -C 2048 <filter> &

Tip

Many filesystems have file size limitations (the magic 2GB limit, for example). Be sure to configure your setting for tcpdump or argus to stay within these limits or use a filesystem that does not have a size limitation (or a much higher one). For example, use the Linux Ext-3 versus the Ext-2 filesystem.

Again, make sure you have enough disk space to preserve the data throughout the duration of the forensic investigation. In our example, if we are receiving 2GB of traffic per day, not an uncommon number, and the investigation will last 30 days, we need a minimum of 60GB of disk space just for the raw dump data. Fortunately, disk space is cheap, so buy more.
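The sizing arithmetic above can be sketched as a small shell function. The function name capture_storage_gb and the optional headroom percentage are illustrative assumptions, not part of tcpdump or argus.

```shell
#!/bin/sh
# Estimate the disk space (in GB) needed for raw capture files.
# Hypothetical helper: the name and the headroom argument are assumptions.
#   $1 = GB of traffic captured per day
#   $2 = number of days the investigation is expected to last
#   $3 = optional safety headroom in percent (default 0)
capture_storage_gb() {
    per_day=$1
    days=$2
    headroom=${3:-0}
    base=$((per_day * days))
    # Integer arithmetic; the headroom is rounded up to a whole GB.
    echo $(( base + (base * headroom + 99) / 100 ))
}
```

Calling capture_storage_gb 2 30 prints 60, the figure used in the text, and capture_storage_gb 2 30 25 prints 75 if you would rather plan for 25 percent of headroom.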

You will notice the <filter> line on the command listing above. This line is where you would typically enter a Berkeley Packet Filter (BPF) formatted filter expression for filtering the traffic that is captured. Make sure to capture all the traffic you are interested in even if you will filter the display when viewing or reporting on the information at a later date. In general, if your packet capture system can handle the traffic, it is better to capture all the traffic at this point because it may not be clear exactly what you are looking for. Filtering the display or performing secondary processing on the raw data to create smaller files with just the data you are looking for later on is possible, but only if you capture all the data you might be looking for initially. For example, if you determine later on that you are only interested in seeing traffic for certain protocols (port TCP/80, for example) or from certain host IPs, it is then possible with no loss in the data to filter out the rest of the data by using a BPF filter when you read the data back in using tcpdump.

An example command line for reading the data at a later date and filtering out the traffic you are not interested in seeing is provided here:

 # tcpdump -r /data/captures/target1.pcap -nvvvuX \
         port 22 and not host 192.168.1.50
 15:12:14.906924 IP (tos 0x10, ttl 64, id 758, offset 0, flags [DF], length: 1300) 192.168.1.50.22 > 192.168.1.1.56026: P 1554480:1555728(1248) ack 13393 win 12096 <nop,nop,timestamp 70965683 773628668>
         0x0000: 4510 0514 02f6 4000 4006 cb53 c0a8 7338 E...@.@..S..s8
         0x0010: c0a8 7301 0016 dada b8af 4a98 aff3 7574 ..s.......J...ut
         0x0020: 8018 2f40 8205 0000 0101 080a 043a d9b3 ../@.........:..
         0x0030: 2e1c a2fc e4b2 6208 02d7 58e8 f823 1318 ......b...x..#..
         0x0040: 020c a949 55e8 d4ac c25d ce09 8dd1 7a99 ...IU....]....z.
         0x0050: f896
 <more packets ... >

At this point, you might be asking, what do I do with the data? How do I analyze it? That isn't always clear, and it depends heavily on what the goals of the forensic investigation are and what services you have running. There are a lot of free and commercial tools available to process, organize/filter, and display/analyze the packet capture data. An entire book could be written just on this subject. Some of the popular tools we would suggest you investigate, however, are argus, snort, ethereal, tcpflow, IP Traffic Meter, and etherApe. Running argus against one of the packet capture files straightaway may give you some indication as to the typical traffic you receive to/from the host, for example, by identifying the top talkers. A tool like etherApe, which is a graphical tool for the GNOME desktop, displays connections graphically in a circular pattern. This may be more useful, as it may visually indicate some unusual connections: the larger the circles/lines between the hosts are, the more traffic is being sent/received. It also displays all the other connections and resolves the hostnames and layer 2 hardware addresses.

Regardless of what you decide to do with the data at this point or how much analysis you choose to perform, having the raw data for investigation later on may be critically important.

You now have an active packet capture of everything that is coming into and leaving the target system's network interface. Make sure not to modify the packet capture files or delete any of them. You would then move on to the preliminary, nonintrusive investigation of the target system (if warranted), as detailed in the "Advanced Digital Forensic Tools" section of this chapter. For this example, we will skip this step for now, as it isn't the most important step for our mock investigation. Next, we will discuss creating a disk image so an offline disk analysis may be performed.
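One way to be able to show later that the capture files were not modified is to record a checksum for each file once tcpdump has rotated away from it. The following is a minimal sketch, assuming GNU coreutils sha256sum is available and that the capture directory is flat (no subdirectories). The function names and the idea of a separate manifest file are illustrative assumptions, not something the capture tools provide.

```shell
#!/bin/sh
# Record SHA-256 checksums for every capture file in a directory.
# Hypothetical helpers; assumes GNU coreutils sha256sum is installed.
#   $1 = directory holding the (already rotated) capture files
#   $2 = manifest file to write; keep it outside the capture directory
seal_captures() {
    ( cd "$1" && sha256sum -- * ) > "$2"
}

# Re-check the files against the manifest. A non-zero exit status means
# at least one file is missing or no longer matches its recorded checksum.
#   $1 = capture directory   $2 = manifest written by seal_captures
verify_captures() {
    ( cd "$1" && sha256sum --quiet -c ) < "$2"
}
```

For example, seal_captures /data/captures /data/manifests/target1.sha256 could be run after each rotation, and verify_captures with the same two arguments before any analysis session. The manifest path here is made up for illustration.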

Creating a Disk Image Using dd and Mounting the Filesystem

This section discusses the specifics of performing this process using one general purpose method that works for most operating platforms and is especially useful in cases where a lower level investigation of the data on the hard disk is necessary.

In this process we will discuss a popular method of replicating the data from a hard disk or filesystem using a standard UNIX utility called dd, which stands for disk dump. This is a UNIX command that can read from and write to both device files and regular files within the operating system. It is a simple program: it reads data sequentially from the input file, block for block in binary form, and writes the data to a destination file block for block. Of course, there are many options, including the block size and input and output files.

Now it is time to make the disk image. In general, we recommend the Linux operating system for forensic analysis, as it has a lot of common forensic (high-level and low-level) tools available and can often auto-detect most disk controllers and partition types/formats, should they need to be mounted at a later point in time. Given the popularity of the Linux operating system for this purpose in recent years, there are now several operating system distributions, some of which are completely bootable and operable from the bootable CD media alone, that are designed for digital forensics and contain collections of forensic tools bundled together all in one place. One popular distribution we recommend is Knoppix-STD. Another distribution gaining popularity is INSERT.

Of course, in order to get access to this data, you must physically remove the hard disk drive from the target system and place it within or attach it to the disk subsystem of the recovery workstation. There are methods of streaming the data over the network; however, we have not found any of them to be reliable enough to warrant a specific recommendation or to promote them as standard practice at this time. These may improve in the future and become more standard. The specifics of doing the physical hardware connections depend upon the hard drive type and format of the systems being used and are outside the scope of this chapter. Once the operating system boots up and detects the hard drive to be replicated, the command that follows is a common command used to make a disk dump backup of the entire hard disk device file, as viewed from the recovery/forensic investigation workstation.

 # dd if=/dev/hda of=/data/disk_images/target1-hda1.dd_image

Once the command finishes, you will have a complete exact copy of the hard disk (including any partitions or slices) in the file /data/disk_images/target1-hda1.dd_image. Of course, you have to consider the space requirements for storing all of this data. You can repeat this command for individual partitions or slices, if that is necessary. The recovery/investigation workstation will need to have adequate space, in some cases, for multiple copies of this file while it is being analyzed.
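To back up the claim that the image is an exact copy, the imaging step can be followed by a digest comparison. This is a minimal sketch, assuming GNU coreutils (dd, sha256sum). The function name image_and_verify, the .sha256 side file, and the block size are illustrative assumptions rather than an established procedure, and the source is read twice, which takes time on a large disk.

```shell
#!/bin/sh
# Image a source device (or file) with dd, then confirm the copy is
# bit-for-bit identical by comparing SHA-256 digests.
# Hypothetical helper; assumes GNU coreutils (dd, sha256sum).
#   $1 = source, e.g. a disk device file   $2 = image file to create
image_and_verify() {
    src=$1
    img=$2
    # Refuse to overwrite an existing image.
    if [ -e "$img" ]; then
        echo "refusing to overwrite $img" >&2
        return 1
    fi
    dd if="$src" of="$img" bs=65536 2>/dev/null || return 1
    src_sum=$(sha256sum < "$src" | cut -d' ' -f1)
    img_sum=$(sha256sum < "$img" | cut -d' ' -f1)
    # Keep the digest next to the image for later integrity checks.
    echo "$img_sum" > "$img.sha256"
    [ "$src_sum" = "$img_sum" ]
}
```

With the names used in this section, the call would look like image_and_verify /dev/hda /data/disk_images/target1-hda1.dd_image, and a non-zero exit status would mean the copy should not be trusted.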

There are three major benefits to using this method:

  • Read-only No modification is possible to the hard disk being imaged, because the filesystem isn't mounted.

  • Universality It works with any filesystem type or data format, as long as the partition table information is intact and is readable or recognizable by the operating system.

  • Flexibility Individual partition slices or the entire hard disk may be imaged.

You may be asking what happens if the hard disk is damaged, or the partition table cannot be read. Several tools are mentioned in the "Advanced Digital Forensic Tools" section of this chapter, which discusses some specifics about how to read and repair partition tables, and determine the formats and types of partitions when they are not known or are damaged.

Once the disk image is made, the original disk drive may be returned to the target system and all investigation work should proceed using the disk image data only. Of course, if the target system is reactivated, it should be kept offline until the forensic analysis is complete. If you have already determined that you will need to restore the data from backup media, and/or rebuild and restore, that process should commence at this point.

No modifications should be made to the disk image file during the analysis process. The processes and steps to be taken during the analysis may require mounting the filesystem and partitions on the recovery workstation to analyze log files and data files, compare checksums, look for root kits, and so on. In that case, it is important to mount the filesystem read-only to ensure that it is not accidentally modified.

Note 

It is common practice among digital forensic scientists to archive a backup image of the filesystem in question at each stage of the investigative process. This provides a series of "beachheads" that may be returned to if necessary.

An example command to mount the disk image as a filesystem using the loopback driver is provided here:

 # mkdir /mnt/target1-hda1
 # mount -o loop,ro -t ext3 /data/disk_images/target1-hda1.dd_image \
         /mnt/target1-hda1

From this point, you should be able to see the directory structure of the filesystem on hda1, as it appeared on the target system at the time it was taken offline. The ro option used above ensures that no modifications can be made to the filesystem through the mount point. Remember, this does not prevent modification to the filesystem image file itself, if it is analyzed through other utilities. If there is any concern about this process, you may want to make a copy of the image file before performing any analysis.
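The mount command above assumes the image file begins with the filesystem itself, as it does when a single partition was imaged. If the image was instead taken from a whole-disk device, the filesystem starts some way into the file, and the loop mount needs to be told where. The sketch below computes that byte offset from the partition's start sector. The function name is hypothetical, and the 512-byte default sector size is an assumption that should be checked against the actual disk.

```shell
#!/bin/sh
# Byte offset of a partition inside a whole-disk image.
# Hypothetical helper; assumes 512-byte sectors unless told otherwise.
#   $1 = partition start sector (read it from the image's partition table)
#   $2 = sector size in bytes (default 512)
partition_offset_bytes() {
    echo $(( $1 * ${2:-512} ))
}
```

For a partition starting at sector 63, partition_offset_bytes 63 prints 32256, which could then be added to the mount options as loop,ro,offset=32256.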

You might be asking, what if I can't mount the filesystem, or what if the data we are interested in was removed or deleted? It may be recoverable, and a lower-level investigation of the data on the hard drive or filesystem partition may be in order. The next section covers this topic.

Finding or Recovering the Impossible with Foremost

Are you concerned the intruder has deleted the data you are after, and you need to recover it to prove they were there? Has critical data been lost? It may be possible to find this data by going beyond what the filesystem can tell you. The data you're after may still be on the hard disk, but not accessible to the operating system through the filesystem anymore. A lower-level look at the filesystem data itself may reveal just what you're looking for.

Examples noted in this section utilize an application called Foremost, which was originally developed by Jesse Kornblum and Kris Kendall at the United States Air Force Office of Special Investigations. Foremost is now maintained publicly in the open source community. Using Foremost, you can search a hard disk, disk image files, or raw data directly. Foremost works on input files, which can be image files or hard disk device files, and it processes this large quantity of information quickly, looking for header and footer information as specified by the user in a separate configuration file. It can quickly and easily find text, images, password files, word processing documents, and other commonly formatted files. It can easily be extended to find just about any file type or data (portions of files) that you are looking for by extending the configuration and providing your own file format search specifications.

The command listing below is a simple execution of Foremost on the device image file we captured in the previous section. In this example, we're looking for Adobe Portable Document Format (PDF) files, Outlook PST files, and other raw data. As you will see you can search for more than one file type at a time; however, the more file types you search for at once, the longer the Foremost program may take to execute. Let's first look at the Foremost configuration file named foremost.conf stored in the local work directory of the investigation work we are doing.

 # ADOBE PDF
 #
         pdf     y       5000000         %PDF    %EOF\x0d        REVERSE
 # Microsoft Outlook (2000-2003) Personal Storage Files
         pst     y       400000000       \x21\x42\x4e\xa5\x6f\xb5\xa6
 # RAW DATA
         data    y       10000           root

Now look at the command line to execute Foremost with this configuration, and then we'll come back to the configuration and explain the options and format in greater detail. The command line to execute Foremost is as follows:

 # foremost -v -o /data/analysis/fm1/ -c ./foremost.conf \
         /data/disk_images/target1-hda1.dd_image

Once this is complete, any files found will be placed in the output directory /data/analysis/fm1. As Foremost looks directly at the data on the hard disk or disk image, not through the filesystem interface, the file names are not recoverable, and Foremost will create files named simply with numbers and the file extension you specified in the configuration file. For example, if it finds three PDF files, they will be in the directory /data/analysis/fm1, named 00000001.pdf, 00000002.pdf, and so on. You'll need to use the specific application customarily used to open these files (or further analysis using another means) to verify these are the files you are looking for. Again, as Foremost operates on the data in the disk or disk image directly, you may find more than one copy of the file, or partial copies, as data that is recently accessed is often stored in virtual memory or temporary files while it is being used in an application. You will have recovered this temporary data as well, which may not be what you are looking for. Such is the downside of accessing data directly on the hard disk and not through the filesystem interfaces.

Now let's take a closer look at the Foremost configuration file format. Each command or file specification line in the Foremost configuration file (one that doesn't start with a pound sign (#)) is read into Foremost for searches. The fields that each file specification may have are as follows, where each field is separated by a tab or other whitespace characters:

  • Extension The file extension for this file specification, or NONE if no file extension should be used.

  • Case Sensitive Flag (y for yes, n for no), telling Foremost to search for the data in case-sensitive mode.

  • Size Maximum size of the file that Foremost should search for. This is very important, as this needs to be considered carefully (to be discussed shortly).

  • Header The header string/data to search for. The header may be ASCII characters or any binary data by using \x[0-f][0-f] (for hexadecimal), or \[0-7][0-7][0-7] (for octal). \s may also be used to specify spaces, as well as question mark (?) for a wildcard (matching any single character). If you want to search for the question mark character, you must use the octal or hex representations \077 or \x3f. Another option, if you are searching for a lot of question mark characters, is to change the wildcard character property using the command wildcard <character> on any line in the file.

  • Footer The footer string/data to search for. The same rules as for header apply to footer.

  • Special Option The final command is an optional component to tell Foremost to search in a special way. At present there are two commands:

    • REVERSE The REVERSE option is used to search backwards, starting with footer, and ending with header or, if header is not specified, ending in the maximum file size specified. This is useful for files that don't have a well-defined header, but do have a well-defined footer.

    • NEXT NEXT operates similarly to the REVERSE option, except that the footer specification is actually a footer that is in another file, beyond the end of the file being searched for. This is useful for file specifications that don't have a well-defined footer, but are often stored near other files that are well known. This is not a foolproof method, but it is an advanced option that takes advantage of the way most people organize files, often having directories full of the same type of file. These files end up being stored on the filesystem near one another because they are usually accessed and created at similar times. This feature also lets you then use the same specification for the header and footer.
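To make the field layout described above concrete, here is a minimal sketch that lists the file specifications found in a configuration file. It is a hypothetical awk-based helper, not part of Foremost. It only splits the whitespace-separated fields and does not decode \x or octal escapes, and it prints "-" where the footer or special option is absent.

```shell
#!/bin/sh
# List the file specifications in a Foremost-style configuration file.
# Hypothetical helper built on awk; fields are only split, not decoded.
#   $1 = path to the configuration file
list_foremost_specs() {
    awk '
        $1 ~ /^#/ { next }   # comment line
        NF < 4    { next }   # blank line, wildcard command, or incomplete spec
        {
            footer = (NF >= 5) ? $5 : "-"
            option = (NF >= 6) ? $6 : "-"
            printf "ext=%s case=%s max=%s header=%s footer=%s option=%s\n", $1, $2, $3, $4, footer, option
        }
    ' "$1"
}
```

Run against the foremost.conf shown earlier, this would print one line per specification, which is a quick way to confirm that each line has its fields in the expected order before starting a long search.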

Note 

Foremost is one program that one of the contributing authors, Zachary Kanner, has added some functionality to, including the NEXT feature, as described above. We also highly recommend that if you develop file specifications for the files you are looking for and they are generally applicable to the community, you share them with the developers. The more common specifications that exist in the distribution itself, the easier Foremost is to use to find the data you are looking for. Visit http://foremost.sourceforge.net for more details.

Now Foremost is ready to search for the different files based upon the file specifications. As mentioned, you must tell Foremost the maximum size a file can be. In order to find the whole file you are looking for, you must predict how big the largest file can be. If you guess too low, Foremost will truncate each file at the maximum file size. This is critical for file specifications that do not have footer sections. You can tell if the file was truncated or not by looking at the size of the output file. If it is exactly the size of the max size specification, it was likely truncated.
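The size check described above is easy to automate. The following is a minimal sketch that lists recovered files whose size equals the maximum from the file specification. The function name is hypothetical, and it relies only on the standard find -size test with the c (bytes) suffix.

```shell
#!/bin/sh
# List recovered files whose size is exactly the maximum given in the
# file specification, which suggests they were cut off at the limit.
# Hypothetical helper; relies on the POSIX find -size Nc test.
#   $1 = Foremost output directory
#   $2 = file extension from the specification
#   $3 = maximum size in bytes from the specification
likely_truncated() {
    find "$1" -type f -name "*.$2" -size "${3}c"
}
```

For the raw data specification shown earlier, likely_truncated /data/analysis/fm1 data 10000 would list every .data file that is exactly 10000 bytes long.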

Note 

If the complete file is found, and the file specification has both a header and footer section, the files that are saved will be the exact size they were on the disk. Another factor to consider when choosing the max file size property is the amount of disk space you have available. It may be convenient to specify a very large size for files, for example, if you are looking for Microsoft Word documents, which don't have a well-defined file ending. The downside is there may be hundreds or thousands of these files on your filesystem, and each one may be as large as the max file size, or will end up being so because, remember, there is no way for Foremost to know where the end is. As you can see, you can quickly run out of disk space if you are not careful.

Are you wondering how Foremost really does its business? Foremost documentation refers to this process as finding a needle in a haystack, with the specifications you provide as the needles and the disk image file as the hay. This is a reasonable analogy if you consider how large most hard disk partitions are these days and the common size of the files Foremost is searching for. Foremost reads the data from the disk input file or files (more than one can be specified), reads chunks of the file into memory, and searches for either the header or footer specification (depending upon the options specified). When the header or footer is found, the rest of the data is read into memory (up to the max size), and the file is copied and written out into the output directory. Foremost is amazingly fast, given the task at hand, despite its aggressive use of memory and CPU resources during execution. We recommend not performing too many other functions on the recovery workstation while Foremost is running. Keep in mind that you need to have enough memory in your recovery workstation to store the entire contents of one file you are searching for, so again, pick your max file size specifications carefully! Internally, the fast searching itself is performed using the Boyer-Moore search algorithm and its jump table; Boyer-Moore is a string-matching algorithm designed for fast searches on very large data sets.
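The header-then-copy idea can be illustrated with standard tools. The sketch below is not how Foremost is implemented: grep stands in for the Boyer-Moore search, there is no footer handling, and GNU grep may need a lot of memory on images that contain few newline bytes. It assumes GNU grep (for the -a, -b, and -o options) and dd, and the function names are hypothetical.

```shell
#!/bin/sh
# Simplified carving sketch: locate a header string in a raw image and
# copy out up to a maximum number of bytes starting at a hit.
# This is NOT Foremost's implementation; grep replaces its search step.
# Hypothetical helpers; assume GNU grep (-a -b -o) and dd.

# Print the byte offset of every occurrence of a fixed header string.
#   $1 = image file   $2 = header string (plain text, no escapes)
header_offsets() {
    LC_ALL=C grep -a -b -o -F -- "$2" "$1" | cut -d: -f1
}

# Copy up to $3 bytes starting at byte offset $2 of image $1 into file $4.
carve_at() {
    dd if="$1" of="$4" bs=1 skip="$2" count="$3" 2>/dev/null
}
```

Looping over the output of header_offsets and calling carve_at with the max size for each offset gives a crude approximation of a header-only specification, including the same truncation behavior when the max size is too small.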

There are a few other command line options and details you should be aware of when using Foremost. These are

  • -o <dir> The output directory you provide. This directory must not exist already. This is to maintain forensic integrity and is a nice safety check. This means that each output run of Foremost must be placed into a different directory.

  • -v Enable verbose output, which can be helpful for monitoring the execution of Foremost as it is running.

  • -q Tell Foremost to operate in quick mode, which operates on larger blocks, usually 512 bytes, instead of reading every byte into memory. This makes Foremost operate much more quickly, but it may miss some files if the files are very small or you are trying to recover potentially corrupt data.

  • -s <n> Another useful option is the skip option, which allows you to skip n bytes from the beginning of the file(s) you specify. This allows you to start at a particular location in the file, or to restart a session that was already in progress and ran into a problem with the file specification from the location you specify (where it last left off, for example). This is extremely useful for interactive file specification development.
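Because the output directory must not already exist, repeated runs need a fresh directory name each time. The following is a minimal sketch of one way to pick it. The function name is hypothetical, and the fmN naming simply follows the fm1 example used in this section.

```shell
#!/bin/sh
# Pick an output directory name that does not exist yet, so each Foremost
# run gets a fresh directory. Hypothetical helper; it only prints the
# name and deliberately does not create the directory.
#   $1 = base analysis directory
next_run_dir() {
    n=1
    while [ -e "$1/fm$n" ]; do
        n=$((n + 1))
    done
    echo "$1/fm$n"
}
```

The result can be passed straight to the -o option, for example foremost -v -o "$(next_run_dir /data/analysis)" -c ./foremost.conf followed by the image file.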

All in all, Foremost can be a powerful tool for low-level data recovery and forensic analysis of disk images and disk devices. Other tools, listed in the next section, may help further with advanced digital forensic analysis projects.



Extreme Exploits: Advanced Defenses Against Hardcore Hacks (Hacking Exposed)
ISBN: 0072259558
Year: 2005
Pages: 120
