Hexdump




Hexdump is a file-viewing tool that displays the contents of an input file with minimal interpretation. This makes hexdump a natural and efficient tool for determining a file’s type and the purpose of its contents. Furthermore, hexdump comes bundled with popular noncommercial Unix operating systems such as Linux and FreeBSD. Because these operating systems are open source, hexdump and its source code are easy to obtain.

Implementation

In its simplest form, hexdump is used by an investigator to read a file’s contents and display them with raw formatting. When executing hexdump in this mode, the only parameter fed to it is the input file’s name. For example, the following command dumps a file named 1.tiff, and its output is shown in the following illustration:

forensic# hexdump 1.tiff

[Illustration: output of hexdump for 1.tiff]

Because it is well known that a little-endian Tag Image File Format (TIFF) file begins with the bytes 49 49 2A 00, in hexadecimal, the file’s type is readily apparent in the output after a little human analysis. (By default, hexdump prints two-byte units in the host’s byte order, so on a little-endian machine these bytes appear as 4949 002a.) If this header is not known to you, do not fret; most Unix systems include a file that contains file signatures in /usr/share/magic or /usr/share/misc/magic. An excerpt from the magic file shows the TIFF header:

# Tag Image File Format, from Daniel Quinlan (quinlan@yggdrasil.com)
# The second word of TIFF files is the TIFF version number, 42, which has
# never changed.  The TIFF specification recommends testing for it.
0       string          MM\x00\x2a      TIFF image data, big-endian
0       string          II\x2a\x00      TIFF image data, little-endian

The Unix file command uses this information to identify a file of unknown type.
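The signature lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the file command is implemented: the SIGNATURES table and the identify and identify_file functions are hypothetical names, and the table holds just the TIFF entries from the excerpt plus a GIF 89a entry.

```python
# Hypothetical signature table: a tiny stand-in for the magic file,
# holding only the two TIFF entries quoted above and one GIF entry.
SIGNATURES = [
    (b"II\x2a\x00", "TIFF image data, little-endian"),
    (b"MM\x00\x2a", "TIFF image data, big-endian"),
    (b"GIF89a", "GIF image data, version 89a"),
]


def identify(header: bytes) -> str:
    """Return the description of the first signature that prefixes header."""
    for magic, description in SIGNATURES:
        if header.startswith(magic):
            return description
    return "data"  # fallback label for unrecognized input


def identify_file(path: str) -> str:
    """Classify a file by reading only its first eight bytes."""
    with open(path, "rb") as handle:
        return identify(handle.read(8))


if __name__ == "__main__":
    # The four header bytes of a little-endian TIFF file.
    print(identify(bytes.fromhex("49492a00")))
```

Reading only the first few bytes keeps the check cheap even for very large evidence files; a real magic database also tests offsets other than zero, which this sketch does not attempt.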

The output of hexdump, as shown in the preceding illustration, is formatted such that the leftmost column contains the byte offset within 1.tiff, in hexadecimal. The bytes of the input file are displayed across the rows after the offset. In this example, you can see that the third row down contains only an asterisk (*), which means that one or more rows identical to the row displayed just before it have been suppressed.
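The offset column and the asterisk convention can be approximated with the following Python sketch. It is a rough imitation under stated assumptions, not a reimplementation of hexdump: the function name dump_rows is hypothetical, and two-byte units are read in little-endian order (as on x86 hosts), whereas hexdump uses whatever byte order the host has.

```python
from typing import List


def dump_rows(data: bytes, width: int = 16) -> List[str]:
    """Format data as offset-prefixed rows, collapsing repeated rows into '*'."""
    rows = []
    previous = None
    squeezing = False
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        if chunk == previous:
            # Emit a single asterisk for a whole run of identical rows.
            if not squeezing:
                rows.append("*")
                squeezing = True
            continue
        squeezing = False
        previous = chunk
        # Assumption: 16-bit units in little-endian order.
        words = [
            int.from_bytes(chunk[i:i + 2], "little")
            for i in range(0, len(chunk), 2)
        ]
        rows.append(f"{offset:07x} " + " ".join(f"{w:04x}" for w in words))
    rows.append(f"{len(data):07x}")  # closing line: total input length
    return rows


if __name__ == "__main__":
    # A TIFF-like header followed by 60 zero bytes.
    for row in dump_rows(b"II\x2a\x00" + bytes(60)):
        print(row)
```

With this input the first row shows 4949 002a, the second row is all zeros, and the third row is the lone asterisk standing in for the remaining identical rows.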

In some cases, it may be advantageous for you to view the output of hexdump in hexadecimal and ASCII formats simultaneously. The hexdump program bundled with FreeBSD will perform the conversion automatically through the use of the -C switch. Let’s look at another file type using this switch:

forensic# hexdump -C suspiciousfile.bin

And here’s the output:

[Illustration: output of hexdump -C for suspiciousfile.bin]

You can easily discern that this file contains the header of a GIF, version 89a, graphic file (and if you didn’t know this, you could check the magic file). However, if your version of hexdump does not support the -C switch, as can happen when the Linux hexdump tool is used in an investigation, you can write a small format file to perform a similar conversion. To overcome this problem, create the following file and name it hexdump.fmt:

"%12.12_ad  " 16/1 "%02X "
"\t" 16/1 "%_p"
"\n"

After the file has been created, you can use it in conjunction with hexdump in the following manner:

forensic# hexdump -f hexdump.fmt suspiciousfile.bin

Figure 25-1 demonstrates the output of hexdump using the format specification in hexdump.fmt.

Figure 25-1: The output of hexdump for suspiciousfile.bin

The output format specification of hexdump is not simple to understand. Basically, the format consists of one or more tokens. Each token is a symbol that specifies either how the byte offset is displayed or the output format for the file’s contents. Additionally, an optional iteration count and byte count can be given for each token, in the following form:

<iteration>/<byte count> <token>

In addition to the printf-style conversions familiar to C/C++ programmers, tokens can contain the following format parameters (these are also described in the hexdump man page):

  • _a[dox]   This parameter displays the input offset, which is cumulative across input files, of the next byte to be displayed. Specify the display base as decimal, octal, or hexadecimal by appending d, o, or x, respectively.

  • _A[dox]   This parameter is identical to the _a conversion string, except that it is performed only once, when all of the input data has been processed.

  • _c   This parameter outputs characters in the default character set. Characters that are representable by standard escape notation are displayed as two-character strings; other nonprinting characters are displayed in three-character, zero-padded octal.

  • _p   This parameter outputs characters in the default character set. Nonprinting characters are displayed as a single “.”.

  • _u   This parameter outputs U.S. ASCII characters; however, control characters are displayed using lowercase names (for example, nul for 0x00 and esc for 0x1B). Characters greater than 0xff, hexadecimal, are displayed as hexadecimal strings.

The default and supported byte counts for the conversion characters are as follows:

  • %_c, %_p, %_u, %c   Only one-byte counts are supported.

  • %d, %i, %o, %u, %X, %x   The default is four bytes. One-, two-, and four-byte counts are supported.

  • %E, %e, %f, %G, %g   The default is eight bytes. Four-byte counts are also supported.
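To see why the byte count matters, the following Python sketch renders the same four bytes as one-, two-, and four-byte hexadecimal units. The function name render_units is hypothetical, and the little-endian default is an assumption that matches x86 hosts; hexdump itself interprets multibyte units in the byte order of the machine it runs on.

```python
def render_units(data: bytes, byte_count: int, byteorder: str = "little") -> str:
    """Render data as hexadecimal units of byte_count bytes each."""
    digits = byte_count * 2  # two hex digits per byte
    units = [
        int.from_bytes(data[i:i + byte_count], byteorder)
        for i in range(0, len(data), byte_count)
    ]
    return " ".join(f"{unit:0{digits}x}" for unit in units)


if __name__ == "__main__":
    header = bytes.fromhex("49492a00")  # little-endian TIFF header bytes
    print(render_units(header, 1))  # 49 49 2a 00
    print(render_units(header, 2))  # 4949 002a
    print(render_units(header, 4))  # 002a4949
```

Only the one-byte rendering preserves the on-disk byte order, which is why format strings intended for signature matching typically use a byte count of 1.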

Therefore, the hexdump.fmt file presented earlier is interpreted as follows:

"%12.12_ad  " 16/1 "%02X "
"\t" 16/1 "%_p"
"\n"
  1. The first token on the first line formats the byte offset. It is zero-padded to 12 digits and displayed in decimal (base 10) notation. Two additional spaces appear after the byte offset before the actual file data begins.

  2. The second token on the first line is repeated 16 times, and 1 byte is read for each iteration. Each byte is output in a two-digit hexadecimal format followed by a space. Each iteration therefore prints one byte, which produces well-aligned columns.

  3. The second line reiterates the output for the same 16 bytes, this time formatting them as readable ASCII (a byte that is not printable is shown as a dot [.]). The \t represents a TAB inserted before the ASCII output. If this line in the format file were moved up onto the first line, a new series of 16 bytes would be read, which is not what we are trying to accomplish. Therefore, this token has to be on a new line.

  4. The third line writes a newline character to the output.
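The steps above can be mirrored in Python to make the row layout concrete. This is a minimal sketch of the layout that hexdump.fmt describes, not a replacement for hexdump: the function name format_like_hexdump_fmt is hypothetical, printable characters are assumed to be ASCII 32 through 126, and the sketch does not collapse duplicate rows into an asterisk.

```python
def format_like_hexdump_fmt(data: bytes, width: int = 16) -> str:
    """Lay out data as: 12-digit decimal offset, hex bytes, TAB, ASCII text."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Step 2: two uppercase hex digits and a space per byte ("%02X ").
        hex_part = "".join(f"{byte:02X} " for byte in chunk)
        # Pad a short final row so the ASCII column stays aligned.
        hex_part = hex_part.ljust(width * 3)
        # Step 3: printable ASCII as-is, everything else as a dot ("%_p").
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Step 1: zero-padded 12-digit decimal offset plus two spaces.
        lines.append(f"{offset:012d}  {hex_part}\t{text}")
    # Step 4: each row ends with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(format_like_hexdump_fmt(b"GIF89a\x10\x00"), end="")
```

Running the sketch on the first bytes of a GIF 89a file yields an offset of 000000000000, the hexadecimal bytes 47 49 46 38 39 61 10 00, and the ASCII column GIF89a.. after the TAB.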

Hexdump is an extremely powerful and efficient utility for viewing the contents of files in a forensic investigation. With a little knowledge of hexdump format files, an analyst can view the data in whatever layout is needed. Hexdump is therefore a tool that no forensic investigator should be without. Luckily, it is usually included in the base installation of most Unix operating systems.






Anti-Hacker Tool Kit, Third Edition
ISBN: 0072262877
Year: 2004
Pages: 189
