Statistical tests can reveal that an image has been modified by determining that its statistical properties deviate from a norm. Some tests are independent of the data format and measure only the entropy of the redundant data. Images with hidden data can be expected to have higher entropy than images without it.
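The entropy measurement mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the test used by any particular tool; it assumes that the "redundant data" is the least significant bit of each 8-bit sample, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Return the Shannon entropy, in bits per symbol, of an iterable."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def lsb_entropy(samples):
    """Entropy of the least significant bits of 8-bit sample values.

    Assumption: the redundant data is the LSB plane of the samples.
    """
    return shannon_entropy(s & 1 for s in samples)
```

A flat region whose samples are all even has an LSB entropy of 0 bits, while a region whose least significant bits are evenly split between 0 and 1 has an entropy of 1 bit. An unexpectedly high value in an otherwise smooth region is only a hint, since noisy natural images can also score close to 1.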
Stegdetect is an automated tool for detecting steganographic content in images. It can detect several different steganographic methods used to embed hidden information in JPEG images. Currently, the detectable schemes are (Figure 9.1):
jphide for UNIX and Windows
F5 (header analysis)
appendx and camouflage
An example of the output looks like this:
$ stegdetect *.jpg
cold_dvd.jpg : outguess(old)(***) jphide(*)
dscf0001.jpg : negative
dscf0002.jpg : jsteg(***)
dscf0003.jpg : jphide(***)
[…]
$ stegbreak -tj dscf0002.jpg
Loaded 1 files…
dscf0002.jpg : jsteg(wonderland)
Processed 1 file, found 1 embedding.
Time: 36 seconds; Cracks: 324123, 8915 c/s
Stegbreak is a program that uses dictionary guessing to recover the password used for embedding. It can launch dictionary attacks against jsteg-shell, jphide, and outguess 0.13b.
A dictionary attack is a password-guessing attack that threatens any password chosen from a predictable set. Rather than trying every possible key, it tries only the passwords on a specific list, such as an English dictionary. We will now take a look at how a dictionary attack works on a steganographic system.
Steganographic systems embed header information in front of the hidden message. This header contains information about the length of the message, compression methods, and other important details. A dictionary attack, using the Stegbreak program, chooses a key from the dictionary and uses it to try to retrieve the header information. If the retrieved header is valid, the key has been guessed. The Stegbreak dictionary contains about 1,800,000 words and phrases, including words from the English, German, and French languages; science fiction novels; the Koran; famous movies and songs, etc.
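The header-matching idea can be illustrated with a toy example. The sketch below does not reproduce the header format or key schedule of jsteg, jphide, outguess, or Stegbreak; the `MAGIC` marker, the XOR keystream derived from SHA-256, and all function names are assumptions made purely for illustration.

```python
import hashlib
import struct

MAGIC = b"STEG"  # hypothetical header marker, not a real tool's format


def keystream(key, n):
    """Derive n (<= 32) pseudo-random bytes from a password (toy scheme)."""
    return hashlib.sha256(key.encode("utf-8")).digest()[:n]


def xor_bytes(data, stream):
    """XOR two byte strings of equal length."""
    return bytes(a ^ b for a, b in zip(data, stream))


def make_header(key, message_length):
    """Build an 8-byte header (marker + length) and mask it with the key."""
    plain = MAGIC + struct.pack(">I", message_length)
    return xor_bytes(plain, keystream(key, len(plain)))


def dictionary_attack(masked_header, wordlist, max_length):
    """Try each candidate word; accept it if the unmasked header is plausible."""
    for word in wordlist:
        plain = xor_bytes(masked_header, keystream(word, len(masked_header)))
        if plain[:4] != MAGIC:
            continue
        (length,) = struct.unpack(">I", plain[4:8])
        if length <= max_length:  # a length the carrier could actually hold
            return word, length
    return None
```

For example, `dictionary_attack(make_header("wonderland", 120), ["alice", "rabbit", "wonderland"], 1000)` returns `("wonderland", 120)`. The attack succeeds only because the password is on the list, which is why passwords outside any dictionary resist this kind of guessing.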
Attacks and analysis on hidden information may take several forms: detecting, extracting, and disabling or destroying hidden information. Images with too high a payload may display distortions from hidden information. Selecting the proper combination of steganography tools and carriers is important to successful information hiding.
Some images may become quite degraded with even small amounts of embedded information. This "visible noise" will give away the existence of hidden information. The same is true with audio. Echoes and shadow signals reduce the chance of audible noise, but they can be detected with little processing.
Anomalies that are not "normal" in other images become apparent only after many original images and stego-images have been evaluated with regard to color composition, luminance, and pixel relationships. Patterns become visible when many images used for steganography are evaluated. Such patterns include unusual sorting of color palettes, relationships between colors in color indexes, and exaggerated "noise."
An approach used to identify such patterns is to compare the original cover-images with the stego-images and note a visible difference, which is the known-cover attack. Minute changes are readily noticeable when comparing the cover- and stego-images. In making these comparisons with numerous images, patterns begin to emerge as possible signatures of steganography software.
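The cover-versus-stego comparison can be sketched as follows. This is a minimal illustration that assumes both images are already decoded into flat lists of 8-bit sample values of equal length; the function names are hypothetical.

```python
def diff_positions(cover, stego):
    """Return the indices at which the stego samples differ from the cover."""
    if len(cover) != len(stego):
        raise ValueError("cover and stego must have the same number of samples")
    return [i for i, (c, s) in enumerate(zip(cover, stego)) if c != s]


def changes_confined_to_lsb(cover, stego):
    """True if every sample is either unchanged or differs only in its lowest bit."""
    return all((c ^ s) in (0, 1) for c, s in zip(cover, stego))
```

With `cover = [10, 20, 30, 40]` and `stego = [11, 20, 30, 41]`, `diff_positions` returns `[0, 3]` and `changes_confined_to_lsb` returns `True`. Changes confined to the lowest bit are one possible signature; other tools may leave different traces, so this check is a starting point rather than a general detector.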
This refers back to the technique of hiding data in spaces within text. This form of text semagram uses the white space in a document to denote binary values. The white space can be between the individual words, the sentences, or even the paragraphs. Almost any combination is possible, but only up to a point: if the text appears to have too much white space, it may be subject to scrutiny. While this form of steganography can work effectively, it has some big drawbacks. First, if the document is digital, any modern word processor can show the spacing irregularities or, worse, reformat the document and destroy the hidden information. The other drawback is that this method does not transmit a large amount of information easily, which can limit its practicality.
Spaces can be placed not only between words but also, as tiny gaps, between some letters. These gaps either form a binary code out of the pattern of spaces and no spaces or indicate that the letter following the space is part of the secret message. To the naked eye nothing may be apparent, but under the scrutiny of a modern word processor the pattern becomes very apparent.
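A word-spacing scheme of this kind, and the ease with which it is spotted, can be sketched as below. The encoding is an assumption chosen for illustration (a single space denotes 0 and a double space denotes 1), and the function names are hypothetical.

```python
import re


def embed_bits(words, bits):
    """Join words, using a double space for a 1 bit and a single space for 0."""
    if len(bits) > len(words) - 1:
        raise ValueError("not enough gaps between words to carry the bits")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        gap = "  " if i < len(bits) and bits[i] == 1 else " "
        out.append(gap + word)
    return "".join(out)


def extract_bits(text):
    """Read one bit per gap: a gap of exactly two spaces is 1, otherwise 0."""
    return [1 if len(gap) == 2 else 0 for gap in re.findall(r" +", text)]


def suspicious_gaps(text):
    """Count runs of two or more spaces, the irregularity a reviewer would see."""
    return len(re.findall(r" {2,}", text))
```

Embedding the bits `[1, 0, 1]` in the words of "the quick brown fox jumps" yields text whose first three gaps decode back to `[1, 0, 1]`, and `suspicious_gaps` reports 2. Collapsing repeated spaces, as many editors do when reformatting, destroys the message, which matches the drawback noted above.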
Some tools have characteristics that are unique among stego-tools. In some steganography programs the color palettes have unique characteristics that do not appear anywhere else. For example, the Hide and Seek program creates color palette entries that are divisible by 4 for all bit values. The palette modification creates a detectable steganography signature.
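A check for the palette signature described above might look like the following sketch. It assumes the palette has already been read into a list of (red, green, blue) tuples; the function name and the default divisor parameter are hypothetical.

```python
def palette_entries_divisible_by(palette, n=4):
    """True if the palette is non-empty and every channel value is a multiple of n."""
    return bool(palette) and all(
        channel % n == 0 for entry in palette for channel in entry
    )
```

For instance, `palette_entries_divisible_by([(0, 4, 8), (252, 128, 64)])` is `True`, while a palette containing `(0, 4, 9)` gives `False`. A positive result is an indicator worth following up, not proof, because an ordinary image could happen to use such a palette.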
In TCP/IP, there are a number of methods by which covert channels can be established and data can be surreptitiously passed between hosts. These methods can be used in a variety of ways:
Bypassing packet filters, network sniffers, and "dirty word" search engines
Encapsulating encrypted or nonencrypted information within otherwise normal packets of information for secret transmission through networks that prohibit such activity (TCP/IP steganography)
Concealing locations of transmitted data by "bouncing" forged packets with encapsulated information off innocuous Internet sites
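To make the idea of data riding inside "otherwise normal packets" concrete, the sketch below packs one byte into the 16-bit identification field of a hand-built IPv4 header and reads it back. The text above does not name a specific header field, so the choice of the identification field is an assumption, as are the addresses and function names. Nothing is sent on the network, and the checksum is left at zero for brevity.

```python
import struct


def build_ipv4_header(ident, src=b"\x0a\x00\x00\x01", dst=b"\x0a\x00\x00\x02"):
    """Pack a minimal 20-byte IPv4 header (checksum left as 0 in this sketch)."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of five 32-bit words
    return struct.pack(
        ">BBHHHBBH4s4s",
        version_ihl,  # version and header length
        0,            # type of service
        20,           # total length (header only)
        ident,        # identification field, used here as the carrier
        0,            # flags and fragment offset
        64,           # time to live
        6,            # protocol number for TCP
        0,            # header checksum (not computed here)
        src,
        dst,
    )


def hide_byte(value):
    """Place one byte (0-255) in the identification field."""
    if not 0 <= value <= 255:
        raise ValueError("value must fit in one byte")
    return build_ipv4_header(ident=value)


def recover_byte(header):
    """Read the hidden byte back from bytes 4-5 of the header."""
    (ident,) = struct.unpack(">H", header[4:6])
    return ident & 0xFF
```

Because the identification field is expected to vary from packet to packet, a receiver that knows the convention can read the value while a casual observer sees a well-formed header. This is also why a proxy that rebuilds packets, rather than forwarding them, removes such a channel.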
Protection from this technique would start with the use of an application proxy firewall system. An application proxy firewall is designed to keep packets from logically separated networks from passing directly to each other. A packet-filter firewall is another option, but is not as effective as the application proxy firewall.
Detection of these techniques can be difficult. If the information in the packet data is encrypted or is "bounced" from another server, it can be very difficult to determine where the packet originated. One way to determine where a forged packet originated is to put a sniffer on the inbound side of the server.
The patchwork algorithm embeds, and later allows the detection of, a single, specific bit in an image. Patchwork embeds a specific statistic in a host image: a small watermark that tells whether a larger watermark is embedded within the image. In short, patchwork is an indicator that tells a program that the rest of the watermark is present. While this method by itself works quite well, a number of performance improvements have been made to the patchwork process, including treating patches at several points rather than just one and using visibility masks to avoid putting patches where they would be easily noticed. In some instances, if this technique is used with too much payload, a repetitive pattern may appear in the image.
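The embedded statistic can be sketched as follows, using the form of patchwork commonly described in the literature: a secret key selects pairs of samples, one sample of each pair is brightened and the other darkened, and the detector sums the differences over the same pairs. The sample layout (a flat list of 8-bit values), the `delta` strength, and the function names are assumptions for illustration.

```python
import random


def pick_pairs(key, size, n):
    """Select n disjoint pairs of sample indices, reproducibly from the key."""
    rng = random.Random(key)
    indices = rng.sample(range(size), 2 * n)
    return list(zip(indices[:n], indices[n:]))


def embed(samples, key, n, delta=2):
    """Raise the first sample of each pair and lower the second by delta."""
    out = list(samples)
    for a, b in pick_pairs(key, len(out), n):
        out[a] = min(255, out[a] + delta)
        out[b] = max(0, out[b] - delta)
    return out


def statistic(samples, key, n):
    """Sum of (first - second) over the keyed pairs.

    Near 0 for an unmarked image; near 2 * delta * n for a marked one.
    """
    return sum(samples[a] - samples[b] for a, b in pick_pairs(key, len(samples), n))
```

On a uniform test image of 1,000 samples with value 128, `statistic` is 0 before embedding and 800 after `embed(samples, key=42, n=200)`, since each of the 200 pairs contributes 2 + 2. On a real image the unmarked statistic fluctuates around 0, so the detector compares the sum against a threshold instead of expecting an exact value, and a wrong key selects different pairs and sees no consistent shift.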