4.3 Hashing


Take a volume of reasonable size, such as a dictionary. You need to send that dictionary to me in electronic form and, at the same time, ensure that nobody has intercepted it and changed even a single space in the text. One way to do this would be to sit down and laboriously compare what you sent with what I received. To make things a bit easier, we could have a computer compare the two copies. While effective, these solutions are not optimal. At the very least, comparing thousands of pages of text is a time-consuming operation. Furthermore, what if we did not have two copies to compare? After all, if you sent me another copy, how would we know that it had not been changed as well? And what if we were simply storing the file on a computer and wanted to make sure that nobody had changed it since the last time you or I looked at it?

What we need is some way to mathematically represent the text of the dictionary in a form much shorter than the original. At the same time, we want that mathematical representation to change if even the smallest change is made to our original file. It would be even more helpful if nobody could take the mathematical representation and work backward; that is, reverse the process and come up with the contents of the dictionary. If we could figure this out, we would have a very handy formula to keep around. Not only would it let us know if anything in the file had changed, it would also prevent someone who had captured the mathematical representation from learning what the original was.

Before we spend lots of time trying to figure out how to make such a formula, however, someone would surely point out that just such a set of algorithms has already been created. Known as hash algorithms, these formulas condense data into what are known as message digests. To see how they operate, let us try a test.

To illustrate the usefulness of a hash, I have created a text file that reads:

      "Security is now our number one priority." 

After saving the file, I ran an MD5 hash utility on it. The following is the output:

      07338ca773ad441b465e60ce3f461e98 

I then changed the text to:

      "Security is now our number one priority." 

Notice the difference? There is not much of one other than the insertion of a single space. Saving the file and running the same MD5 hash algorithm produces the following result:

      d1b9c4d605306d4e0cf80f41317fcd8f 

Even with the insertion of a single space, the hash not only changes, but it changes dramatically. In fact, one of the goals of a good hash algorithm is to produce dramatically and obviously different outputs after any change to the document. If only the fifteenth character of the hash output were to change from a "1" to a "2," the difference between the two hash outputs would be less obvious and less helpful.
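The experiment above can be repeated with a few lines of Python. This is only a sketch: the exact bytes of the original file (quotation marks, trailing newline, where the extra space went) are not known, so the position of the space below is an assumption and the digests printed will not necessarily match the ones shown in the text. The helper names md5_hex and differing_bits are made up for this illustration.

```python
import hashlib

# Assumed file contents; the real test file may have differed
# (for example, by including quotation marks or a trailing newline).
original = b"Security is now our number one priority."
altered = b"Security is now our  number one priority."  # one extra space (position assumed)


def md5_hex(data: bytes) -> str:
    """Return the MD5 message digest of data as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many bits differ between two hex digests of equal length."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


if __name__ == "__main__":
    first = md5_hex(original)
    second = md5_hex(altered)
    print(first)
    print(second)
    print(differing_bits(first, second), "of 128 bits differ")
```

MD5 is used here only to mirror the example in the text. It is no longer considered collision-resistant, and hashlib.sha256 can be substituted in the same way.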

A more practical example would be a legal contract. Anyone who was sent a legal contract along with its hash value could immediately tell whether the contents of the contract had changed simply by recomputing the hash and comparing the two values. A changed hash value is proof enough that the contents of the text have changed.

Note that the hash values given above do not indicate where the change has occurred, only that something has changed. For many applications, however, this is enough.

Now imagine the applications of this technology. First, we can make sure that the contracts that you and I agree upon have not changed. When dealing with electronic documents, it would be easy enough to change a contract by adding a couple of extra zeros to the left of a decimal point, right? You and I can both use a hash like this to electronically verify that our copies of the document we agreed upon are identical. So much as an extra carriage return at the end of the document would change the value of the hash.
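As a rough sketch of that check, the following Python compares a document against a hash value both parties recorded when they agreed on it. The contract text and the function names sha256_hex and is_unchanged are invented for the illustration, and SHA-256 is an assumed choice of algorithm rather than anything prescribed by the text.

```python
import hashlib
import hmac


def sha256_hex(document: bytes) -> str:
    """Return the SHA-256 digest of a document as a hex string."""
    return hashlib.sha256(document).hexdigest()


def is_unchanged(document: bytes, agreed_hash: str) -> bool:
    """True if the document still hashes to the value both parties recorded."""
    return hmac.compare_digest(sha256_hex(document), agreed_hash)


if __name__ == "__main__":
    # Hypothetical contract text, used only for this example.
    contract = b"Buyer shall pay 1000.00 dollars.\n"
    agreed = sha256_hex(contract)

    print(is_unchanged(contract, agreed))                                # True
    print(is_unchanged(b"Buyer shall pay 100000.00 dollars.\n", agreed))  # False
    print(is_unchanged(contract + b"\r", agreed))                        # False
```

The second call models the extra zeros and the third the stray carriage return; in both cases the comparison fails without revealing where the change was made.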

If you were to ever receive an e-mail from me, you would notice that each of my e-mails ends with a signed hash much like the following:

      -----BEGIN PGP SIGNATURE-----
      Version: PGP 7.0.4

      iQA/AwUBPIkgAy1iZLqbmZBAEQIvGgCeInyJLt8avLwYzcVIBjC2uObr0i4AoNsN
      DC1jB0TuhfD15tvo9X9S8AKE
      =JWjq
      -----END PGP SIGNATURE-----

I use my private key to digitally sign a hash of the e-mail that I have sent. Not only can you make sure that the e-mail was actually from me, but you can also make sure that no changes were made to the e-mail after I sent it. [4] In other words, my digital signature also provides non-repudiation services. I cannot deny sending that e-mail if it was signed with my digital signature.
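The idea of signing a hash with a private key can be sketched with a toy RSA calculation. Everything here is an assumption made for illustration: the tiny key built from two Mersenne primes, the absence of padding, and the function names are not how PGP works internally and offer no real security. The sketch only shows that a signature made over the hash with the private exponent verifies with the public one, and that it stops verifying once the message changes.

```python
import hashlib

# Toy key material, for illustration only. Real keys are far larger,
# randomly generated, and used with a padding scheme.
P = 2**61 - 1                        # a Mersenne prime
Q = 2**89 - 1                        # another Mersenne prime
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the digest so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign the hash of the message with the private exponent."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recover the signed hash with the public exponent and compare it."""
    return pow(signature, E, N) == digest_as_int(message)


if __name__ == "__main__":
    mail = b"Meet me at noon."
    sig = sign(mail)
    print(verify(mail, sig))                   # unmodified message
    print(verify(b"Meet me at one.", sig))     # altered message
```

Because only the holder of the private exponent can produce a value that verifies, a valid signature ties the sender to the message, which is the non-repudiation property described above.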

We can also use hashing to make sure that files on our hard drives do not change. We know that attackers commonly like to install Trojan versions of the software that we use on our systems. Detecting such software can be very difficult. If we were to compute and record a hash value for every executable on our servers, any change to a program would generate an entirely different hash the next time we ran the hashing program. This would immediately tip us off as to which program has changed. While we may not know what has changed, we know enough to start investigating further.
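A minimal sketch of such a check, assuming SHA-256 and a directory tree of files to watch: record a baseline of hashes, take a second snapshot later, and report what differs. The function names hash_file, snapshot, and compare are hypothetical and are not taken from any particular integrity-checking product.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large executables need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root: Path) -> dict:
    """Map each file under root (by relative path) to its hash."""
    return {
        str(path.relative_to(root)): hash_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare(baseline: dict, current: dict) -> dict:
    """Report which files changed, appeared, or disappeared since the baseline."""
    common = baseline.keys() & current.keys()
    return {
        "changed": sorted(name for name in common if baseline[name] != current[name]),
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
    }
```

In practice the baseline itself has to be stored somewhere an attacker cannot rewrite it, such as read-only media or another machine; otherwise a Trojan installer could simply update the recorded hashes.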

The same concept can be applied to data that we send. We will see in our discussion of VPNs that IPSec uses hash values on entire packets to ensure that the packet that was sent is the same as the packet that is received at the other side.
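For data in transit, the hash is normally combined with a secret key shared by the two endpoints (an HMAC), so that an attacker who alters the packet cannot simply recompute a matching hash. The following is a simplified sketch of that idea only, not the IPSec wire format: the function names protect and check, the 32-byte tag appended to the payload, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # size in bytes of an HMAC-SHA-256 tag


def protect(key: bytes, payload: bytes) -> bytes:
    """Sender side: append a keyed hash of the payload to the payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def check(key: bytes, packet: bytes):
    """Receiver side: return the payload if the tag matches, otherwise None."""
    payload, tag = packet[:-TAG_LENGTH], packet[-TAG_LENGTH:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None


if __name__ == "__main__":
    shared_key = b"example shared secret"   # hypothetical key for the demo
    packet = protect(shared_key, b"transfer 100 units")

    print(check(shared_key, packet))                       # payload is returned
    tampered = packet.replace(b"100", b"900")
    print(check(shared_key, tampered))                     # None
```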

In short, for every instance in which there is a need for integrity of data, hashing is the technology of choice.

[4] This fact has also been used against me in certain circumstances where I wrote and sent faster than I thought.




Network Perimeter Security: Building Defense In-Depth
ISBN: 0849316286
Year: 2004
Pages: 119
Authors: Cliff Riggs
