Cryptography: Theory and Practice:Shannon s Theory

cryptography: theory and practice Cryptography: Theory and Practice
by Douglas Stinson
CRC Press, CRC Press LLC
ISBN: 0849385210   Pub Date: 03/17/95
  

Previous Table of Contents Next


DEFINITION 2.6  Suppose L is a natural language. The entropy of L is defined to be the quantity

and the redundancy of L is defined to be

REMARK  HL measures the entropy letter of the language L. A random language would have entropy log2 . So the quantity RL measures the fraction of “excess characters,” which we think of as redundancy.

In the case of the English language, a tabulation of a large number of digrams and their frequencies would produce an estimate for H(P2). is one estimate obtained in this way. One could continue, tabulating trigrams, etc. and thus obtain an estimate for HL. In fact, various experiments have yielded the

empirical result that 1.0 ≤ HL ≤ 1.5. That is, the average information content in English is something like one to one and a half bits per letter!

Using 1.25 as our estimate of HL gives a redundancy of about 0.75. This means that the English language is 75% redundant! (This is not to say that one can arbitrarily remove three out of every four letters from English text and hope to still be able to read it. What it does mean is that it is possible to find a Huffman encoding of n-grams, for a large enough value of n, which will compress English text to about one quarter of its original length.)

Given probability distributions on and , we can define the induced probability distribution on , the set of n-grams of ciphertext (we already did this in the case n = 1). We have defined Pn to be a random variable representing an n-gram of plaintext. Similarly, define Cn to be a random variable representing an n-gram of ciphertext.

Given yCn, define

That is, K(y) is the set of keys K for which y is the encryption of a meaningful string of plaintext of length n, i.e., the set of “possible” keys, given that y is the ciphertext. If y is the observed sequence of ciphertext, then the number of spurious keys is |K(y)| - 1, since only one of the “possible” keys is the correct key. The average number of spurious keys (over all possible ciphertext strings of length n) is denoted by . Its value is computed to be

From Theorem 2.10, we have that

Also, we can use the estimate

provided n is reasonably large. Certainly,

Then, if , it follows that

Next, we relate the quantity H(K|Cn) to the number of spurious keys, . We compute as follows:

where we apply Jensen’s Inequality (Theorem 2.5) with f(x) = log2 x. Thus we obtain the inequality

Combining the two inequalities (2.1) and (2.2), we get that

In the case where keys are chosen equiprobably (which maximizes H(K)), we have the following result.

THEOREM 2.11

Suppose is a cryptosystem where and keys are chosen equiprobably. Let RL denote the redundancy of the underlying language. Then given a string of ciphertext of length n, where n is sufficiently large, the expected number of spurious keys, , satisfies

The quantity approaches 0 exponentially quickly as n increases. Also, note that the estimate may not be accurate for small values of n, especially since H(Pn)/n may not be a good estimate for HL if n is small.

We have one more concept to define.

DEFINITION 2.7  The unicity distance of a cryptosystem is defined to be the value of n, denoted by n0, at which the expected number of spurious keys becomes zero; i.e., the average amount of ciphertext required for an opponent to be able to uniquely compute the key, given enough computing time.

If we set in Theorem 2.11 and solve for n, we get an estimate for the unicity distance, namely

As an example, consider the Substitution Cipher. In this cryptosystem, and !. If we take RL = 0.75, then we get an estimate for the unicity distance of

This suggests that, given a ciphertext string of length at least 25, (usually) a unique decryption is possible.


Previous Table of Contents Next

Copyright © CRC Press LLC



Cryptography. Theory and Practice
Modern Cryptography: Theory and Practice
ISBN: 0130669431
EAN: 2147483647
Year: 1995
Pages: 133
Authors: Wenbo Mao

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net