Cryptography: Theory and Practice by Douglas Stinson, CRC Press LLC. ISBN: 0849385210. Pub Date: 03/17/95
In this section, we apply the entropy results we have proved to cryptosystems. First, we show that a fundamental relationship exists among the entropies of the components of a cryptosystem. The conditional entropy H(K|C) is called the key equivocation; it is a measure of how much information about the key is revealed by the ciphertext.
THEOREM 2.10
Let (P, C, K, E, D) be a cryptosystem. Then

H(K|C) = H(K) + H(P) - H(C).
PROOF First, observe that H(K, P, C) = H(C|K, P) + H(K, P). Now, the key and plaintext determine the ciphertext uniquely, since y = eK(x). This implies that H(C|K, P) = 0. Hence, H(K, P, C) = H(K, P). But K and P are independent, so H(K, P) = H(K) + H(P). Hence,

H(K, P, C) = H(K, P) = H(K) + H(P).
In a similar fashion, since the key and ciphertext determine the plaintext uniquely (i.e., x = dK(y)), we have that H(P|K, C) = 0 and hence H(K, P, C) = H(K, C).
Now, we compute as follows:

H(K|C) = H(K, C) - H(C)
       = H(K, P, C) - H(C)
       = H(K) + H(P) - H(C),

giving the desired formula.
Let us return to Example 2.1 to illustrate this result.
Example 2.1 (Cont.)
We have already computed H(K) = 1.5, H(P) ≈ 0.81, and H(C) ≈ 1.85. Theorem 2.10 tells us that H(K|C) ≈ 1.5 + 0.81 - 1.85 = 0.46. This can be verified directly by applying the definition of conditional entropy, as follows. First, we need to compute the probabilities p(Ki|j), 1 ≤ i ≤ 3, 1 ≤ j ≤ 4. This can be done using Bayes' Theorem, and the following values result:

p(K1|1) = 1      p(K2|1) = 0      p(K3|1) = 0
p(K1|2) = 6/7    p(K2|2) = 1/7    p(K3|2) = 0
p(K1|3) = 0      p(K2|3) = 3/4    p(K3|3) = 1/4
p(K1|4) = 0      p(K2|4) = 0      p(K3|4) = 1
Now we compute, weighting each conditional entropy by the ciphertext probabilities p(1) = 1/8, p(2) = 7/16, p(3) = 1/4, p(4) = 3/16:

H(K|C) = (1/8) × 0 + (7/16) × H(6/7, 1/7) + (1/4) × H(3/4, 1/4) + (3/16) × 0
       ≈ (7/16) × 0.59 + (1/4) × 0.81
       ≈ 0.46,
agreeing with the value predicted by Theorem 2.10.
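Theorem 2.10 can also be checked numerically. The sketch below assumes the distributions of Example 2.1 (P = {a, b} with p(a) = 1/4; K = {K1, K2, K3} with p(K1) = 1/2; encryption rule eK1(a) = 1, eK1(b) = 2, eK2(a) = 2, eK2(b) = 3, eK3(a) = 3, eK3(b) = 4) and computes H(K|C) both from the definition of conditional entropy and from the theorem's formula:

```python
from math import log2

# Example 2.1 as assumed here: P = {a, b} with p(a) = 1/4, p(b) = 3/4;
# K = {K1, K2, K3} with p(K1) = 1/2, p(K2) = p(K3) = 1/4; and the
# encryption rule listed in the dictionary below.
pP = {'a': 1/4, 'b': 3/4}
pK = {'K1': 1/2, 'K2': 1/4, 'K3': 1/4}
enc = {('K1', 'a'): 1, ('K1', 'b'): 2,
       ('K2', 'a'): 2, ('K2', 'b'): 3,
       ('K3', 'a'): 3, ('K3', 'b'): 4}

def H(probs):
    """Shannon entropy (base 2) of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Ciphertext distribution: p(y) = sum of p(K)p(x) over pairs with e_K(x) = y.
pC = {}
for (k, x), y in enc.items():
    pC[y] = pC.get(y, 0.0) + pK[k] * pP[x]

# Joint distribution of (K, C); the pairs (k, e_k(x)) are all distinct here.
pKC = {(k, y): pK[k] * pP[x] for (k, x), y in enc.items()}

H_K, H_P, H_C = H(pK.values()), H(pP.values()), H(pC.values())
H_K_given_C = H(pKC.values()) - H_C    # H(K|C) = H(K, C) - H(C)

print(round(H_K_given_C, 2))           # direct computation -> 0.46
print(round(H_K + H_P - H_C, 2))       # Theorem 2.10 formula -> 0.46
```

Both computations print 0.46, the key equivocation of the example.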
Suppose (P, C, K, E, D) is the cryptosystem being used, and a string of plaintext x1x2 . . . xn is encrypted with one key, producing a string of ciphertext y1y2 . . . yn.
Recall that the basic goal of the cryptanalyst is to determine the key. We are looking at ciphertext-only attacks, and we assume that Oscar has infinite computational resources. We also assume that Oscar knows that the plaintext is a natural language, such as English. In general, Oscar will be able to rule out certain keys, but many possible keys may remain, only one of which is the correct key. The remaining possible, but incorrect, keys are called spurious keys.
For example, suppose Oscar obtains the ciphertext string WNAJW, which was produced by encryption with the Shift Cipher. It is easy to see that there are only two meaningful plaintext strings, namely river and arena, corresponding respectively to the possible encryption keys F (= 5) and W (= 22). Of these two keys, one will be the correct key and the other will be spurious. (Actually, it is moderately difficult to find a ciphertext of length 5 for the Shift Cipher that has two meaningful decryptions; the reader might search for other examples.)
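The exhaustive search behind this example takes only a few lines of Python. The tiny word set standing in for an English dictionary is an illustrative assumption; a real attack would test candidates against a full word list:

```python
# Exhaustive key search on the Shift Cipher for the ciphertext WNAJW.
# The word set below is a stand-in for an English dictionary (assumption).
MEANINGFUL = {"river", "arena"}

def shift_decrypt(ciphertext, key):
    """Decrypt under the Shift Cipher: x = (y - key) mod 26, over A..Z."""
    return "".join(chr((ord(c) - ord('A') - key) % 26 + ord('A'))
                   for c in ciphertext)

candidates = {k: shift_decrypt("WNAJW", k).lower() for k in range(26)}
hits = {k: pt for k, pt in candidates.items() if pt in MEANINGFUL}
print(hits)   # {5: 'river', 22: 'arena'}
```

Key F = 5 yields river and key W = 22 yields arena; one of the two is the correct key and the other is spurious.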
Our goal is to prove a bound on the expected number of spurious keys. First, we have to define what we mean by the entropy (per letter) of a natural language L, which we denote HL. HL should be a measure of the average information per letter in a meaningful string of plaintext. (Note that a random string of alphabetic characters would have entropy (per letter) equal to log2 26 ≈ 4.70.) As a first-order approximation to HL, we could take H(P). In the case where L is the English language, we get H(P) ≈ 4.19 by using the probability distribution given in Table 1.1.
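A first-order estimate of HL can be computed directly. The sketch below uses approximate English letter frequencies of the kind tabulated in Table 1.1; these particular values are an assumption, so substitute the table's exact figures for a precise match:

```python
from math import log2

# Approximate English letter frequencies (an assumption; see Table 1.1
# in the text for the distribution actually used there).
FREQ = {
    'A': .082, 'B': .015, 'C': .028, 'D': .043, 'E': .127, 'F': .022,
    'G': .020, 'H': .061, 'I': .070, 'J': .002, 'K': .008, 'L': .040,
    'M': .024, 'N': .067, 'O': .075, 'P': .019, 'Q': .001, 'R': .060,
    'S': .063, 'T': .091, 'U': .028, 'V': .010, 'W': .023, 'X': .001,
    'Y': .020, 'Z': .001,
}
total = sum(FREQ.values())
dist = [f / total for f in FREQ.values()]   # normalize away rounding error

random_rate = log2(26)                  # entropy per letter of random text
h1 = -sum(p * log2(p) for p in dist)    # first-order estimate of H_L

print(round(random_rate, 2))   # -> 4.7
print(round(h1, 2))            # close to the 4.19 quoted in the text
```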
Of course, successive letters in a language are not independent, and correlations among successive letters reduce the entropy. For example, in English, the letter Q is always followed by the letter U. For a second-order approximation, we would compute the entropy of the probability distribution of all digrams and then divide by 2. In general, define Pn to be the random variable that has as its probability distribution that of all n-grams of plaintext. We make use of the following definitions.
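The higher-order estimates just described follow one pattern: tabulate all n-grams, compute the entropy of their empirical distribution, and divide by n. The sketch below implements that estimator; the sample string is an assumption and is far too short for a meaningful estimate of HL, which requires a large corpus:

```python
from collections import Counter
from math import log2

def ngram_entropy_per_letter(text, n):
    """Empirical entropy of the n-gram distribution of text, divided by n.

    Computed from a large corpus of language L, this approximates the
    n-th order estimate of H_L described in the text.
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    counts = Counter(grams)
    H = -sum((c / total) * log2(c / total) for c in counts.values())
    return H / n

# Toy sample (an assumption): correlations such as repeated digrams
# lower the per-letter entropy as n grows on real corpora.
sample = "THERAININSPAINSTAYSMAINLYINTHEPLAIN"
h1_est = ngram_entropy_per_letter(sample, 1)
h2_est = ngram_entropy_per_letter(sample, 2)
print(round(h1_est, 2), round(h2_est, 2))
```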