Cryptography: Theory and Practice by Douglas Stinson CRC Press, CRC Press LLC ISBN: 0849385210 Pub Date: 03/17/95 |
In the previous section, we discussed the concept of perfect secrecy. We restricted our attention to the special situation where a key is used for only one encryption. We now want to look at what happens as more and more plaintexts are encrypted using the same key, and how likely it is that a cryptanalyst will be able to carry out a successful ciphertext-only attack, given sufficient time.
The basic tool in studying this question is the idea of entropy, a concept from information theory introduced by Shannon in 1948. Entropy can be thought of as a mathematical measure of information or uncertainty, and is computed as a function of a probability distribution.
Suppose we have a random variable X which takes on a finite set of values according to a probability distribution p(X). What is the information gained by an event which takes place according to distribution p(X)? Equivalently, if the event has not (yet) taken place, what is the uncertainty about the outcome? This quantity is called the entropy of X and is denoted by H(X).
These ideas may seem rather abstract, so let's look at a more concrete example. Suppose our random variable X represents the toss of a coin. The probability distribution is p(heads) = p(tails) = 1/2. It would seem reasonable to say that the information, or entropy, of a coin toss is one bit, since we could encode heads by 1 and tails by 0, for example. In a similar fashion, the entropy of n independent coin tosses is n, since the n coin tosses can be encoded by a bit string of length n.
As a slightly more complicated example, suppose we have a random variable X that takes on three possible values x1, x2, x3 with probabilities 1/2, 1/4, 1/4 respectively. The most efficient encoding of the three possible outcomes is to encode x1 as 0, to encode x2 as 10, and to encode x3 as 11. Then the average number of bits in an encoding of X is

1/2 × 1 + 1/4 × 2 + 1/4 × 2 = 3/2.
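This average can be verified with a short computation. The following sketch (the dictionary layout and variable names are ours, not from the text) weights each codeword length by its probability:

```python
# Prefix code from the example: x1 -> 0, x2 -> 10, x3 -> 11,
# with probabilities 1/2, 1/4, 1/4.
probs = {"x1": 0.5, "x2": 0.25, "x3": 0.25}
codes = {"x1": "0", "x2": "10", "x3": "11"}

# Average number of bits = sum over outcomes of p(x) * (length of x's codeword).
avg_bits = sum(probs[x] * len(codes[x]) for x in probs)
print(avg_bits)  # 1.5
```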
The above examples suggest that an event which occurs with probability 2^-n can be encoded as a bit string of length n. More generally, we could imagine that an event occurring with probability p might be encoded by a bit string of length approximately -log2 p. Given an arbitrary probability distribution p1, p2, ..., pn for a random variable X, we take the weighted average of the quantities -log2 pi to be our measure of information. This motivates the following formal definition.
DEFINITION 2.3 Suppose X is a random variable which takes on a finite set of values according to a probability distribution p(X). Then the entropy of this probability distribution is defined to be the quantity

H(X) = -Σx p(x) log2 p(x).

If the possible values of X are xi, 1 ≤ i ≤ n, and we write pi = p(xi), then we have

H(X) = -Σ(i=1 to n) pi log2 pi.
REMARK Observe that log2 pi is undefined if pi = 0. Hence, entropy is sometimes defined to be the relevant sum over all the non-zero probabilities. Since lim(x→0) x log2 x = 0, there is no real difficulty with allowing pi = 0 for some i. However, we will implicitly assume that, when computing the entropy of a probability distribution pi, the sum is taken over the indices i such that pi ≠ 0. Also, we note that the choice of two as the base of the logarithms is arbitrary: another base would only change the value of the entropy by a constant factor.
Note that if pi = 1/n for 1 ≤ i ≤ n, then H(X) = log2 n. Also, it is easy to see that H(X) ≥ 0, and H(X) = 0 if and only if pi = 1 for some i and pj = 0 for all j ≠ i.
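The definition and the basic facts just noted can be checked numerically. Here is a minimal sketch (the function name `entropy` is ours, not from the text); following the remark above, it sums only over the non-zero probabilities:

```python
from math import log2

def entropy(probs):
    """H(X) = sum over nonzero p_i of -p_i * log2(p_i)."""
    return sum(-p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.25]))  # 1.5, matching the three-outcome example
print(entropy([0.25] * 4))         # 2.0 = log2(4), the uniform case pi = 1/n
print(entropy([1.0, 0.0, 0.0]))    # 0.0: a certain outcome carries no information
```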
Let us look at the entropy of the various components of a cryptosystem. We can think of the key as being a random variable K that takes on values according to the probability distribution p(K), and hence we can compute the entropy H(K). Similarly, we can compute entropies H(P) and H(C) associated with plaintext and ciphertext probability distributions, respectively.
To illustrate, we compute the entropies of the cryptosystem of Example 2.1.
Example 2.1 (Cont.)
We compute H(P) as follows:

H(P) = -(1/4) log2 (1/4) - (3/4) log2 (3/4) ≈ 0.81.
Similar calculations yield H(K) = 1.5 and H(C) ≈ 1.85.
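As a sketch, these three entropies can be recomputed numerically. The distributions below are assumed from Example 2.1, which is not restated in this section (plaintext probabilities 1/4 and 3/4; key probabilities 1/2, 1/4, 1/4; the induced ciphertext distribution 1/8, 7/16, 1/4, 3/16):

```python
from math import log2

def entropy(probs):
    """H = sum over nonzero p of -p * log2(p)."""
    return sum(-p * log2(p) for p in probs if p > 0)

# Distributions assumed from Example 2.1 (not restated in this section).
plaintext = [1/4, 3/4]               # p(a), p(b)
key = [1/2, 1/4, 1/4]                # p(K1), p(K2), p(K3)
ciphertext = [1/8, 7/16, 1/4, 3/16]  # induced distribution on the ciphertexts

print(round(entropy(plaintext), 2))   # 0.81
print(entropy(key))                   # 1.5
print(round(entropy(ciphertext), 2))  # 1.85
```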