# Appendix F Solutions to the Problems

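1.

   |   | r | g | b | y | c | m | w |
   |---|---|---|---|---|---|---|---|
   | R | 1 | 0 | 0 | 1 | 0 | 1 | 1 |
   | G | 0 | 1 | 0 | 1 | 1 | 0 | 1 |
   | B | 0 | 0 | 1 | 0 | 1 | 1 | 1 |

2.

   |    | r | g | b | y | c | m | w |
   |---|---|---|---|---|---|---|---|
   | Y  | 82 | 145 | 41 | 210 | 170 | 107 | 235 |
   | Cb | 90 | 54 | 240 | 16 | 166 | 202 | 128 |
   | Cr | 240 | 34 | 110 | 146 | 16 | 222 | 128 |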
3. and
4. ; 864 - 720 = 144 pixels;
5. 857 pixels; 857 - 720 = 137 pixels; 10 μs
1. 720 × 576 × 25 × 2 × 8 = 166 Mbit/s
2. 720 × 480 × 30 × 2 × 8 = 166 Mbit/s
3. 360 × 288 × 25 × 1.5 × 8 = 31 Mbit/s
4. 360 × 240 × 30 × 1.5 × 8 = 31 Mbit/s
5. 37 Mbit/s
6. 31 Mbit/s
7. 31 Mbit/s
8. 4.7 Mbit/s
9. 1.4 Mbit/s
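The first four bit rates above are plain products of picture size, frame rate, samples per pixel and bits per sample. A minimal sketch of that arithmetic follows; the function name and the dictionary labels are illustrative only and do not come from the book.

```python
def raw_bit_rate(width, height, frame_rate, samples_per_pixel, bits_per_sample=8):
    """Uncompressed bit rate in bit/s."""
    return width * height * frame_rate * samples_per_pixel * bits_per_sample

# (width, height, frames per second, samples per pixel), taken from the answers above
cases = {
    "720x576 @ 25": (720, 576, 25, 2),
    "720x480 @ 30": (720, 480, 30, 2),
    "360x288 @ 25": (360, 288, 25, 1.5),
    "360x240 @ 30": (360, 240, 30, 1.5),
}

for label, params in cases.items():
    print(f"{label}: {raw_bit_rate(*params) / 1e6:.0f} Mbit/s")
```

Both full-size formats come out at the same rate because 576 × 25 = 480 × 30.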
6. 94 73 194 184 50 204 207
7. 94 82 73 132 194 201 184 109 50 121 204 222 207 PSNR = 20.2 dB
1. A sinusoid with amplitude $A$ has a peak-to-peak value of $2A = 2^n \Delta \Rightarrow \Delta = A\,2^{1-n}$

2. The peak-to-peak power of the sinusoid, $(2A)^2 = 4A^2$, is 8 times (9 dB) higher than its mean power, $A^2/2$ ⇒ PSNR = 10.78 + 6n

3. 10.78 + 6n ≥ 58 ⇒ n ≥ 8 bits
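The last step can be checked numerically. This is a small sketch that assumes the linear rule PSNR = 10.78 + 6n dB stated above; the function name is hypothetical.

```python
import math

def min_bits_for_psnr(target_db, offset_db=10.78, db_per_bit=6.0):
    """Smallest integer n with offset_db + db_per_bit * n >= target_db."""
    return math.ceil((target_db - offset_db) / db_per_bit)

print(min_bits_for_psnr(58))
```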
1. |x| ≤ 16: y = 0; 16 < |x| ≤ 32: y = ±24; 32 < |x| ≤ 48: y = ±40; etc.
2. |x| ≤ 16: y = ±8; 16 < |x| ≤ 32: y = ±24; 32 < |x| ≤ 48: y = ±40; etc.
1. 12 16 28 240 196 32 PSNR = 43.4 dB
2. 6 12 27 77 127 77 PSNR = 10.7 dB
3. 15 21 19 21 19 69 119 169 219 234 232 230 232 230

1. 364 15 -211 -26 -5 38 -2 -1
2. the basis vector of the second AC coefficient matches the input pixels
3. 35 82 190 250 200 150 101 23. Due to mismatch (approximating the cosine elements) some of the input pixels cannot be reconstructed, e.g. 81/82 and 100/101.
4.  quantised coefficients: 360 0 -216 -24 0 40 0 0; reconstructed pixels: 30 70 185 250 204 153 104 27; PSNR = 30 dB
5.  reconstructed pixels: 128 128 128 128 128 128 128 128; PSNR = 10.4 dB. Reconstructed pixels: 28 87 169 227 227 169 87 28; PSNR = 23.4 dB
1. mv(-1, -1)
2. mv(-1, -1)
1. 169
1. multiplications = 256 × 169, additions = 511 × 169
2. multiplications = 0, additions = 511 × 169
6.

   | Type | Operations | Multiplications | Additions |
   |---|---|---|---|
   | TDL | 23 | 23 × 256 | 23 × 511 |
   | TSS | 25 | 25 × 256 | 25 × 511 |
   | CSA | 17 | 17 × 256 | 17 × 511 |
   | OSA | 13 | 13 × 256 | 13 × 511 |

1. a = 010,
2. b = 1,
3. c = 00,
4. d = 011; average bits = 1.8, entropy = 1.72
1. 1st bit in error, decoded string = babbad
2. 3rd bit in error, decoded string = ccbbad
3. 5th bit in error, decoded string = cbcbad
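The three decoded strings above can be reproduced with a short prefix-code decoder. This is only a sketch under two assumptions that are not stated in this appendix: the codewords 010, 1, 00, 011 belong to the symbols a, b, c, d in that order, and the transmitted bit string is `001011010011`, which is inferred from the three answers rather than quoted from the problem.

```python
# Assumed symbol-to-codeword mapping (a, b, c, d in listed order).
CODE = {"a": "010", "b": "1", "c": "00", "d": "011"}
DECODE = {bits: symbol for symbol, bits in CODE.items()}

def decode(bitstring):
    """Greedy decoding, valid because no codeword is a prefix of another."""
    symbols, buffer = [], ""
    for bit in bitstring:
        buffer += bit
        if buffer in DECODE:
            symbols.append(DECODE[buffer])
            buffer = ""
    return "".join(symbols)

def flip(bitstring, position):
    """Invert the bit at a 1-based position."""
    i = position - 1
    flipped = "1" if bitstring[i] == "0" else "0"
    return bitstring[:i] + flipped + bitstring[i + 1:]

SENT = "001011010011"  # assumption: inferred from the answers above

for position in (1, 3, 5):
    print(position, decode(flip(SENT, position)))
```

A single bit error changes the number of decoded symbols as well as their values, because the decoder loses codeword alignment until it happens to resynchronise.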
7. lower value = 0.83875, upper value = 0.841875
1. the first three symbols = cbc
2. the first five symbols = cbcab
8. the same as 15
9. 11010110111
10. the same as 17
1. first bit in error = 0.01010110111 = $2^{-2} + 2^{-4} + 2^{-6} + 2^{-7} + 2^{-9} + 2^{-10} + 2^{-11}$ = 0.33935546875, which is decoded to string bbacb
2. similarly, with the third bit in error the decimal number would be 0.96435546875, decoded to dbacb
3. with the fifth bit in error, the decimal number is 0.87060546875 and it is decoded to string cbacb.
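The decimal values quoted in these answers follow directly from reading the code word 11010110111 as a binary fraction. A minimal sketch (helper names are illustrative):

```python
def binary_fraction(bits):
    """Value of 0.b1b2...bn interpreted as a binary fraction."""
    return int(bits, 2) / 2 ** len(bits)

def flip(bits, position):
    """Invert the bit at a 1-based position."""
    i = position - 1
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

CODEWORD = "11010110111"

print(binary_fraction(CODEWORD))  # error-free value
for position in (1, 3, 5):
    print(position, binary_fraction(flip(CODEWORD, position)))
```

The error-free value lies between the lower and upper limits 0.83875 and 0.841875 given earlier, which is what makes it a valid code word for that interval.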
1. $P(z) = H_0(z)H_1(-z)$ should be factorised into two terms. The given $P(z)$ is zero at $z^{-1} = -1$, hence it is divisible by $1 + z^{-1}$. Dividing as many times as possible gives:

Thus in (i) the lowpass analysis filter will be:

and the high pass analysis filter is:

$H_1(z) = 1 - z^{-1}$

In (ii) and the highpass:

which are the (5,3) subband filter pairs.

In (iii) and the highpass

which gives the second set of (4,4) subband filters.

In (iv) and

Any other combinations may be used, as desired.

1. Thus $P(z) - P(-z) = 2z^{-1}$, which results in a one-sample delay.
1. With a weighting factor of $k$, $P(z) = k(1 + z^{-1})^4(-1 + 4z^{-1} - z^{-2})$, and using $P(z) - P(-z) = 2z^{-m}$ gives $k = 1/16$ and $m = 3$.
2. The factor for the other set will be k = 3/256 and m = 5 samples delay.
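The values k = 1/16 and m = 3 can be confirmed by expanding the product (1 + z⁻¹)⁴(-1 + 4z⁻¹ - z⁻²) and keeping only the odd powers, since P(z) - P(-z) cancels the even ones. The sketch below does this with plain coefficient lists in ascending powers of z⁻¹; the helper names are illustrative.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (ascending powers)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_pow(a, n):
    result = [1]
    for _ in range(n):
        result = poly_mul(result, a)
    return result

# P(z) / k, coefficients of z^0, z^-1, ..., z^-6
p_unscaled = poly_mul(poly_pow([1, 1], 4), [-1, 4, -1])

# P(z) - P(-z) doubles the odd-power terms and removes the even ones.
odd_terms = {power: 2 * c for power, c in enumerate(p_unscaled)
             if power % 2 == 1 and c != 0}
(m, coefficient), = odd_terms.items()   # exactly one odd term survives
k = Fraction(2, coefficient)            # scale so the term equals 2 z^-m

print(p_unscaled, m, k)
```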
2. In problem 3, P(z) is in fact type (iv) of problem 1. Hence it leads not only to the two sets of (5,3) and (4,4) filter pairs, but also to two new types of filter, given in (i) and (iv) of problem 1.
3. With

retaining $H_1(-z) = (1 + z^{-1})^2 = 1 + 2z^{-1} + z^{-2}$, that gives the three-tap highpass filter of $H_1(z) = 1 - 2z^{-1} + z^{-2}$ and the remaining parts give the nine-tap lowpass filter

or

Had we divided the highpass filter coefficients, H1(z), by , and hence multiplying those of lowpass H0(z) by this amount, we get the 9/3 tap filter coefficients of Table 4.2.

4. Use $G_0(z) = H_1(-z)$ and $G_1(z) = -H_0(-z)$ to derive the synthesis filters
5. See pages 84 and 88
6. 33 bits
7. 29 bits
1. multiply all the luminance matrix elements by α = 50/25 = 2
2. , and small elements of the matrix will be 1 and larger ones become 2.
3. α = 0, hence all the matrix elements will be 1.
1. the same as problem 1.
1.  62 0 3 1 0 0 1 0 0 1 -1 0 1 0 0 0 0 0 0 0 1 0 0 0 -1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
2.  31 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2. for 50% quality: DIFF = 62 - 50 = 12, symbol_1 = 4, symbol_2 = 12

scanned pairs: (3,1)(0,3)(0,1)(0,-1)(1,-1)(6,1)(6,1)(1,1)(1,1)(35,3)

and the resultant events:

(3,1)(0,2)(0,1)(0,1)(1,1)(6,1)(6,1)(1,1)(1,1)(15,0)(15,0)(3,3)

for 25% quality: DIFF= 31-50 = -19, symbol_1 = 5, symbol_2 = -19-1 =-20

scanned pairs: (4, 1)(57, 1)

events: (4,1)(15,0)(15,0)(15,0)(9,1)

3. for DC: DIFF= -19 ⇒ CAT = 5; DIFF-1 = -20

VLC for CAT = 5 is 110 and -20 in binary is 11101100, hence the VLC for the DC coefficient is 11001100

for AC, using the AC VLC tables:

for each (15,0) the VLC is 11111111001

and for (9, 1) the VLC is 111111001

total number of bits: 8 + 3 × 11 + 9 = 50 bits.
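Two steps in these JPEG answers are mechanical enough to sketch in code: finding the category and appended bits of a DC difference, and splitting zero runs longer than 15 into (15, 0) events. The function names below are hypothetical, and the CAT = 5 code word `110` is simply copied from the answer above rather than looked up in a table.

```python
def dc_symbols(diff):
    """Return (category, appended bits) for a DC difference.

    Negative differences append the low-order bits of diff - 1.
    """
    category = abs(diff).bit_length()
    if category == 0:
        return 0, ""
    value = diff if diff >= 0 else diff - 1
    bits = format(value & ((1 << category) - 1), f"0{category}b")
    return category, bits

def split_long_runs(pairs):
    """Split runs above 15 into (15, 0) events, each standing for 16 zeros."""
    events = []
    for run, value in pairs:
        while run > 15:
            events.append((15, 0))
            run -= 16
        events.append((run, value))
    return events

category, bits = dc_symbols(-19)
print(category, "110" + bits)                 # DC code word for 25% quality
print(split_long_runs([(4, 1), (57, 1)]))     # AC events for 25% quality
```

In the 25% case every nonzero AC value is 1, so the value and its size category coincide and the output can be compared directly with the events listed above.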

4. At bit-plane 6 coefficient 65 at clean-up pass. At bit-plane 5 coefficient 65 at all passes and coefficient 50 at clean-up pass.
1. 33, 198
2. 396, 2376
1. ; QCIF:
1. MC
2. NOMC
3. NOMC
2. due to motion vector overhead
1. inter
2. intra
3. inter
3. For small values in intra mode DC still needs eight bits, while in inter mode it is less.
1. 63
2. 60
3. 3
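4.
$$\begin{bmatrix}
83 & 0 & 2 & 1 & 0 & 0 & 4 & 0 \\
0 & 0 & -1 & 0 & 3 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 3 & -1 & 0 & 0 \\
-1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 3 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
-1 & 4 & 0 & 0 & 0 & 0 & 0 & 0 \\
2 & 0 & 0 & 0 & 0 & 0 & 0 & 31
\end{bmatrix}$$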

events: (0, 83)(4, 2)(0,1)(0, -1)(1, -1)(1, 1)(4, 3)(4, -1)(1,3)(1,3)(1,4)(2, -1) (3, 4)(0, 2) (3, 2)(20,1)(2, 31)

number of bits (including the sign bit): 20 + 9 + 5 + 5 + 6 + 6 + 11 + 8 + 8 + 8 + 9 + 7 + 11 + 5 + 8 + 20 + 20 = 166

no EOB is used, as the last coefficient is coded
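The event list above can be regenerated by scanning the quantised block in zigzag order and counting the zeros before each nonzero level. The following is a minimal sketch; the helper names are illustrative, and the block is entered row by row as an 8 × 8 array.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even anti-diagonals run from bottom-left to top-right,
        # odd ones from top-right to bottom-left.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def run_level_events(block):
    """List of (zero run, level) pairs along the zigzag scan."""
    events, run = [], 0
    for r, c in zigzag_order(len(block)):
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            events.append((run, level))
            run = 0
    return events

BLOCK = [
    [83, 0, 2, 1, 0, 0, 4, 0],
    [0, 0, -1, 0, 3, 0, 0, 0],
    [0, 0, 0, 0, 3, -1, 0, 0],
    [-1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 3, 0, 2, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [-1, 4, 0, 0, 0, 0, 0, 0],
    [2, 0, 0, 0, 0, 0, 0, 31],
]

print(run_level_events(BLOCK))
```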

5.  159 149 170 105 113 133
6. @384 kbit/s P = 6 and with q = 62 and P = 6, the buffer content is 36 kbits, left over capacity = 5000 × 8 - 36000 = 4000 bits
1. each operation = $(2 \times 15 + 1)^2 + 8 = 969$, total operations = 3 × 969 = 2907
2. ω = 3 × 15 = 45, total no of operations = $(2 \times 45 + 1)^2 + 8 = 8289$
1. the first B-picture is closer to its forward prediction than its backward prediction picture.
2. for the second B-picture, FWD = 9 and BWD = 5
3. for P-picture $(2 \times 13 + 1)^2 + 8 = 737$, and for each B-picture $(2 \times 5 + 1)^2 + 8 + (2 \times 9 + 1)^2 + 8 = 498$
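All of the operation counts in these motion-search answers use the same expression, (2w + 1)² + 8, for a search range of ±w. A short sketch that evaluates it for each case (the function name is illustrative):

```python
def search_operations(w):
    """Operation count for a search range of +/- w, as used in the answers above."""
    return (2 * w + 1) ** 2 + 8

print(search_operations(15), 3 * search_operations(15))   # three separate searches
print(search_operations(45))                              # one search with w = 45
print(search_operations(13))                              # P-picture
print(search_operations(5) + search_operations(9))        # B-picture, both directions
```

Because the count grows with the square of the range, one search over ±45 costs nearly three times as much as three successive searches over ±15.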
1. Average and the complexity index is 50 × 1000 × 13 = 65 × $10^4$.
2. (8 × 7) + (3 × 10) + 20 = 106

for I-pictures: $\frac{20}{106} \times 576 = 108.7$ kbits,

for P = 54.3 kbits and for B = 38 kbits.

3. The new index ratio for B becomes 7/1.4 = 5

(8 × 5) + (3 × 10) + 20 = 90, and bits for I = 128 kbits, for P = 64 and for B = 32 kbits

4. 20 + 10 + 10 + 20 + 8 × 7=116

for P, the average index of (10 + 10 + 20)/3 = 13.3 should be used, hence the target bit rates for I = 99.3 kbits, for P = 66.2 kbits and for B = 34.75 kbits.
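The target bits in these answers are the GOP budget shared out in proportion to each picture type's complexity index. The sketch below assumes a GOP of one I-, three P- and eight B-pictures (read from the sums above) and a budget of 576 kbits per GOP; that budget is not stated in this appendix and is inferred from the quoted results, for example 128 × 90 / 20.

```python
GOP_KBITS = 576                      # assumption: inferred from the answers
COUNT = {"I": 1, "P": 3, "B": 8}     # pictures per GOP, read from the sums above

def allocate(index, count=COUNT, gop_kbits=GOP_KBITS):
    """Target kbits per picture, proportional to its complexity index."""
    weight = sum(index[t] * count[t] for t in index)
    return {t: gop_kbits * index[t] / weight for t in index}

part2 = allocate({"I": 20, "P": 10, "B": 7})
part3 = allocate({"I": 20, "P": 10, "B": 5})
part4 = allocate({"I": 20, "P": 40 / 3, "B": 7})   # average P index of 13.3

for name, result in (("part 2", part2), ("part 3", part3), ("part 4", part4)):
    print(name, {t: round(v, 1) for t, v in result.items()})
```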

1. all equal to 48 kbit/s
2. 60 + 3 × 20+8 × 15 = 240

For I = 144 kbits, for P = 48 kbits and for B = 36 kbits

1. L
2. L
3. P
4. L
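1.
$$\begin{bmatrix}
4 & -1 & -5 & 0 & 2 & 0 & 0 & -1 \\
7 & 0 & 2 & 1 & -1 & 0 & 0 & -1 \\
0 & 1 & 0 & 1 & -1 & 0 & 0 & 0 \\
0 & -4 & 0 & 0 & 0 & 0 & 0 & 0 \\
-2 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$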

2D-events: (0,4)(0,-1)(0,7)(2, -5)(1, 2) (0, 1)(1, -2) (0, -4)(1, 1)(0, 2)(1, -1) (0, 1)(7, -1)(2, -1)(4, 1)(5, 1) (2, -1)EOB (using Figure 6.12, the bits including the sign bit)

8 + 5 + 11 + 20 + 7 + 5 + 7 + 8 + 6 + 5 + 6 + 5 + 9 + 6 + 8 + 8 + 6 + 2 =132 bits

2. The base layer events: (0,4)(0,-1)(0,7)(2,-5)(1,2)(0,1)(1,-2)(0,-4) + PBP; the bits: 8 + 5 + 11 + 20 + 7 + 5 + 7 + 8 + 6 = 77 bits

The enhancement layer events: (1,1)(0,2)(1,-1)(0,1)(7,-1)(2,-1)(4,1)(5,1)(2,-1) + EOB; the bits: 6 + 5 + 6 + 5 + 9 + 6 + 8 + 8 + 6 + 2 = 61 bits

Total bits 77 + 61 = 138 bits, about 4.5 per cent extra over one-layer coding

3. base:

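$$\begin{bmatrix}
2 & 0 & -2 & 0 & 1 & 0 & 0 & 0 \\
4 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$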

enhancement:

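$$\begin{bmatrix}
0 & -1 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 1 & -1 & 0 & 0 & -1 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$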

Base layer events: (0, 2)(1, 4)(2, -2)(1, 1)(2, -1)(0, -2)(2, 1)(10, -1)EOB

Bits: 5 + 9 + 7 + 6 + 6 + 5 + 6 + 10 + 2 = 56 bits

Enhancement layer events: (1, -1)(6, 1)(4, 1)(2, -1)(0, 1)(10, -1)(10, 1)(2, -1)EOB

Bits: 6 + 9 + 8 + 6 + 5 + 10 +10 + 6 + 2 = 62 bits

Total bits = 56 + 62 = 118

Note: the overall bit rate is less than the one layer, but the distortion will be larger.

4. Mbit/s, 28.8/8 = 3.6, thus three TV programmes.
1. for B = 1, α = 1 and β = $10^{-5}$
2. for B = 5, α = 0.2 and β = $2 \times 10^{-6}$
1. 34 980 Mbits = 4.3725 Gbytes
2. mean bit rate = 34 980/(90 × 60) = 6.478 Mbit/s
3. CBR = 90 × 60 × 20/8 = 13.5 Gbytes, peak-to-mean = 3.09
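The storage and bit-rate figures above are simple unit conversions. A small sketch with the numbers taken from the answers (90 minutes of video, 34 980 Mbits in total, 20 Mbit/s peak rate); the variable names are illustrative.

```python
TOTAL_MBITS = 34_980
DURATION_S = 90 * 60
PEAK_MBIT_S = 20

storage_gbytes = TOTAL_MBITS / 8 / 1000            # Mbits -> Mbytes -> Gbytes
mean_mbit_s = TOTAL_MBITS / DURATION_S
cbr_gbytes = DURATION_S * PEAK_MBIT_S / 8 / 1000   # storage if coded at the peak rate
peak_to_mean = PEAK_MBIT_S / mean_mbit_s

print(storage_gbytes, round(mean_mbit_s, 3), cbr_gbytes, round(peak_to_mean, 2))
```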
1. if there is an error in any video bit, the packet is in error; $P = 47 \times 8 \times 10^{-7} = 3.76 \times 10^{-5}$
2. if there is an error in the header, the packet is lost; $P = 5 \times 8 \times 10^{-7} = 4 \times 10^{-6}$
5. available link rate = $50 \times 30 \times 10^{-2}$ = 15 Mbit/s

total data to be sent = 34 980 × 53/47 = 39 445.5 Mbits

time required = 39 445.5/15 Mbits/s = 43 min 50 s
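The transfer time above combines the cell overhead (53 bytes sent for every 47 bytes of video) with the available link rate. A minimal sketch of that calculation, using only the figures quoted in the answer:

```python
VIDEO_MBITS = 34_980
CELL_BYTES, PAYLOAD_BYTES = 53, 47
LINK_MBIT_S = 15

total_mbits = VIDEO_MBITS * CELL_BYTES / PAYLOAD_BYTES   # data including cell headers
seconds = total_mbits / LINK_MBIT_S
minutes, rem_seconds = divmod(round(seconds), 60)

print(round(total_mbits, 1), f"{minutes} min {rem_seconds} s")
```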

6. 25 × 4 = 100 Mbits/s, load and the error rate is P = .
7. With SNR scalability, assume 30 per cent more load, then total load = 100 × 1.3 = 130 Mbit/s, of which 65 Mbit/s is assigned to the base layer. Since the base layer has an absolute priority, then network load for the base layer:

and at this load $P = 1.72 \times 10^{-8}$.

The enhancement layer sees the whole load of 130 Mbit/s, thus the load will be:

and the error rate will be P = 0.088.

8. In data partitioning, with 4 per cent extra bits over one layer and 50 per cent to the base layer, the base layer load will be:

and the error rate $P = 2.7 \times 10^{-9}$. The enhancement layer has a load of 2 × 0.3783 = 0.7566 that leads to an error rate of $P = 5.3 \times 10^{-5}$.

9. With spatial scalability, assuming 50 per cent more bits over one layer and 50 per cent assigned to the base layer, then the allocated bits to the base layer will be 75 Mbit/s. Base layer load is:

leading to $P = 9.5 \times 10^{-8}$. For the enhancement layer, the load will be more than 100 per cent and the loss probability will be P = 1!

1. Prepend 0 to all events of problem 3 of Chapter 8, except the last event, where 1 should be prepended, and there is no need for EOB, e.g. first event (0, 0, 4) and the last event (1, 2, -1)
2. For x, the median of (3, 4, -1) is 3 and for y, the median of (-3, 3, 1) is 1. Hence the prediction vector is (3, 1) and MVD = (2 - 3 = -1; 1 - 1 = 0) = (-1, 0)
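The median prediction in the second answer can be sketched as follows. The x and y components come from the answer; pairing them into three candidate vectors in the order shown is an assumption made only for illustration, as is the function name.

```python
from statistics import median

def mv_difference(mv, candidates):
    """Component-wise median prediction and the resulting vector difference."""
    prediction = (median(c[0] for c in candidates),
                  median(c[1] for c in candidates))
    mvd = (mv[0] - prediction[0], mv[1] - prediction[1])
    return prediction, mvd

# x components (3, 4, -1) and y components (-3, 3, 1); the pairing is assumed
candidates = [(3, -3), (4, 3), (-1, 1)]
prediction, mvd = mv_difference((2, 1), candidates)
print(prediction, mvd)
```

Since the median is taken separately for each component, the predicted vector need not be equal to any one of the three candidates.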
1. thus B1 = 150 - 8=142 and C1 = 115 + 8 = 123

2. d = -31.25, and d1 = 0, hence B and C do not change.
1. (3, 4),
2. (0, -3),
3. (1, 0.5),
4. (3, 2.6)
5. (-1, -1).
3. For a matrix to be orthonormal, the inner product of each row with itself should be 1. For rows 1 and 3 (basis vectors 0 and 2) this product is 4, hence their elements should be divided by $\sqrt{4} = 2$. For rows 2 and 4 the product is 4 + 1 + 1 + 4 = 10, hence their elements should be divided by $\sqrt{10}$.

Thus the forward 4 × 4 integer transform becomes

And the inverse transform is its transpose

As can be tested, this inverse transform is orthonormal, e.g.:
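The orthonormality claim can be tested numerically. The matrices themselves are missing from this text, so the sketch below assumes the widely cited 4 × 4 integer core with rows (1, 1, 1, 1), (2, 1, -1, -2), (1, -1, -1, 1) and (1, -2, 2, -1); its row products of 4 and 10 agree with the values quoted in the answer, but the matrix is an assumption here and not a quotation.

```python
import math

# Assumed integer transform core (not reproduced in this appendix text).
CORE = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

def normalise(matrix):
    """Divide each row by the square root of its inner product with itself."""
    return [[x / math.sqrt(sum(v * v for v in row)) for x in row] for row in matrix]

def gram(matrix):
    """Matrix times its own transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix] for r1 in matrix]

row_products = [sum(v * v for v in row) for row in CORE]
T = normalise(CORE)
G = gram(T)   # should be the 4 x 4 identity if T is orthonormal

print(row_products)
for row in G:
    print([round(x, 6) for x in row])
```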

4. With the integer transform of problem 5, the two-dimensional transform coefficients will be

$$\begin{bmatrix}
431 & -156 & 91 & -15 \\
43 & 52 & 30 & 1 \\
-6 & -46 & -26 & -7 \\
-13 & 28 & -19 & 14
\end{bmatrix}$$

The reconstructed pixels with the retained coefficients are; for N = 10:

$$\begin{bmatrix}
105 & 121 & 69 & 21 \\
69 & 85 & 62 & 44 \\
102 & 100 & 98 & 119 \\
196 & 175 & 164 & 195
\end{bmatrix}$$

which gives an MSE error of 128.75, or PSNR of 27.03 dB. The reconstructed pixels with the retained 6 and 3 coefficients give PSNR of 22.90 and 18 dB, respectively.

With 4 × 4 DCT, these values are 26.7, 23.05 and 17.24 dB, respectively.

As we can see, the integer transform performs about as well as the DCT. Where it appears slightly better, this is due to the approximation of the cosine elements.

5. index-0 = QP

1. index-8 = 2QP
2. index-16 = 4QP
3. index-24 = 8QP....index-40 = 32QP
4. index-48 = 64QP
6. Compared with H.263, at lower indices H.263 is coarser, e.g. at index-8 the quantiser parameter for H.263 is 8QP, but for H.26L is 2QP etc.

At higher indices, the largest quantiser parameter for H.263 is 31QP, but that of H.26L is 88 QP, hence at larger indices H.26L has a coarser quantiser.

1. For A: c0 = c2 = c3 = 1, index = 1 + 4 + 8 = 13, the given frequency table in Appendix D is for prob(0), hence prob(0) = 29789, but since A is a 1 pixel, then its probability prob(1) = 65535 - 29789 = 35746 out of 65535.

For B: c0 = c1 = c2 = c3 = c4 = 1, and index = 1 + 2 + 4 + 8 + 16 = 31, prob(0) = 6554. As we see this odd pixel of 0 among the 1s has a lower probability.

For C: c1 = c2 = c3 = c4 = c5 = c7 = 1, and the index becomes 190. Like pixel A the prob(0) = 91, but its prob(1) = 65535 - 91 = 65444 out of 65535, which is as expected.
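The context index is the binary number formed by the context pixels, with pixel c_k contributing 2^k when it is set. A minimal sketch of the index and probability arithmetic used for A, B and C; the prob(0) values are those quoted above from Appendix D, and the function names are illustrative.

```python
def context_index(set_pixels):
    """Index formed from the positions k of the context pixels c_k that equal 1."""
    return sum(1 << k for k in set_pixels)

def prob_of_one(prob_zero, total=65535):
    """The table stores prob(0); prob(1) is its complement out of 65535."""
    return total - prob_zero

index_a = context_index([0, 2, 3])
index_b = context_index(range(5))
index_c = context_index([1, 2, 3, 4, 5, 7])

print(index_a, prob_of_one(29789))
print(index_b)
print(index_c, prob_of_one(91))
```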

2. The chain code is 0, 1, 0, 1, 7, 7. The differential chain code will be: 0, 1, -1, 1, -2, 0 with 1 + 2 + 3 + 2 + 5 + 1 = 14 bits
3. At level 2 the indices (without swapping) will be 0 for the two blank blocks and index = 27 × 2 + 9 × 2 + 0 + 2 = 74 and index = 0 + 0 + 3 × 2 + 2 = 8 for the two blocks. At the upper level the index is: index = 0 + 0 + 3 × 1 + 1 = 4
4. The coefficients of the shape adaptive DCT will be

 127 -40 1 10 -15 23 -7 -6

confined to the top left corner, while for those of the normal DCT with padded zeros, the significant coefficients are scattered all over the 8 × 8 area.
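The differential chain code in answer 2 above can be reproduced by differencing successive directions modulo 8 and mapping the result into a signed range. This is a sketch only; the choice of the range -3..4 for the wrapped differences is an assumption that is consistent with the values in the answer.

```python
def differential_chain_code(chain, directions=8):
    """First direction unchanged, then signed differences modulo `directions`."""
    result = [chain[0]]
    for previous, current in zip(chain, chain[1:]):
        diff = (current - previous) % directions
        if diff > directions // 2:      # wrap 5, 6, 7 to -3, -2, -1
            diff -= directions
        result.append(diff)
    return result

print(differential_chain_code([0, 1, 0, 1, 7, 7]))
```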

Standard Codecs: Image Compression to Advanced Video Coding (IET Telecommunications Series)
ISBN: 0852967101
Year: 2005
Pages: 148
Authors: M. Ghanbari