Appendix F Solutions to the Problems

  1.
          r   g   b   y   c   m   w
      R   1   0   0   1   0   1   1
      G   0   1   0   1   1   0   1
      B   0   0   1   0   1   1   1

  2.
            r     g     b     y     c     m     w
      Y     82   145    41   210   170   107   235
      cb    90    54   240    16   166   202   128
      cr   240    34   110   146    16   222   128

  3. and
  4. ; 864 - 720 = 144 pixels;
  5. 857 pixels; 857 - 720 = 137 pixels; 10 μs
    1. 720 × 576 × 25 × 2 × 8 = 166 Mbit/s
    2. 720 × 480 × 30 × 2 × 8 = 166 Mbit/s
    3. 360 × 288 × 25 × 1.5 × 8 = 31 Mbit/s
    4. 360 × 240 × 30 × 1.5 × 8 = 31 Mbit/s
    5. 37 Mbit/s
    6. 31 Mbit/s
    7. 31 Mbit/s
    8. 4.7 Mbit/s
    9. 1.4 Mbit/s
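The raw bit rates in parts 1 to 4 are the product of width, height, frame rate, samples per pixel and bits per sample. A minimal Python sketch of that arithmetic follows; the function name `raw_bitrate_mbps` is only illustrative, and 1 Mbit is taken as 10^6 bits.

```python
def raw_bitrate_mbps(width, height, fps, samples_per_pixel, bits=8):
    """Uncompressed video bit rate in Mbit/s (1 Mbit = 10**6 bits)."""
    return width * height * fps * samples_per_pixel * bits / 1e6

# 2 samples per pixel for the full-size formats, 1.5 for the
# quarter-size formats, as in the answers above.
formats = [
    (720, 576, 25, 2),
    (720, 480, 30, 2),
    (360, 288, 25, 1.5),
    (360, 240, 30, 1.5),
]
for w, h, fps, spp in formats:
    print(w, h, fps, spp, round(raw_bitrate_mbps(w, h, fps, spp)), "Mbit/s")
```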
  6. 94 73 194 184 50 204 207
  7. 94 82 73 132 194 201 184 109 50 121 204 222 207 PSNR = 20.2 dB
    1. A sinusoid with amplitude A has a peak-to-peak value of $2A = 2^n \Delta$ ⇒ $\Delta = A\,2^{1-n}$

    2. The peak-to-peak power of the sinusoid, $(2A)^2 = 4A^2$, is 8 times (9 dB) higher than its mean power $A^2/2$ ⇒ PSNR = 10.78 + 6n

    3. 10.78 + 6n ≥ 58 ⇒ n ≥ 8 bits

    1. |x| ≤ 16: y = 0; 16 < |x| ≤ 32: y = ±24; 32 < |x| ≤ 48: y = ±40; etc.
    2. |x| ≤ 16: y = ±8; 16 < |x| ≤ 32: y = ±24; 32 < |x| ≤ 48: y = ±40; etc.
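The two characteristics above differ only in the innermost interval. A small Python sketch, assuming a step size of 16 with reconstruction at the interval midpoint; the function name `quantise` and the choice of +8 for x = 0 in the case without a dead zone are assumptions, not taken from the text.

```python
import math

def quantise(x, step=16, dead_zone=True):
    """Reconstruction level of a uniform quantiser.

    |x| is sorted into intervals (k*step, (k+1)*step] and rebuilt at
    the interval midpoint. With dead_zone=True the innermost interval
    |x| <= step is rebuilt as 0 instead of +/- step/2.
    """
    k = max(math.ceil(abs(x) / step) - 1, 0)   # interval index
    if dead_zone and k == 0:
        return 0
    level = k * step + step // 2
    return level if x >= 0 else -level         # x = 0 maps to +step/2 (assumption)

for x in (10, 17, -40):
    print(x, quantise(x, dead_zone=True), quantise(x, dead_zone=False))
```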
  1. 12 16 28 240 196 32 PSNR = 43.4 dB
  2. 6 12 27 77 127 77 PSNR = 10.7 dB
  3. 15 21 19 21 19 69 119 169 219 234 232 230 232 230

  4.
    1. 364 15 -211 -26 -5 38 -2 -1
    2. the basis vector of the second AC coefficient matches the input pixels
    3. 35 82 190 250 200 150 101 23. Due to mismatch (approximating the cosine elements) some of the input pixels cannot be reconstructed, e.g. 81/82 and 100/101.
  5. quantised coefficients:

    360   0   -216   -24   0   40   0   0

    reconstructed pixels: PSNR = 30 dB

    30   70   185   250   204   153   104   27
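The reconstruction from the quantised coefficients can be approximated with a floating-point inverse DCT. The sketch below assumes the orthonormal 8-point DCT; because the printed answers use approximated cosine elements, the values it produces may differ from the printed pixels by a grey level or two.

```python
import math

def idct_1d(coeffs):
    """Orthonormal inverse DCT of a 1-D list of coefficients."""
    n = len(coeffs)
    pixels = []
    for i in range(n):
        value = coeffs[0] / math.sqrt(n)                      # DC term
        for k in range(1, n):                                 # AC terms
            basis = math.cos((2 * i + 1) * k * math.pi / (2 * n))
            value += math.sqrt(2 / n) * coeffs[k] * basis
        pixels.append(value)
    return pixels

quantised = [360, 0, -216, -24, 0, 40, 0, 0]
print([round(p) for p in idct_1d(quantised)])
```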

    1. reconstructed pixels: PSNR = 10.4 dB

      128   128   128   128   128   128   128   128

    1. reconstructed pixels: PSNR = 23.4 dB

      28   87   169   227   227   169   87   28

    1. mv(-1, -1)
    2. mv(-1, -1)
    1. 169
      1. multiplications = 256 × 169, additions = 511 × 169
      2. multiplications = 0, additions = 511 × 169
  6.
      type   operations   multiplications   additions
      TDL    23           23 × 256          23 × 511
      TSS    25           25 × 256          25 × 511
      CSA    17           17 × 256          17 × 511
      OSA    13           13 × 256          13 × 511
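The operation counts above all follow one pattern: every search position costs 256 multiplications (only when a squared-error measure is used) and 511 additions. A short sketch, assuming a 16 × 16 block and reading 511 as 256 differences plus 255 accumulations; the 169 positions of the full search are taken here as (2 × 6 + 1)², which is an inference from the number itself.

```python
def search_cost(positions, block=16, squared_error=True):
    """Return (multiplications, additions) for a block-matching search."""
    pixels = block * block
    multiplications = positions * pixels if squared_error else 0
    additions = positions * (2 * pixels - 1)
    return multiplications, additions

full_search = (2 * 6 + 1) ** 2          # 169 positions (w = 6 is assumed)
print(search_cost(full_search))                       # squared-error measure
print(search_cost(full_search, squared_error=False))  # absolute-error measure
for name, positions in (("TDL", 23), ("TSS", 25), ("CSA", 17), ("OSA", 13)):
    print(name, search_cost(positions))
```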

    1. a = 010,
    2. b = 1,
    3. c = 00,
    4. d = 011, av bits = 1.8, entropy = 1.72
    1. cbdad = 001011010011
      1. 1st bit in error, decoded string = babbad
      2. 3rd bit in error, decoded string = ccbbad
      3. 5th bit in error, decoded string = cbcbad
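The effect of single bit errors on the variable-length code can be reproduced with a few lines of Python. The symbol-to-codeword mapping used below (a = 010, b = 1, c = 00, d = 011) is inferred from the encoded string cbdad = 001011010011; the helper names are illustrative.

```python
CODE = {"a": "010", "b": "1", "c": "00", "d": "011"}   # inferred mapping

def encode(symbols):
    return "".join(CODE[s] for s in symbols)

def decode(bits):
    """Greedy decoding; valid because the code is prefix-free."""
    inverse = {word: sym for sym, word in CODE.items()}
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            out.append(inverse[word])
            word = ""
    return "".join(out)

def flip(bits, position):
    """Invert the bit at a 1-based position."""
    i = position - 1
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

stream = encode("cbdad")
for position in (1, 3, 5):
    print(position, decode(flip(stream, position)))
```

Note that a single bit error changes both the symbols and the length of the decoded string, since the decoder loses codeword alignment until it happens to resynchronise.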
  7. lower value = 0.83875, upper value = 0.841875
    1. the first three symbols = cbc
    2. the first five symbols = cbcab
  8. the same as 15
  9. 11010110111
  10. the same as 17
    1. first bit in error = 0.01010110111 = $2^{-2} + 2^{-4} + 2^{-6} + 2^{-7} + 2^{-9} + 2^{-10} + 2^{-11}$ = 0.33935546875, which is decoded to string bbacb
    2. similarly, with the third bit in error the decimal number would be 0.96435546875, decoded to dbacb
    3. with the fifth bit in error, the decimal number is 0.87060546875 and it is decoded to string cbacb.
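The decimal values quoted for the corrupted arithmetic code words are plain binary fractions. A minimal sketch of the conversion, with illustrative helper names:

```python
def binary_fraction(bits):
    """Value of 0.b1b2b3... read as a binary fraction."""
    return sum(int(b) / 2 ** (i + 1) for i, b in enumerate(bits))

def flip(bits, position):
    """Invert the bit at a 1-based position."""
    i = position - 1
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

codeword = "11010110111"
print(binary_fraction(codeword))           # lies inside [0.83875, 0.841875)
for position in (1, 3, 5):
    print(position, binary_fraction(flip(codeword, position)))
```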

  1. $P(z) = H_0(z)H_1(-z)$ should be factorised into two terms. The given P(z) is zero at $z^{-1} = -1$, hence it is divisible by $1 + z^{-1}$. Dividing as many times as possible gives:

    Thus in (i) the lowpass analysis filter will be:

    and the highpass analysis filter is:

    • $H_1(z) = 1 - z^{-1}$

    In (ii) and the highpass:

    which are the (5,3) subband filter pairs.

    In (iii) and the highpass

    which gives the second set of (4,4) subband filters.

    In (iv) and

    Any other combinations may be used, as desired.

    1. Thus $P(z) - P(-z) = 2z^{-1}$, which results in one sample delay.
    1. With a weighting factor of k, $P(z) = k(1 + z^{-1})^4(-1 + 4z^{-1} - z^{-2})$, and using $P(z) - P(-z) = 2z^{-m}$, gives k = 1/16 and m = 3.
    2. The factor for the other set will be k = 3/256 and m = 5 samples delay.
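The values k = 1/16 and m = 3 can be checked by expanding the product polynomial and keeping the odd powers, which are the only ones that survive in P(z) − P(−z). A sketch with polynomials stored as coefficient lists in ascending powers of z⁻¹; the helper names are illustrative.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_pow(a, n):
    result = [1]
    for _ in range(n):
        result = poly_mul(result, a)
    return result

# (1 + z^-1)^4 (-1 + 4 z^-1 - z^-2), without the weighting factor k
p = poly_mul(poly_pow([1, 1], 4), [-1, 4, -1])

# P(z) - P(-z): even powers cancel, odd powers double
odd_terms = {power: 2 * c for power, c in enumerate(p) if power % 2 and c}
(m, gain), = odd_terms.items()       # exactly one odd term is expected
k = Fraction(2, gain)                # scale so that the term becomes 2 z^-m
print(p, m, k)
```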
  2. In problem 3, P(z) is in fact type (iv) of problem 1. Hence it leads not only to the two sets of (5,3) and (4,4) filter pairs, but also to two new types of filter, given in (i) and (iv) of problem 1.
  3. With

    retaining $H_1(-z) = (1 + z^{-1})^2 = 1 + 2z^{-1} + z^{-2}$, that gives the three-tap highpass filter $H_1(z) = 1 - 2z^{-1} + z^{-2}$, and the remaining parts give the nine-tap lowpass filter

    or

    Had we divided the highpass filter coefficients, H1(z), by , and hence multiplying those of lowpass H0(z) by this amount, we get the 9/3 tap filter coefficients of Table 4.2.

  4. Use G0(z) = H1(-z) and G1(z) = -H0(-z) to derive the synthesis filters
  5. See pages 84 and 88
  6. 33 bits
  7. 29 bits

    1. multiply all the luminance matrix elements by α = 50/25 = 2
    2. , and small elements of the matrix will be 1 and larger ones become 2.
    3. α = 0, hence all the matrix elements will be 1.
  1. the same as problem 1.
    1.
      62   0   3   1   0   0   1   0
       0   1  -1   0   1   0   0   0
       0   0   0   0   1   0   0   0
      -1   0   0   0   0   0   0   0
       0   0   1   0   0   0   0   0
       0   0   0   0   0   0   0   0
       0   0   0   0   0   0   0   0
       0   0   0   0   0   0   0   3

    2.
      31   0   1   0   0   0   0   0
       0   0   0   0   0   0   0   0
       0   0   0   0   0   0   0   0
       0   0   0   0   0   0   0   0
       0   0   0   0   0   0   0   0
       0   0   0   0   0   0   0   0
       0   0   0   0   0   0   0   0
       0   0   0   0   0   0   0   1

  2. for 50% quality: DIFF = 62 - 50 = 12, symbol_1 = 4; symbol_2 = 12

    scanned pairs: (3,1)(0,3)(0,1)(0,-1)(1,-1)(6,1)(6,1)(1,1)(1,1)(35,3)

    events: (3,1)(0,2)(0,1)(0,1)(1,1)(6,1)(6,1)(1,1)(1,1)(15,0)(15,0)(3,2)

    for 25% quality: DIFF = 31 - 50 = -19, symbol_1 = 5, symbol_2 = -19 - 1 = -20

    scanned pairs: (4,1)(57,1)

    events: (4,1)(15,0)(15,0)(15,0)(9,1)

  3. for DC: DIFF= -19 ⇒ CAT = 5; DIFF-1 = -20

    VLC for CAT = 5 is 110 and -20 in binary is 11101100, hence the VLC for the DC coefficient is 11001100

    for AC, using the AC VLC tables:

    for each (15,0) the VLC is 11111111001

    and for (9, 1) the VLC is 111111001

    total number of bits: 8 + 3 × 11 + 9 = 50 bits.
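The conversion of scanned (run, level) pairs into (run, category) events, and the construction of the DC code word, can be sketched as follows. The category is taken as the number of bits needed for the magnitude, runs longer than 15 are split with (15,0) events that each stand for 16 zeros, and the category code "110" for CAT = 5 is copied from the answer above rather than looked up in a table. Function names are illustrative.

```python
def category(value):
    """Size category: number of bits needed to represent |value|."""
    return abs(value).bit_length()

def ac_events(pairs):
    """Turn (run, level) pairs into (run, category) events."""
    events = []
    for run, level in pairs:
        while run > 15:
            events.append((15, 0))   # stands for a run of 16 zeros
            run -= 16
        events.append((run, category(level)))
    return events

def dc_codeword(diff, category_code):
    """Category code followed by the low-order bits of DIFF (or DIFF - 1)."""
    cat = category(diff)
    value = diff if diff >= 0 else diff - 1
    if cat == 0:
        return category_code
    low_bits = format(value & ((1 << cat) - 1), "0{}b".format(cat))
    return category_code + low_bits

print(ac_events([(4, 1), (57, 1)]))
print(dc_codeword(-19, "110"))
```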

  4. At bit-plane 6 coefficient 65 at clean-up pass. At bit-plane 5 coefficient 65 at all passes and coefficient 50 at clean-up pass.

    1. 33, 198
    2. 396, 2376
  1. ; QCIF:
    1. MC
    2. NOMC
    3. NOMC
  2. due to motion vector overhead
    1. inter
    2. intra
    3. inter
  3. For small values in intra mode DC still needs eight bits, while in inter mode it is less.
    1. 63
    2. 60
    3. 3
  4.
    83   0   2   1   0   0   4   0
     0   0  -1   0   3   0   0   0
     0   0   0   0   3  -1   0   0
    -1   1   0   0   0   0   0   0
     0   0   3   0   2   0   0   0
     0   0   0   0   0   0   0   1
    -1   4   0   0   0   0   0   0
     2   0   0   0   0   0   0  31

    events: (0, 83)(4, 2)(0,1)(0, -1)(1, -1)(1, 1)(4, 3)(4, -1)(1,3)(1,3)(1,4)(2, -1) (3, 4)(0, 2) (3, 2)(20,1)(2, 31)

    number of bits (including the sign bit): 20 + 9 + 5 + 5 + 6 + 6 + 11 + 8 + 8 + 8 + 9 + 7 + 11 + 5 + 8 + 20 + 20 = 166

    no EOB is used, as the last coefficient is coded
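The list of events above can be reproduced by a zigzag scan of the 8 × 8 block that counts the zeros preceding each non-zero coefficient. A minimal sketch, with the DC coefficient included as the first event as in the answer; the function names are illustrative.

```python
def zigzag_order(n=8):
    """Cell coordinates (row, col) in zigzag scanning order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Odd anti-diagonals are walked downwards, even ones upwards.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def run_level_events(block):
    """(run, level) events of a block scanned in zigzag order."""
    events, run = [], 0
    for r, c in zigzag_order(len(block)):
        if block[r][c] == 0:
            run += 1
        else:
            events.append((run, block[r][c]))
            run = 0
    return events

block = [
    [83, 0,  2, 1, 0,  0, 4,  0],
    [ 0, 0, -1, 0, 3,  0, 0,  0],
    [ 0, 0,  0, 0, 3, -1, 0,  0],
    [-1, 1,  0, 0, 0,  0, 0,  0],
    [ 0, 0,  3, 0, 2,  0, 0,  0],
    [ 0, 0,  0, 0, 0,  0, 0,  1],
    [-1, 4,  0, 0, 0,  0, 0,  0],
    [ 2, 0,  0, 0, 0,  0, 0, 31],
]
print(run_level_events(block))
```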

  5. 159 149 170 105 113 133

  6. At 384 kbit/s, P = 6; with q = 62 and P = 6 the buffer content is 36 kbits, so the left-over capacity = 5000 × 8 - 36 000 = 4000 bits

    1. each operation = $(2 \times 15 + 1)^2 + 8 = 969$, total operations = 3 × 969 = 2907
    2. ω = 3 × 15 = 45, total number of operations = $(2 \times 45 + 1)^2 + 8 = 8289$
    1. the first B-picture is closer to its forward prediction than its backward prediction picture.
    2. for the second B-picture, FWD = 9 and BWD = 5
    3. for P-picture $(2 \times 13 + 1)^2 + 8 = 737$, and for each B-picture $(2 \times 5 + 1)^2 + 8 + (2 \times 9 + 1)^2 + 8 = 498$
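All the search counts above use the same expression, (2w + 1)² + 8, for a search range of ±w. A short sketch of the arithmetic; the extra 8 positions are taken directly from the answers (presumably the half-pixel refinement positions, which is an assumption here).

```python
def search_positions(w, extra=8):
    """Positions examined for a search range of +/- w pixels."""
    return (2 * w + 1) ** 2 + extra

telescopic = 3 * search_positions(15)    # three searches of range 15
direct = search_positions(3 * 15)        # one search of range 45
p_picture = search_positions(13)
b_picture = search_positions(5) + search_positions(9)
print(telescopic, direct, p_picture, b_picture)
```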
  1. Average and the complexity index is 50 × 1000 × 13 = $65 \times 10^4$.
  2. (8 × 7) + (3 × 10) + 20 = 106

    for I = 108.7 kbits, for P = 54.3 kbits and for B = 38 kbits.

  3. The new index ratio for B becomes 7/1.4 = 5

    (8 × 5) + (3 × 10) + 20 = 90, and bits for I = 128 kbits, for P = 64 and for B = 32 kbits

  4. 20 + 10 + 10 + 20 + 8 × 7=116

    for P, the average index of (10 + 10 + 20)/3 = 13.3 should be used, hence the target bit rates for I = 99.3 kbits, for P = 66.2 kbits and for B = 34.75 kbits.

    1. all equal to 48 kbit/s
    2. 60 + 3 × 20 + 8 × 15 = 240

      for I = 144 kbits, for P = 48 kbits and for B = 36 kbits
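The target bits in these answers are each picture's complexity index divided by the weighted sum over the group of pictures, times the bits available for the group. The sketch below assumes a budget of 576 kbits per group of 12 pictures; that figure is not stated in the answers but is inferred from them (for example 12 × 48 kbits, or 128 × 90/20).

```python
GOP_KBITS = 576   # assumption: inferred from the answers, not stated there

def target_kbits(index, weighted_sum, total=GOP_KBITS):
    """Share of the group budget for one picture with the given index."""
    return total * index / weighted_sum

# 1 I-, 3 P- and 8 B-pictures with indices 20, 10 and 7
total_index = 8 * 7 + 3 * 10 + 20
for name, index in (("I", 20), ("P", 10), ("B", 7)):
    print(name, round(target_kbits(index, total_index), 1))
```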

    1. L
    2. L
    3. P
    4. L
  1.
     4  -1  -5   0   2   0   0  -1
     7   0   2   1  -1   0   0  -1
     0   1   0   1  -1   0   0   0
     0  -4   0   0   0   0   0   0
    -2   0   0   0   1   0   0   0
     0   0   1   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0

    2D-events: (0,4)(0,-1)(0,7)(2,-5)(1,2)(0,1)(1,-2)(0,-4)(1,1)(0,2)(1,-1)(0,1)(7,-1)(2,-1)(4,1)(5,1)(2,-1) EOB

    bits, including the sign bit (using Figure 6.12): 8 + 5 + 11 + 20 + 7 + 5 + 7 + 8 + 6 + 5 + 6 + 5 + 9 + 6 + 8 + 8 + 6 + 2 = 132 bits

  2. The base layer events: (0,4)(0,-1)(0,7)(2,-5)(1,2)(0,1)(1,-2)(0,-4) + PBP; the bits: 8 + 5 + 11 + 20 + 7 + 5 + 7 + 8 + 6 = 77 bits

    The enhancement layer events: (1,1)(0,2)(1,-1)(0,1)(7,-1)(2,-1)(4,1)(5,1)(2,-1) + EOB; the bits: 6 + 5 + 6 + 5 + 9 + 6 + 8 + 8 + 6 + 2 = 61 bits

    Total bits 77 + 61 = 138 bits, about 4.5 per cent extra over one-layer coding

  3. base:

     2   0  -2   0   1   0   0   0
     4   0   1   0   0   0   0   0
     0   0   0   0  -1   0   0   0
     0  -2   0   0   0   0   0   0
    -1   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0

    enhancement:

     0  -1   0   0   0   0   0  -1
     0   0   0   1  -1   0   0  -1
     0   1   0   1   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   1   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0

    Base layer events: (0, 2)(1, 4)(2, -2)(1, 1)(2, -1)(0, -2)(2, 1)(10, -1)EOB

    Bits: 5 + 9 + 7 + 6 + 6 + 5 + 6 + 10 + 2 = 56 bits

    Enhancement layer events: (1, -1)(6, 1)(4, 1)(2, -1)(0, 1)(10, -1)(10, 1)(2, -1)

    EOB

    Bits: 6 + 9 + 8 + 6 + 5 + 10 +10 + 6 + 2 = 62 bits

    Total bits = 56 + 62 = 118

    Note: the overall bit rate is less than the one layer, but the distortion will be larger.

  4. Mbit/s, 28.8/8 = 3.6, thus three TV programmes.
    1. for B = 1, α = 1 and β = $10^{-5}$
    2. for B = 5, α = 0.2 and β = $2 \times 10^{-6}$
    1. 34 980 Mbits = 4.3725 Gbytes
    2. mean bit rate = 34 980/(90 × 60) = 6.478 Mbit/s
    3. CBR = 90 × 60 × 20/8 = 13.5 Gbytes, peak-to-mean = 3.09
    1. if there is an error in any video bit, the packet is in error; P = 47 × 8 × $10^{-7}$ = $3.76 \times 10^{-5}$
    2. if there is an error in the header, the packet is lost; P = 5 × 8 × $10^{-7}$ = $4 \times 10^{-6}$
  5. available link rate = 50 × 30 × $10^{-2}$ = 15 Mbit/s

    total data to be sent = 34 980 × 53/47 = 39 445.5 Mbits

    time required = 39 445.5 Mbits / 15 Mbit/s ≈ 2630 s = 43 min 50 s
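The storage, bit rate, cell error and transfer time figures above are simple arithmetic on the 34 980 Mbits of video, the 90 minute duration, the 20 Mbit/s peak rate and a bit error rate of 10⁻⁷. The sketch below restates that arithmetic; the variable names are illustrative, and the error probabilities use the small-probability approximation (number of bits × bit error rate) used in the answers.

```python
total_mbits = 34980
duration_s = 90 * 60
peak_mbps = 20
ber = 1e-7

storage_gbytes = total_mbits / 8 / 1000             # variable bit rate storage
mean_mbps = total_mbits / duration_s
cbr_gbytes = duration_s * peak_mbps / 8 / 1000      # storage at the peak rate
peak_to_mean = peak_mbps / mean_mbps

p_payload_error = 47 * 8 * ber    # any of the 47 payload bytes hit
p_header_error = 5 * 8 * ber      # any of the 5 header bytes hit

link_mbps = 50 * 30 / 100
cell_mbits = total_mbits * 53 / 47                  # 53-byte cells, 47 bytes of video
minutes, seconds = divmod(round(cell_mbits / link_mbps), 60)

print(storage_gbytes, round(mean_mbps, 3), cbr_gbytes, round(peak_to_mean, 2))
print(p_payload_error, p_header_error)
print(link_mbps, round(cell_mbits, 1), minutes, seconds)
```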

  6. 25 × 4 = 100 Mbits/s, load and the error rate is P = .
  7. With SNR scalability, assume 30 per cent more load, then total load = 100 × 1.3 = 130 Mbit/s, of which 65 Mbit/s is assigned to the base layer. Since the base layer has absolute priority, the network load for the base layer is:

    and at this load P = $1.72 \times 10^{-8}$.

    For the enhancement layer, it sees the whole load, of 130 Mbits/s, thus the load will be:

    and the error rate will be P = 0.088.

  8. In data partitioning, with 4 per cent extra bits over one layer and 50 per cent to the base layer, the base layer load will be:

    and the error rate P = $2.7 \times 10^{-9}$. The enhancement layer has a load of 2 × 0.3783 = 0.7566 that leads to an error rate of P = $5.3 \times 10^{-5}$.

  9. With spatial scalability, assuming 50 per cent more bits over one layer and 50 per cent assigned to the base layer, then the allocated bits to the base layer will be 75 Mbit/s. Base layer load is:

    leading to P = $9.5 \times 10^{-8}$. For the enhancement layer, the load will be more than 100 per cent and the loss probability will be P = 1!

  1. Prepend 0 to all events of problem 3 of Chapter 8, except the last event, where 1 should be appended, and no need for EOB, e.g. first event (0, 4, 0) and the last event (1, 2, -1)
  2. For x, the median of (3, 4, -1) is 3 and for y, the median of (-3, 3, 1) is 1. Hence the prediction vector is (3, 1) and MVD = (2 - 3 = -1; 1 - 1 = 0) = (-1, 0)
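The median prediction of the motion vector works component by component. A minimal sketch; the pairing of the three candidate vectors as (3, -3), (4, 3) and (-1, 1) is assumed from the order in which the components are listed above, and the function names are illustrative.

```python
def median3(a, b, c):
    return sorted((a, b, c))[1]

def motion_vector_difference(mv, candidates):
    """Return (prediction, MVD) using a component-wise median of three."""
    pred = (median3(*(c[0] for c in candidates)),
            median3(*(c[1] for c in candidates)))
    return pred, (mv[0] - pred[0], mv[1] - pred[1])

prediction, mvd = motion_vector_difference((2, 1), [(3, -3), (4, 3), (-1, 1)])
print(prediction, mvd)
```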
    1. thus B1 = 150 - 8 = 142 and C1 = 115 + 8 = 123

    2. d = -31.25, and d1 = 0, hence B and C do not change.
    1. (3, 4),
    2. (0, -3),
    3. (1, 0.5),
    4. (3, 2.6)
    5. (-1, -1).
  3. In order for a matrix to be orthonormal, the inner product of each row with itself should be 1. In rows 1 and 3 (basis vectors 0 and 2) this product is 4, hence their values should be divided by $\sqrt{4} = 2$. In rows 2 and 4 the product is 4 + 1 + 1 + 4 = 10, hence their values should be divided by $\sqrt{10}$.

    Thus the forward 4 × 4 integer transform becomes

    And the inverse transform is its transpose

    As can be tested, this inverse transform is orthonormal, e.g.:
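The normalisation can be checked numerically. The matrix itself is not reproduced in this text, so the sketch below assumes the usual 4 × 4 integer core matrix with rows (1, 1, 1, 1), (2, 1, -1, -2), (1, -1, -1, 1) and (1, -2, 2, -1), which is consistent with the row products of 4 and 10 quoted above.

```python
import math

# Assumed core matrix; not shown in the text above.
CORE = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def normalise(matrix):
    """Divide every row by the square root of its inner product with itself."""
    return [[v / math.sqrt(sum(x * x for x in row)) for v in row]
            for row in matrix]

def times_transpose(matrix):
    """Product of the matrix with its own transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix]
            for r1 in matrix]

forward = normalise(CORE)
for row in times_transpose(forward):
    print([round(v, 6) for v in row])     # rows of the identity matrix
```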

  4. With the integer transform of problem 5, the two-dimensional transform coefficients will be

     431   -156     91    -15
      43     52     30      1
      -6    -46    -26     -7
     -13     28    -19     14

    The reconstructed pixels with the retained coefficients are; for N = 10:

    105   121    69    21
     69    85    62    44
    102   100    98   119
    196   175   164   195

    which gives an MSE error of 128.75, or PSNR of 27.03 dB. The reconstructed pixels with the retained 6 and 3 coefficients give PSNR of 22.90 and 18 dB, respectively.

    With 4 × 4 DCT, these values are 26.7, 23.05 and 17.24 dB, respectively.

    As we see, the integer transform performs about as well as the DCT. Where it appears slightly better, this is due to the approximation of the cosine elements.
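The PSNR quoted for N = 10 follows from the mean squared error with a peak value of 255. A minimal sketch (the function name `psnr_db` is illustrative):

```python
import math

def psnr_db(mse, peak=255):
    """Peak signal-to-noise ratio in dB for 8-bit pixels."""
    return 10 * math.log10(peak ** 2 / mse)

print(round(psnr_db(128.75), 2))
```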

  5. index-0 = QP

    1. index-8 = 2QP
    2. index-16 = 4QP
    3. index-24 = 8QP....index-40 = 32QP
    4. index-48 = 64QP
  6. Compared with H.263, at lower indices H.263 is coarser, e.g. at index-8 the quantiser parameter for H.263 is 8QP, but for H.26L is 2QP etc.

    At higher indices, the largest quantiser parameter for H.263 is 31QP, but that of H.26L is 88 QP, hence at larger indices H.26L has a coarser quantiser.

  1. For A: c0 = c2 = c3 = 1, index = 1 + 4 + 8 = 13, the given frequency table in Appendix D is for prob(0), hence prob(0) = 29789, but since A is a 1 pixel, then its probability prob(1) = 65535 - 29789 = 35746 out of 65535.

    For B: c0 = c1 = c2 = c3 = c4 = 1, and index = 1 + 2 + 4 + 8 + 16 = 31, prob(0) = 6554. As we see this odd pixel of 0 among the 1s has a lower probability.

    For C: c1 = c2 = c3 = c4 = c5 = c7 = 1, and the index becomes 190. Like pixel A the prob(0) = 91, but its prob(1) = 65535 - 91 = 65444 out of 65535, which is as expected.
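The context index is the sum of 2^k over the template pixels c_k that are 1, and prob(1) is the complement of the tabulated prob(0) on a scale of 65535. A short sketch using the values quoted above; the function names are illustrative and the probability table itself is not reproduced.

```python
def context_index(set_pixels):
    """Index formed from the positions k of template pixels c_k equal to 1."""
    return sum(1 << k for k in set_pixels)

def prob_one(prob_zero, scale=65535):
    return scale - prob_zero

print(context_index([0, 2, 3]), prob_one(29789))            # pixel A
print(context_index([0, 1, 2, 3, 4]))                       # pixel B
print(context_index([1, 2, 3, 4, 5, 7]), prob_one(91))      # pixel C
```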

  2. The chain code is 0, 1, 0, 1, 7, 7. The differential chain code will be: 0, 1, -1, 1, -2, 0 with bits 1+2 + 3 + 2 + 5 + 1 = 14 bits
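The differential chain code is the difference between successive directions, wrapped into the range -3 to 4 for an eight-direction code. A minimal sketch; keeping the first direction unchanged as the first output symbol is an assumption that matches the answer above.

```python
def differential_chain(code, directions=8):
    """Differences between successive chain-code directions, wrapped."""
    half = directions // 2
    diffs = [code[0]]                      # first symbol kept as is (assumption)
    for prev, cur in zip(code, code[1:]):
        d = (cur - prev) % directions
        if d > half:
            d -= directions                # e.g. 6 becomes -2
        diffs.append(d)
    return diffs

print(differential_chain([0, 1, 0, 1, 7, 7]))
```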
  3. At level 2 the indices (without swapping) will be 0 for the two blank blocks, and index = 27 × 2 + 9 × 2 + 0 + 2 = 74 and index = 0 + 0 + 3 × 2 + 2 = 8 for the other two blocks. At the upper level the index is: index = 0 + 0 + 3 × 1 + 1 = 4
  4. The coefficients of the shape adaptive DCT will be

    127

    -40

    1

    10

    -15

     

    23

    -7

     

    -6

       

    confined to the top left corner, while for those of the normal DCT with padded zeros, the significant coefficients are scattered all over the 8 × 8 area.





Standard Codecs: Image Compression to Advanced Video Coding (IET Telecommunications Series)
ISBN: 0852967101
Year: 2005
Pages: 148
Authors: M. Ghanbari