General Concept

Polynomial Arithmetic and Galois Fields

In the previous section of this chapter, we mentioned that error-protected Reed-Solomon codes are based on the following math components: polynomial arithmetic and the arithmetic of Galois fields. We can't proceed any further until we cover both of these subjects in detail, so let's be patient before assaulting the peaks of higher math. After that, we'll be engaged in pure coding without any sophisticated math.

Polynomial Arithmetic

Section 4.6 of The Art of Computer Programming by Donald Knuth is dedicated to polynomial arithmetic. There, the following definition is given to this area of math: formally speaking, a polynomial over S is the expression u(x) = u_n*x^n + ... + u_1*x + u_0, where the coefficients u_n, ..., u_1, u_0 are elements of some algebraic system S, and the variable x can be considered a formal symbol without a determining value. Let us assume that the algebraic system S represents a commutative ring with 1. This means that S allows addition, subtraction, and multiplication operations satisfying the normal requirements: addition and multiplication are associative and commutative binary operations defined over S, and multiplication is distributive in relation to addition. There is also an identity element for addition, 0, and an identity element for multiplication, 1, such that a + 0 == a and a * 1 == a for every a belonging to S. Subtraction is the inverse operation in relation to addition; however, no assumptions have been made about division as an operation inverse to multiplication. The polynomial 0x^(n+m) + ... + 0x^(n+1) + u_n*x^n + ... + u_1*x + u_0 is considered identical to u_n*x^n + ... + u_1*x + u_0, although formally these polynomials are different.

Thus, instead of presenting the data word D, the codeword C, and the remainder from the division R as integer numbers (as we did earlier), we can relate them to the coefficients of a binary polynomial and carry out all further mathematical operations according to the rules of polynomial arithmetic. At first glance, the gain from this approach seems doubtful. However, let's not be hasty. Instead, let's turn a number (the first one that comes to mind; for instance, let it be 69h) into a binary polynomial. Start the built-in Windows application named Calculator, or any other suitable application of your choice, and convert this number into binary form (with a bit of skill, it is possible to carry out this operation mentally): 69h = 1101001.

Well, the rightmost coefficient is equal to 1, then there are two zeroes, followed by 1... briefly speaking, we get the following: 1*x^6 + 1*x^5 + 0*x^4 + 1*x^3 + 0*x^2 + 0*x + 1. In fact, the bit string 1101001 is just one way of writing the above-mentioned polynomial. Certainly, this form of data presentation might lack clarity from the beginner's point of view, but it is very convenient for machine processing. Stop. If 69h already represents a polynomial, then what is the difference between the addition of the polynomials 69h and 27h and the addition of the integer numbers 69h and 27h?! Undoubtedly, there is a difference. As Nietzsche has shown, there are no facts; all that exists are interpretations. Interpretations of numbers and polynomials are different, and mathematical operations over them are carried out according to different rules.

Coefficients in polynomial arithmetic are strictly typed, and the coefficient of x^k is of a different type than the coefficient of x^m (provided, of course, that k ≠ m). Operations over numbers of different types are not allowed! All coefficients are processed independently, and carries to the more significant bit or borrows from the more significant bit are not taken into account. Let us demonstrate this with the example of adding 69h and 27h:

Listing 2.11: Addition carried out according to the rules of polynomial binary arithmetic (left) and according to the rules of normal binary arithmetic (right)
 1101001 (69h)      1101001 (69h)
+0100111 (27h)     +0100111 (27h)
 -------           --------
 1001110 (4Eh)     10010000 (90h)

Simple calculations show that modulo-2 addition of polynomials gives the same result as their subtraction and coincides with the XOR operation. This match with XOR is no accident: bitwise addition without carry is, by definition, exactly what XOR computes. The equivalence of addition and subtraction, however, forces us to review the nature of things to which most of us are accustomed. For instance, let us recall arithmetic problems such as: "Mary has one apple. Pete takes it away from her. We give another apple to Mary. How many apples does Mary have? How many apples would she have had if Pete hadn't robbed her?" In modulo-2 arithmetic the answers would be one and zero, respectively. Yes! If Pete hadn't taken the apple from Mary, then 1 + 1 == 0 and the poor girl would not have any apples at all.

However, we are digressing. Let's forget the quarrels between the kids and return to our polynomials and their coefficients. Thanks to their strict typing and lack of mutual relations, we can process numbers of practically unlimited lengths simply by XORing their bits. This is one of the main advantages of polynomial arithmetic: unnoticeable at first glance, but it is the reason polynomial arithmetic became so widespread.

In our case, however, polynomial arithmetic alone is not sufficient. To implement a Reed-Solomon coder/decoder, we'll also need Galois fields. What are these, you ask?

Galois Fields

As long ago as the 1960s, when computers were iron-cast and 20-MB hard disks resembled washing machines, a beautiful legend was born about a green extra-terrestrial creature that came from the stars and recorded the entire Encyclopedia Britannica onto a thin, light-silvery rod, which the creature took with it as it left. These days, when 100-GB hard disks are no bigger than a pack of cigarettes, this density of information recording is no longer a wonder. However, the point of the legend is that the extra-terrestrial creature had mastered the technology of writing an infinite volume of information on an infinitely small medium; the Encyclopedia Britannica was chosen just as an example. The alien could have copied the contents of all Internet servers by placing only a thin mark on its metal rod. You don't believe it? You have objections? Let us convert the Encyclopedia Britannica into digital form and obtain a very large number. Then let us place a decimal point before it, thus converting this number into a very long decimal fraction. Now it only remains to find two numbers, A and B, such that the result of dividing A by B is exactly equal to this decimal fraction, with precision to the last digit. Writing these numbers to a metal rod can be carried out by placing a mark on the rod, dividing it into two sections whose lengths are multiples of the values A and B, respectively. To read the information from the rod, it is enough to measure the lengths of the A and B sections and then divide A by B. The first ten digits following the decimal point will be more or less precise, after which... practice will outdo the perfect theory, burying the latter under a thick cover of information garbage resulting from the impossibility of measuring the geometric size of real-world objects exactly.

In the digital world, the situation is even worse. Every programmer knows only too well that the division of integer and real numbers is subject to rather stringent limitations. First and foremost, division is a very resource-consuming operation with regard to processor time. As if that weren't enough, the operation is mathematically inexact! That is, if c = a * b, it doesn't necessarily follow that a == c/b! Thus, normal arithmetic is not suitable for a practical implementation of Reed-Solomon codes. Therefore, one has to resort to a special branch of mathematics: the mathematics of finite Galois fields.

Here, we interpret a group as a set of integer values sequentially numbered from 0 to 2^n - 1, for example: {0, 1, 2, 3} or {00h, 01h, 02h, 03h, 04h, 05h, 06h, 07h, 08h, 09h, 0Ah, 0Bh, 0Ch, 0Dh, 0Eh, 0Fh}. Groups containing 2^n elements are called Galois fields and are denoted as follows: GF(2^n).

Group members are invariably subject to the associative, commutative, and distributive laws. However, they are processed in a way that might, at first glance, seem unnatural:

  • The sum of any two group members is always present in this group.

  • For any group member, designated as a, there always exists an identity member, usually denoted as e, satisfying the following condition: a + e = e + a = a

  • For each member a of the group, there is an inverse member -a, such that a + (-a) = 0

Let us start with the first thesis. Does this seem like gibberish to you? Suppose that we have the following group: {0, 1, 2, 3}. Is it possible to obtain a number less than or equal to 3 when calculating the value 2 + 3?! As it turns out, addition in Galois fields is carried out without accounting for the carry, and the sum of two members of a Galois group is equal to: c = (a + b) % 2^n, where the % operation stands for calculating the remainder of the division. As applied to our case: (2 + 3) % 4 == 1. In mathematics, this operation is called modulo-4 addition, or congruence addition.

Naturally, the following question will likely interest you: does modulo addition find any practical application, or is it used only in abstract theoretical constructions? Good question. You mechanically carry out this operation many times each day without even noticing or thinking that this is exactly that operation: addition without taking the carry into account. For instance, suppose that you woke up at 6 p.m. and then worked on your computer for 9 hours without a pause. Then you accidentally look at your watch. Provided that the watch actually shows the exact time, what will the position of the hour hand be? Obviously, the required value represents the modulo-12 sum of 6 and 9, which is equal to: (6 + 9) % 12 = 3. Here is an illustrative example of Galois arithmetic in use. And now, as an experiment, let us subtract 6 from 3 (if you find this difficult, just look at your watch).

Now, let us proceed to the most important fact: since the result of dividing one group member by another (naturally, a non-zero one) must itself be present in the group, this operation will be precise despite being an integer division. The result will be exact rather than approximate! Consequently, if c = a * b, then a == c/b. In other words, multiplication and division are unambiguously and consistently defined for all group members, except for division by zero. At the same time, multiplication doesn't increase the bit width!

Naturally, this is not standard multiplication (and, of course, not in every Galois field does 2 * 2 = 4). However, Galois arithmetic need not correspond to common sense. The most important thing is that it works, and works rather well. The existence of hard disks and CD-ROM/DVD drives is the best confirmation of this fact, since they all use this arithmetic for purposes of one sort or another.

As was already mentioned, modulo-2 Galois fields have become the most widespread in computing. This is because these fields are the most natural from the machine-processing point of view, machine processing being binary by nature.

To implement a Reed-Solomon coder/decoder, we will need four basic arithmetic operations: addition, subtraction, multiplication, and division. These operations are covered in detail in the following few sections.

Addition and Subtraction in Galois Fields

Modulo-2 addition in Galois fields is identical to subtraction and is implemented by the bitwise XOR operation. This aspect was already discussed when studying polynomial arithmetic. Therefore, we will simply provide an example of a software implementation of the addition/subtraction function:

Listing 2.12: The function implementing addition/subtraction in Galois fields
// This function returns the result of modulo-2 addition (subtraction)
// of two polynomials a and b
int gf_sum(int a, int b)
{
    return a ^ b;
}

Multiplication in Galois Fields

Having opened an elementary-school math textbook, we read there that multiplication is the operation of addition repeated a number of times. Since we have already learned how to carry out addition in Galois fields, should we feel confident that the multiplication function won't cause us any serious difficulties? Unfortunately, no. I always knew that two multiplied by two equals four. However, I never believed this to be an absolute truth, and after encountering Galois fields for the first time, I understood how right I was [i]. As it turns out, there are other kinds of mathematics in which two multiplied by two is not equal to four, and multiplication is not defined through addition but, rather, is based on quite a different concept.

In fact, if we try to wrap the gf_sum function in a loop, we will get just the same addition, defined in a different way: a * b will be equal to a if b is odd, and to zero otherwise. Who needs this type of multiplication? Actually, the function implementing real Galois multiplication is so complicated and resource-consuming that, to simplify its implementation, the polynomials are first transformed into index form; the indexes are then added modulo GF - 1 (i.e., modulo 2^n - 1), after which the sum of the indexes is transformed back into polynomial form.

What is an index? The index is the exponent of the power of two that produces the required polynomial. For instance, the index of polynomial 8 is equal to 3 (2^3 = 8), while the index of polynomial 2 is equal to 1 (2^1 = 2). It is quite easy to show that a * b = 2^i * 2^j = 2^(i+j). In particular, 2 * 8 = 2^1 * 2^3 = 2^(1+3) = 2^4 = 16. Now let us compose the following table and carry out some experiments with it:

Table 2.2: Polynomials (left column) and the powers of two corresponding to them (right column)

   i    index_of[i]
  001        0
  002        1
  003        -
  004        2
  005        -
  006        -
  007        -
  008        3
  009        -
  010        -
  011        -
  012        -
  013        -
  014        -
  015        -
  016        4

Up to this moment, we have operated with the concepts of customary arithmetic. That is why about two thirds of the table fields remained blank: equations such as 2^x = 3 have no integer solutions, so some polynomials do not correspond to any index! However, since the number of polynomials in any Galois field is equal to the number of possible indexes, we can map them to one another in a specifically predefined way, without paying any attention to the fact that this action makes no sense from the customary mathematical point of view. The specific mapping scheme can be chosen arbitrarily. The only thing that matters is that the mapping must be consistent, i.e., it must satisfy all the above-listed requirements of groups (see "Galois Fields").

Naturally, since the final result directly depends on the chosen mapping scheme, both parties (the Reed-Solomon coder and decoder) must observe specific agreements. Different Reed-Solomon coders/decoders can, however, use different, mutually incompatible mapping schemes.

In particular, the Reed-Solomon decoder built into CD-ROM drives carries out multiplication according to the table provided in Listing 2.13. Having encountered such a table in the disassembled listing of a program being investigated, you'll be able to quickly and reliably identify all functions that use it.

Listing 2.13: Look-up table for GF(256). The columns designated i specify the polynomial/index number; the columns designated alpha contain the powers of the primitive polynomial 2 (i.e., alpha_to[i]); the columns designated index contain the index corresponding to polynomial i (i.e., index_of[i])
  i  alpha index    i  alpha index    i  alpha index
000   001   -1   047   035   69   094   113   70
001   002    0   048   070   29   095   226   64
002   004    1   049   140  181   096   217   30
003   008   25   050   005  194   097   175   66
004   016    2   051   010  125   098   067  182
005   032   50   052   020  106   099   134  163
006   064   26   053   040   39   100   017  195
007   128  198   054   080  249   101   034   72
008   029    3   055   160  185   102   068  126
009   058  223   056   093  201   103   136  110
010   116   51   057   186  154   104   013  107
011   232  238   058   105    9   105   026   58
012   205   27   059   210  120   106   052   40
013   135  104   060   185   77   107   104   84
014   019  199   061   111  228   108   208  250
015   038   75   062   222  114   109   189  133
016   076    4   063   161  166   110   103  186
017   152  100   064   095    6   111   206   61
018   045  224   065   190  191   112   129  202
019   090   14   066   097  139   113   031   94
020   180   52   067   194   98   114   062  155
021   117  141   068   153  102   115   124  159
022   234  239   069   047  221   116   248   10
023   201  129   070   094   48   117   237   21
024   143   28   071   188  253   118   199  121
025   003  193   072   101  226   119   147   43
026   006  105   073   202  152   120   059   78
027   012  248   074   137   37   121   118  212
028   024  200   075   015  179   122   236  229
029   048    8   076   030   16   123   197  172
030   096   76   077   060  145   124   151  115
031   192  113   078   120   34   125   051  243
032   157    5   079   240  136   126   102  167
033   039  138   080   253   54   127   204   87
034   078  101   081   231  208   128   133    7
035   156   47   082   211  148   129   023  112
036   037  225   083   187  206   130   046  192
037   074   36   084   107  143   131   092  247
038   148   15   085   214  150   132   184  140
039   053   33   086   177  219   133   109  128
040   106   53   087   127  189   134   218   99
041   212  147   088   254  241   135   169   13
042   181  142   089   225  210   136   079  103
043   119  218   090   223   19   137   158   74
044   238  240   091   163   92   138   033  222
045   193   18   092   091  131   139   066  237
046   159  130   093   182   56   140   132   49
141   021  197   180   150   20   219   086  177
142   042  254   181   049   42   220   172  187
143   084   24   182   098   93   221   069  204
144   168  227   183   196  158   222   138   62
145   077  165   184   149  132   223   009   90
146   154  153   185   055   60   224   018  203
147   041  119   186   110   57   225   036   89
148   082   38   187   220   83   226   072   95
149   164  184   188   165   71   227   144  176
150   085  180   189   087  109   228   061  156
151   170  124   190   174   65   229   122  169
152   073   17   191   065  162   230   244  160
153   146   68   192   130   31   231   245   81
154   057  146   193   025   45   232   247   11
155   114  217   194   050   67   233   243  245
156   228   35   195   100  216   234   251   22
157   213   32   196   200  183   235   235  235
158   183  137   197   141  123   236   203  122
159   115   46   198   007  164   237   139  117
160   230   55   199   014  118   238   011   44
161   209   63   200   028  196   239   022  215
162   191  209   201   056   23   240   044   79
163   099   91   202   112   73   241   088  174
164   198  149   203   224  236   242   176  213
165   145  188   204   221  127   243   125  233
166   063  207   205   167   12   244   250  230
167   126  205   206   083  111   245   233  231
168   252  144   207   166  246   246   207  173
169   229  135   208   081  108   247   131  232
170   215  151   209   162  161   248   027  116
171   179  178   210   089   59   249   054  214
172   123  220   211   178   82   250   108  244
173   246  252   212   121   41   251   216  234
174   241  190   213   242  157   252   173  168
175   255   97   214   249   85   253   071   80
176   227  242   215   239  170   254   142   88
177   219   86   216   195  251   255   000  175
178   171  211   217   155   96
179   075  171   218   043  134

Using this table, you can easily transform a polynomial into index form and back. How should you use it? Let us assume that we need to multiply the polynomials 69 and 96. Find the number 69 in column i; the corresponding index value is 221. Memorize it (or write it down) and proceed with the number 96, whose index is equal to 30. Now carry out a modulo-255 addition of the indexes: (221 + 30) % 255 = 251. It remains to transform the result from index form back into polynomial form: find the number 251 in column i; the corresponding alpha value, 216, is the product. (If we carry out the inverse operation, dividing 216 by 96, we get 69 back, which confirms the consistency of the multiplication and division operations, and of Galois arithmetic as a whole.) This is rather fast, although sometimes it is unclear why the table is composed this way rather than some other. The worst thing is that the trustworthiness of the result cannot be felt intuitively, because all of these concepts are pure abstraction. This significantly complicates program debugging (it is rather difficult to debug anything whose operating principle you don't fully understand).

Still, it is not necessary to enter the multiplication table manually from the keyboard. It is possible to generate it automatically, on the fly, at program execution time. A possible implementation of such a generator looks as follows:

Listing 2.14: The procedure for generating the look-up tables for fast polynomial multiplication
#define m 8      // The exponent of the RS polynomial
                 // (equal to 8, according to the ECMA-130 standard)
#define n 255    // n = 2**m - 1 (length of the codeword)
#define t 1      // number of errors to be corrected
#define k 253    // k = n - 2*t (length of the data word)

// Prime generator polynomial
// according to ECMA-130: P(x) = x^8 + x^4 + x^3 + x^2 + 1
int p[m+1] = {1, 0, 1, 1, 1, 0, 0, 0, 1};

int alpha_to[n+1];   // table of powers of the primitive member
int index_of[n+1];   // index table for fast multiplication

//--------------------------------------------------------------------
// Generating the look-up tables for fast multiplication in GF(2^m)
// on the basis of the prime generator polynomial from p[0] to p[m].
//
// Look-up tables:
//     index -> polynomial form: alpha_to[] contains j = alpha^i,
//         where alpha is the primitive member, usually equal to 2,
//         and ^ is the operation of raising to a power (not XOR!);
//     polynomial -> index form: index_of[j = alpha^i] = i;
//
// Simon Rockliff
//--------------------------------------------------------------------
generate_gf()
{
    int i, mask;

    mask = 1; alpha_to[m] = 0;
    for (i = 0; i < m; i++)
    {
        alpha_to[i] = mask;
        index_of[alpha_to[i]] = i;
        if (p[i] != 0) alpha_to[m] ^= mask;
        mask <<= 1;
    }
    index_of[alpha_to[m]] = m; mask >>= 1;
    for (i = m + 1; i < n; i++)
    {
        if (alpha_to[i-1] >= mask)
            alpha_to[i] = alpha_to[m] ^ ((alpha_to[i-1] ^ mask) << 1);
        else
            alpha_to[i] = alpha_to[i-1] << 1;
        index_of[alpha_to[i]] = i;
    }
    index_of[0] = -1;
}

The multiplication function itself looks rather trivial; in fact, it fits within five lines of code. In most software implementations of a Reed-Solomon coder/decoder that I have seen, this operation is not even implemented as a separate procedure but, rather, is carried out directly where it is called.

Listing 2.15: A function of fast multiplication in Galois Fields using a table
// The function returns the result of multiplication
// of two polynomials in Galois fields
int gf_mul(int a, int b)
{
    int sum;

    if (a == 0 || b == 0) return 0;    // Some optimization
                                       // won't hurt.
    sum = index_of[a] + index_of[b];   // Calculating the sum
                                       // of the polynomial indexes
    if (sum >= GF - 1) sum -= GF - 1;  // Bringing the sum into
                                       // the range of the field
                                       // (GF == 256)
    return alpha_to[sum];              // Transforming the result
                                       // into polynomial form
                                       // and returning it
}

Division in Galois Fields

Division in Galois fields is carried out in practically the same way as multiplication, with the only exception being that the indexes are not added but, rather, subtracted from one another. Indeed, a/b == 2^i / 2^j == 2^(i-j). The above-provided look-up table can be used to transform from polynomial to index form and back.

Naturally, do not forget that, however perverse Galois fields might seem, even abstract arithmetic doesn't allow division by zero. Therefore, the division function must be supplied with an appropriate check.

Listing 2.16: The function of fast division of polynomials in Galois fields
// This function returns the result of division of two
// polynomials, a and b, in Galois fields.
// In case of an attempt to divide by zero, the function
// returns -1.
int gf_div(int a, int b)
{
    int diff;

    if (a == 0) return 0;              // Some optimization
                                       // won't hurt.
    if (b == 0) return -1;             // Division by zero
                                       // is not allowed!
    diff = index_of[a] - index_of[b];  // Calculating the
                                       // difference of the indexes
    if (diff < 0) diff += GF - 1;      // Bringing the difference
                                       // into the range of the
                                       // field (GF == 256)
    return alpha_to[diff];             // Transforming the result
                                       // into polynomial form
                                       // and returning it
}

Simplest Practical Implementations

Ancient hard disk models developed by IBM provide a good example of a Reed-Solomon coder/decoder implementation. The IBM 3370 model had a very simple and illustrative Reed-Solomon coder/decoder of the (174, 171) type in the Galois field GF(256). In other words, this coder/decoder operated with 8-bit cells (2^8 = 256), and there were 3 bytes of checksum per 171 data bytes, producing a codeword 174 bytes long. As we will see later, all three checksum bytes were calculated absolutely independently of one another, so the Reed-Solomon coder/decoder operated with only one byte at a time, which significantly simplified its architecture.

In contemporary hard disks, the Reed-Solomon coder/decoder has become significantly more complicated, and the number of check bytes has grown immensely. As a result, implementations have to operate with numbers of unnatural lengths (about 1,408 bits or more), and the source code bristles with a thick layer of additional checks, loops, and functions that significantly complicate its understanding. Furthermore, most manufacturers of computer hardware have recently migrated to hardware implementations of Reed-Solomon coders/decoders, realized entirely by a single chip. In other words, with all due respect for technological progress, it is much better to study the basic operating principles of Reed-Solomon coders/decoders using the examples of ancient models.

The listing provided below shows a fragment of the original code of the firmware for the IBM 3370 hard disk:

Listing 2.17: The key fragment of the Reed-Solomon coder/decoder from the original code of the IBM 3370 hard disk firmware
for (s0 = s1 = sm1 = i = 0; i < BLOCK_SIZE; ++i)
{
    s0  =                           s0 ^ input[i];
    s1  =         GF_mult_by_alpha[ s1 ^ input[i]];
    sm1 = GF_mult_by_alpha_inverse[sm1 ^ input[i]];
}
Listing 2.18: The key fragment of the Reed-Solomon coder/decoder from the IBM 3370 hard disk firmware
err_i = GF_log_base_alpha[GF_divide[s1][s0]];  // Calculating the
                                               // error syndrome
input[err_i] ^= s0;                            // Correcting the
                                               // erroneous byte

Can we understand how this works? With regard to the s0 variable, everything is clear: it stores the checksum calculated according to the trivial algorithm. As you probably remember, addition in Galois fields is carried out by the bitwise XOR operation; therefore, effectively, s0 ^= input[i].

The goal of the s1 variable is more difficult to clarify. To understand the idea behind its transformations, we must know the contents of the GF_mult_by_alpha table. Although, for the sake of brevity, this table is not provided here, its purpose is easy to guess from its speaking name: the contents of s1 are added to the next byte of the data flow being checked, and the result is then multiplied by the so-called primitive member, designated as alpha and equal to 2. In other words, s1 = 2 * (s1 + input[i]).

Now, let us assume that one of the bytes of the data flow becomes corrupted (let us designate its position as err_i). Then the position of the corrupted byte can be determined by the trivial division of s1 by s0. Why is this the case? In fact, s1 = 2 * (s1 + input[i]) is nothing other than the multiplication of the data word by a polynomial generated dynamically on the basis of the primitive member alpha. The checksum of the data word stored in the s0 variable actually represents the same data word, simply written in a more compact form. As already mentioned, if an error occurred in position x, then the remainder of dividing the codeword by the generator polynomial will be equal to k = 2^x. It only remains to calculate x from the given k, which in our case is carried out by looking it up in the GF_log_base_alpha table, which stores x for every k = 2^x. Once the position of the erroneous byte has been determined, it can be corrected by XORing it with the calculated checksum s0 (input[err_i] ^= s0). Naturally, this is true only for single errors; the above-described algorithm is incapable of correcting errors comprising two or more corrupted bytes per data block. In fact, the third checksum byte, sm1, is present for exactly this purpose: it protects the decoder from incorrect attempts at error correction, i.e., it prevents error correction when the number of errors exceeds 1. If the expression s1/s0 == sm1 * s0 turns out to be false, the disk controller can record the fact of multiple errors, but must report that these errors cannot be recovered.

As is well known, defects of a magnetic surface tend to produce group errors rather than single ones. To compensate for this weakness of the error-correction algorithm, IBM engineers resorted to byte interleaving. The IBM 3370 hard disk had an interleave factor of 3:1, which means that the first byte of the first block goes first, followed by the first byte of the second block, then by the first byte of the third block, and only after that by the second byte of the first block. This trick increased the corrective capability of the hard disk from a single corrupted byte to a sequence of three corrupted bytes. However, if corrupted bytes were not directly adjacent to one another, the corrective capability once again dropped down to one corrupted byte per block; the probability of such an event, though, was significantly lower.

Naturally, this kind of algorithm can be implemented not only in hard disks. By varying the block size and the interleave factor, you can trade protection strength against information redundancy. Suppose that we have N sectors on the disk. Then, having divided them into blocks containing 174 sectors each and having dedicated 3 sectors per block to storing the checksum, we will be able to restore at least N/174 disk sectors. For an average disk capacity of 100 GB (which corresponds to 209,715,200 sectors), we will be able to restore up to 1,205,259 sectors, even in the case of their total physical destruction, while only about 2 percent of disk space is reserved for storing checksums. Most people would agree that situations in which the destruction of a hard disk is so fast that the correcting capability of the Reed-Solomon code proves insufficient for restoring information are quite rare (provided, of course, that incipient disk corruption is noticed in due time, and that the interleave factor is chosen correctly, so that sectors belonging to the same platter of the hard disk are served by different correcting blocks; otherwise, corruption of the surface of one platter results in a group error that cannot be recovered by this algorithm).

What should we do if the entire hard disk is damaged? In this case, the most reasonable approach is to create an array of several disks storing useful information mixed with correcting codes. The main drawback of such an approach is its inefficiency with arrays comprising a small number of hard disks. A reasonable minimum is four data disks plus one disk for check information. In this case, the loss of any single data disk is compensated by the check disk that remains functional, while a damaged check disk can easily be replaced by a new one and all check codes recalculated. However, if two disks fail simultaneously, it is a catastrophe. An array containing 15 disks, of which 12 are data disks and 3 are reserved for check codes, provides a significantly higher level of fault tolerance: it tolerates simultaneous failure of any two disks, and under favorable circumstances even simultaneous failure of three disks can be overcome.

Actually, there is nothing new in all this information. Appropriate RAID controllers are available in practically any computer store. However, I can hardly imagine the cost of a level-15 RAID controller, and I am not sure that it could be made to work (in my own experience, even basic-level RAID controllers are extremely error-prone and whimsical, with very high requirements for both the hardware and the OS environment). Finally, practically all RAID controllers require either absolutely identical disks or, at the very least, disks closely matched in characteristics and/or interfaces. What if such disks are not available?

The software implementation of RAID actively promoted by the author of this book is free from all of the above-listed drawbacks. You can use disks of different geometry and even of different capacity, and nothing requires you to concentrate them in a single location; on the contrary, disks can even be accessed via the network. Furthermore, it is not necessary to reserve an entire disk for RAID storage: you can dedicate an arbitrary part of the disk space for this purpose.

How can we use this in practice? The first idea that comes to mind is to use part of the hard disk space for storing redundant information that will allow the disk contents to be restored in the event of a failure. If several computers are joined into a network, then, at the cost of relatively low overhead, we will be able to restore the hard disk of any computer connected to the network, even if its information is totally destroyed, thanks solely to the redundant information distributed among all of the other computers. It would be hard to conceive of a more reliable system for information storage! The author has implemented this scheme in the LANs of several enterprises, where it has proven its survivability, flexibility, and wide range of functional capabilities. The need to constantly back up the hard disk contents was automatically eliminated, which is especially valuable in peer-to-peer networks, where there is no dedicated server. Such networks do exist, and they are not rare. (No, I do not promote the development of this type of network or claim that it is a good solution; I am simply noting that they exist and are not going to disappear entirely in the near future.)

The only drawback of a software RAID implementation is its low performance. In particular, by installing software RAID on a server processing thousands of queries per second and intensively modifying a large number of files, you won't gain performance. But the concept of performance is rather relative: with a sufficiently fast processor, it is possible to encode and decode information on the fly without any loss of throughput! On the other hand, if read operations dominate over write operations, installing software RAID is a natural solution, since integrity control of the information being read is carried out at the hardware level of the drive itself. Consequently, when systematic encoding is used (i.e., the data words are kept separate from the parity bytes), the Reed-Solomon decoder doesn't need to interfere with this process; its help is needed only when part of the information is badly damaged, which realistically doesn't happen often. Thus, it is not worth paying a lot to firms specializing in hardware RAID implementations, especially since they don't give home users and small businesses the attention they deserve.

[i] In other words, when I press the light switch, I know that the light will come on; however, I am not certain of it (suppose the electrician has cut the wires, or the bulb has burned out, etc.). The same is true of math. The heap of rubbish that most of us are fed in secondary school, or even in college, is not math. It is a set of voodoo rites that students must repeat to obtain a result, but which do not allow them to grasp the underlying ideas, the Zen of math. I don't know whether this is for the better, but I feel it my duty to remind you that the math taught in many educational institutions has no more relation to real math than programming has to torturing the mouse while struggling with Microsoft Word or installing Windows.



CD Cracking Uncovered. Protection against Unsanctioned CD Copying
ISBN: 1931769338
Year: 2003
