
Reed-Solomon codes

In the document The DART-Europe E-theses Portal (Pages 69-75)

Reed–Solomon (RS) codes are block-based error-correcting codes that were introduced by Irving S. Reed and Gustave Solomon in 1960 [84]. They are applied in all areas requiring reliable data; the most prominent applications are in digital communications and storage. These codes were designed for noisy channels, i.e. channels where the transmitted data is modified during transmission. Reed-Solomon codes can also be used to correct bursts of errors, a burst being a series of bits of the codeword that are received in error.

RS codes are linear block codes and belong to the family of BCH (Bose, Ray-Chaudhuri and Hocquenghem) codes, a class of cyclic error-correcting codes constructed using finite fields. For more details see [46], [12], [10], [65].

An RS code is a cyclic linear block code over the finite field Fq of length n, with n a divisor of q − 1. When n = q − 1 the RS code is said to be primitive. Reed-Solomon codes are maximum distance separable (MDS) codes, as they attain the Singleton bound in Definition 1.8. In the sequel of this thesis, we only consider non-trivial non-binary RS codes where q > n > 2.

Let ν be the number of errors and µ the number of erasures that occurred during the transmission. In the presence of errors only, any pattern of ν errors can be decoded, provided that

ν ≤ ⌊(d−1)/2⌋. (1.6)

Similarly, in the presence of erasures only, any pattern of µ erasures can be decoded, provided that

µ ≤ d − 1. (1.7)

Fanny Jardel 10

In the presence of both errors and erasures, the condition for successful decoding is:

2ν + µ ≤ d − 1. (1.8)
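Condition (1.8) subsumes the two previous ones (set µ = 0 or ν = 0). As a quick sanity check, a minimal sketch (the helper function name is made up for this illustration):

```python
# Decoding capability of an RS code with minimum distance d:
# nu errors and mu erasures are correctable when 2*nu + mu <= d - 1.
def correctable(d: int, nu: int, mu: int) -> bool:
    return 2 * nu + mu <= d - 1

# Example: a [255, 223] RS code has d = n - k + 1 = 33, so it corrects up
# to 16 errors, up to 32 erasures, or mixtures such as 10 errors + 12 erasures.
```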

The following list sums up several advantages of Reed-Solomon codes:

• RS codes are useful when a code of length less than the size of the field is required.

• They have the highest possible minimum distance, i.e. they are MDS.

• They can correct bursts of errors.

• They are convenient for constructing other codes, as we shall see in Section 1.4.

Reed-Solomon encoder

The BCH view: The codeword as a sequence of coefficients

Consider an [n, k, d]q Reed-Solomon code. A codeword c(x) is generated using a special polynomial called the generator polynomial, denoted g(x). A word is a valid codeword if and only if it is divisible by the generator polynomial. Let α be a primitive element of Fq. The general form of the generator polynomial is given by:

g(x) = (x − α^b)(x − α^{b+1}) ··· (x − α^{b+d−2}). (1.9)

Usually, but not always, b = 1.

A Reed-Solomon encoder can be systematic or non-systematic.

Definition 1.9. Let C be an [n, k, d]q linear or non-linear code. The encoder is called systematic if the information data is embedded in the encoded output. A non-systematic encoder scrambles the message, i.e. the encoded message does not contain the input symbols.

In the case of a non-systematic encoder, further computations are necessary to recover the information data.

A non-systematic codeword is constructed using:

c(x) = g(x) i(x), (1.10)

where i(x) is the information polynomial. In the case of a systematic codeword, the previous equation is modified as follows:

c(x) = i(x) x^{n−k} + [i(x) x^{n−k}] mod g(x), (1.11)

where [i(x) x^{n−k}] mod g(x) denotes the parity polynomial.
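The systematic construction can be sketched over a small prime field, where all arithmetic is ordinary integer arithmetic mod p. The parameters below ([6, 2, 5] code over F_7, α = 3) and all helper names are made up for this illustration; note that over a general non-binary field the remainder must be negated so that g(x) divides c(x) (over F_{2^m}, addition and subtraction coincide, which is why (1.11) is usually written with a plus sign).

```python
# Illustrative systematic RS encoder over F_7: n = q - 1 = 6, k = 2,
# hence d = n - k + 1 = 5, and alpha = 3 is a primitive element of F_7.
p, alpha, n, k = 7, 3, 6, 2
d = n - k + 1

def poly_mul(a, b):                  # coefficient lists, lowest degree first
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_mod(a, g):                  # remainder of a(x) divided by monic g(x)
    a = a[:]
    for i in range(len(a) - 1, len(g) - 2, -1):
        coef = a[i]
        if coef:
            for j in range(len(g)):
                a[i - (len(g) - 1) + j] = (a[i - (len(g) - 1) + j] - coef * g[j]) % p
    return a[:len(g) - 1]

def poly_eval(poly, x):
    return sum(c * pow(x, j, p) for j, c in enumerate(poly)) % p

# Generator polynomial g(x) = (x - alpha)...(x - alpha^(d-1)), i.e. (1.9) with b = 1.
g = [1]
for i in range(1, d):
    g = poly_mul(g, [(-pow(alpha, i, p)) % p, 1])

def encode_systematic(msg):
    # c(x) = i(x) x^(n-k) - [i(x) x^(n-k) mod g(x)]; the minus sign is the
    # general-field form of the plus sign in Equation (1.11).
    shifted = [0] * (n - k) + list(msg)
    parity = [(-r) % p for r in poly_mod(shifted, g)]
    return parity + list(msg)

codeword = encode_systematic([1, 2])
# The message symbols appear verbatim in the last k positions, and the
# codeword vanishes at alpha, alpha^2, ..., alpha^(d-1).
```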

Fanny Jardel 11

Reed & Solomon’s original view: The codeword as a sequence of values

This method is the original one proposed by Reed and Solomon. Let i = (i_0, i_1, ..., i_{k−1}), i_j ∈ Fq, be the message to be encoded, and define the polynomial

i(z) = Σ_{j=0}^{k−1} i_j z^j. (1.12)

The codeword is obtained by evaluating i at n distinct points α^0, α^1, ..., α^{n−1} of the finite field Fq. Thus,

c = (i(1), i(α), ..., i(α^{n−1})). (1.13)

One shows that c is an RS codeword by verifying that c(x) = Σ_{j=0}^{n−1} c_j x^j has α, α², ..., α^{n−k} as zeros.
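The evaluation view can be sketched with the same illustrative parameters as before ([6, 2, 5] code over F_7, α = 3, made up for this example):

```python
# Reed & Solomon's original evaluation encoder over F_7.
p, alpha, n, k = 7, 3, 6, 2

def poly_eval(poly, x):              # coefficient list, lowest degree first
    return sum(c * pow(x, j, p) for j, c in enumerate(poly)) % p

msg = [1, 2]                         # i(z) = 1 + 2z, degree < k
codeword = [poly_eval(msg, pow(alpha, j, p)) for j in range(n)]

# Read as a polynomial c(x), the evaluation codeword vanishes at
# alpha, alpha^2, ..., alpha^(n-k) -- the zeros of the BCH view.
assert all(poly_eval(codeword, pow(alpha, j, p)) == 0 for j in range(1, n - k + 1))
```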

Reed-Solomon decoder

Another advantage of RS codes is the ease with which they can be decoded, namely, via an algebraic method known as syndrome decoding.

Consider an RS code of parameters [n, k, d]q. Let c(x) be the transmitted codeword and r(x) the received word. For an error-only channel, the received word can be expressed as

r(x) =c(x) +e(x), (1.14)

where e(x) is an unknown error polynomial:

e(x) = e_{n−1} x^{n−1} + e_{n−2} x^{n−2} + ... + e_1 x + e_0, (1.15)

where at most ⌊(d−1)/2⌋ coefficients are nonzero. Suppose that ν errors occur at unknown positions i_1, i_2, ..., i_ν; then e(x) can be written

e(x) = e_{i_1} x^{i_1} + e_{i_2} x^{i_2} + ... + e_{i_ν} x^{i_ν}, (1.16)

where e_{i_l} is the unknown magnitude of the l-th error. Note that the number ν of errors is also unknown. A Reed-Solomon decoder attempts to identify the number of errors, their positions and their values (magnitudes). From these, e(x) can be reconstructed and subtracted from r(x) to recover the original codeword c(x).

RS codes can be decoded either in the time domain or in the frequency domain. Decoding in the frequency domain, using Fourier transform techniques, can offer computational and implementation advantages. For this reason, the decoding algorithms in the sequel are presented in the frequency domain, and they are considered for channels with algebraic errors, which is the most general case.

For erasure-only channels, the positions of the erased symbols are known; thus, the erasure-locator polynomial and its roots are already determined. In this case, one way to decode is to guess the erased symbols and then apply any error-correction procedure. Reed-Solomon codes over erasure-only channels can also be decoded via a Maximum-Likelihood (ML) decoder, a non-iterative decoder based on a Gaussian reduction of the parity-check matrix of the RS code.

For a channel producing both errors and erasures, the procedure is similar to the errors-only one, but erasure filling is incorporated into the algorithm. The reader may refer to Chapter 7 of [10] for more details about these algorithms.

In the presence of errors, the most common decoders follow this general outline:

1. Calculate the syndromes Sj for the received vector.

2. Determine the number of errors and the error-locator polynomial Λ(x).

3. Calculate the roots of the error-locator polynomial to find the error locations.

4. Determine the error-evaluator polynomial Γ(x) and the error values.

5. Correct the errors.

Calculate the syndromes

The syndrome values are formed by evaluating r(x) at the roots of the generator polynomial, α, α², ..., α^{d−1}. Thus the syndromes are

S_j = r(α^j) = c(α^j) + e(α^j) = e(α^j), j = 1, ..., d−1. (1.17)

If there is no error, S_j = 0 for all j = 1, ..., d−1. If the syndromes are all zero, the decoding is complete.

We define the l-th error value by Y_l = e_{i_l}, and the l-th error-location number by X_l = α^{i_l}, for l = 1, ..., ν. With these notations, the syndrome values are given by

S_1 = Y_1 X_1 + Y_2 X_2 + ... + Y_ν X_ν
S_2 = Y_1 X_1² + Y_2 X_2² + ... + Y_ν X_ν²
⋮
S_{d−1} = Y_1 X_1^{d−1} + Y_2 X_2^{d−1} + ... + Y_ν X_ν^{d−1}

The decoding problem is now reduced to solving a system of non-linear equations, whose solution is unique. This system is too difficult to solve directly; therefore a process called locator decoding is used instead.
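Syndrome computation itself is a handful of polynomial evaluations. The sketch below uses an illustrative [6, 2, 5] RS code over F_7 (α = 3); the codeword and the single-error pattern are made up for the example:

```python
# Syndromes S_j = r(alpha^j), j = 1..d-1, for an illustrative [6, 2, 5]
# RS code over F_7 with alpha = 3.
p, alpha, n, d = 7, 3, 6, 5

def poly_eval(poly, x):              # coefficient list, lowest degree first
    return sum(c * pow(x, j, p) for j, c in enumerate(poly)) % p

c = [5, 0, 6, 3, 1, 2]               # a valid codeword: c(alpha^j) = 0
r = c[:]
r[2] = (r[2] + 4) % p                # one error: magnitude 4 at position 2

syndromes = [poly_eval(r, pow(alpha, j, p)) for j in range(1, d)]
# All-zero syndromes would mean "no error detected"; here they are nonzero.
```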

Find the symbol error locations

This step consists in determining the error-locator polynomial Λ(x) in order to find the error locations. Λ(x) is defined as

Λ(x) = Π_{l=1}^{ν} (1 − x X_l) = 1 + Λ_1 x + Λ_2 x² + ... + Λ_ν x^ν. (1.18)


The zeros of Λ(x) are the reciprocals X_l^{−1}:

Λ(X_l^{−1}) = 0. (1.19)

If the coefficients of Λ(x) are known, then we can find its zeros to obtain the error locations.

We multiply Equation (1.18) by Y_l X_l^{j+ν} for j = 1, ..., d−1 and set x = X_l^{−1}:

Y_l (X_l^{j+ν} + Λ_1 X_l^{j+ν−1} + ... + Λ_ν X_l^j) = 0. (1.20)

Equation (1.20) holds for every l. Summing these equations from l = 1 to l = ν, we have, for each j = 1, ..., d−1:

Σ_{l=1}^{ν} Y_l (X_l^{j+ν} + Λ_1 X_l^{j+ν−1} + ... + Λ_ν X_l^j) = 0, (1.21)

from which we can deduce:

S_{j+ν} + Λ_1 S_{j+ν−1} + Λ_2 S_{j+ν−2} + ... + Λ_ν S_j = 0. (1.22)

Finally, as ν ≤ ⌊(d−1)/2⌋, if 1 ≤ j ≤ d−1−ν the subscripts always specify known syndromes.

The set of linear equations relating the syndromes to the coefficients of Λ(x) is:

Λ_1 S_{j+ν−1} + Λ_2 S_{j+ν−2} + ... + Λ_ν S_j = −S_{j+ν}, j = 1, ..., d−1−ν. (1.23)

This system can be written as

S_k = − Σ_{j=1}^{ν} Λ_j S_{k−j}, k = ν+1, ..., d−1. (1.24)

The connection weights (Λ_1, Λ_2, ..., Λ_ν) for the smallest ν ≤ ⌊(d−1)/2⌋ for which the system (1.24) has a solution can be found with different algorithms. The best known are:

• the Peterson algorithm;

• the Berlekamp–Massey algorithm;

both explained in Chapters 6 and 7 of [10]. The Peterson algorithm has a complexity proportional to ν³ due to the inversion of a ν × ν matrix. Consequently, when ν is large, another method should be used to find the polynomial Λ. The Berlekamp–Massey algorithm is such an alternative procedure. It solves a modified version of the system of equations (1.24), in which the system has to be solved for the smallest ν ≤ d−1. The algorithm is iterative and builds the minimum-length linear recursion (Λ(x), ν). At each iteration r = 1, ..., d−1, a discrepancy based on the current instance Λ^{(r)}(x), with an assumed number of errors L_r, is calculated. The discrepancy polynomial is defined as

∆(x) = Λ(x) S(x). (1.25)

At iteration r, in order to compute the shortest linear recursion (Λ^{(r)}, L_r), the earlier iterations enter through the discrepancy

∆_r = Σ_{j=0}^{L_{r−1}} Λ_j^{(r−1)} S_{r−j}, (1.26)

with the convention Λ_0^{(r−1)} = 1.

Then Λ^{(r)}(x) and L_r are adjusted until ∆_r reaches zero, and the next iteration is computed, until r = d−1.
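The iteration can be written compactly over a prime field. This is an illustrative sketch over F_7 (integer arithmetic mod 7), not the thesis's implementation; the syndrome values fed to it come from a made-up single-error example (magnitude 4 at position 2 of a [6, 2, 5] code with α = 3):

```python
# Berlekamp-Massey over F_p: finds the shortest linear recursion
# (Lambda(x), L) generating the syndrome sequence S. Illustrative sketch.
def berlekamp_massey(S, p):
    C, B = [1], [1]          # current / previous connection polynomials
    L, m, b = 0, 1, 1        # L: assumed error count, b: last nonzero discrepancy
    for r in range(len(S)):
        # discrepancy Delta_r = sum_{j=0}^{L} C_j S_{r-j}  (C_0 = 1)
        delta = sum(C[j] * S[r - j] for j in range(L + 1)) % p
        if delta == 0:                        # recursion still consistent
            m += 1
            continue
        T = C[:]
        coef = delta * pow(b, p - 2, p) % p   # delta / b in F_p
        if len(B) + m > len(C):
            C += [0] * (len(B) + m - len(C))
        for j, Bj in enumerate(B):            # C <- C - (delta/b) x^m B
            C[j + m] = (C[j + m] - coef * Bj) % p
        if 2 * L <= r:                        # recursion too short: lengthen it
            L, B, b, m = r + 1 - L, T, delta, 1
        else:
            m += 1
    return C, L

# Syndromes [1, 2, 4, 1] (one error of magnitude 4 at position 2, F_7,
# alpha = 3) yield Lambda(x) = 1 + 5x = 1 - alpha^2 x, as expected.
Lam, num_errors = berlekamp_massey([1, 2, 4, 1], 7)
```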

At this point of the procedure the polynomial Λ(x) is known, and we have to determine its roots in order to find the error positions. Because there is only a finite number of field elements to check, the simplest way to find the roots is by trial and error, for example by using the Chien search algorithm [10]. At the end of this step the error-locator polynomial can be written as:

Λ(x) = (1 − α^{i_1} x)(1 − α^{i_2} x) ··· (1 − α^{i_ν} x), (1.27)

where the error positions i_1, i_2, ..., i_ν are now known.
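Exhaustive root search is a few lines; a true Chien search evaluates at successive powers of α with incremental updates, but the plain trial version below shows the idea. The locator Λ(x) = 1 + 5x comes from a made-up single-error example over F_7 (α = 3, error at position 2):

```python
# Trial-and-error root search for the error-locator polynomial over F_7.
# A root at alpha^(-i) marks an error at position i.
p, alpha = 7, 3
Lam = [1, 5]                          # Lambda(x) = 1 + 5x, illustrative

def poly_eval(poly, x):
    return sum(c * pow(x, j, p) for j, c in enumerate(poly)) % p

error_positions = [i for i in range(p - 1)
                   if poly_eval(Lam, pow(alpha, -i, p)) == 0]
```

`pow(alpha, -i, p)` (modular inverse via a negative exponent) needs Python 3.8 or later.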

Calculate error magnitudes

In order to correct the errors, the next step is to find the error values. One method is to run the Gorenstein–Zierler algorithm after the Peterson procedure; it finds the error magnitudes through matrix inversions. The whole procedure is known as the Peterson–Gorenstein–Zierler algorithm.

Another approach relies on the error-evaluator polynomial Γ(x). From the key equation (1.31) below, Γ(x) is defined as

Γ(x) = Λ(x) S(x) mod x^{d−1},

where S(x) = S_1 x + S_2 x² + ... + S_{d−1} x^{d−1} is the syndrome polynomial.

The Forney algorithm, built on Γ(x), can be used to find the error values. The l-th error value is given by

Y_l = − X_l Γ(X_l^{−1}) / Λ′(X_l^{−1}), (1.30)

where Λ′ denotes the formal derivative of Λ.


Using the error values and their locations, the errors are corrected by subtracting the error values at the error locations.
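Forney's formula is a direct computation once Γ(x) is known. The sketch below works through a made-up single-error example over F_7 ([6, 2, 5] code, α = 3): syndromes S = [1, 2, 4, 1], locator Λ(x) = 1 + 5x, one error at position 2, so X_1 = α² = 2; all values are illustrative:

```python
# Error-evaluator polynomial and Forney's formula over F_7.
p, d = 7, 5
S = [1, 2, 4, 1]                      # syndromes S_1..S_{d-1}
Lam = [1, 5]                          # Lambda(x) = 1 + 5x
X1 = 2                                # error-location number alpha^2

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_eval(poly, x):
    return sum(c * pow(x, j, p) for j, c in enumerate(poly)) % p

# Gamma(x) = Lambda(x) S(x) mod x^(d-1), with S(x) = S_1 x + ... + S_{d-1} x^{d-1}.
Gamma = poly_mul(Lam, [0] + S)[:d - 1]

# Formal derivative of Lambda(x).
dLam = [(j * c) % p for j, c in enumerate(Lam)][1:]

# Forney: Y_1 = -X_1 Gamma(X_1^-1) / Lambda'(X_1^-1); recovers magnitude 4.
Xinv = pow(X1, p - 2, p)
Y1 = (-X1 * poly_eval(Gamma, Xinv) * pow(poly_eval(dLam, Xinv), p - 2, p)) % p
```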

Another iterative method to find the error-locator polynomial and the error magnitudes is based on the extended Euclidean algorithm, which computes the greatest common divisor of two univariate polynomials; here it is applied to the polynomials S(x) and x^{d−1}. The key idea of this algorithm is based on the following equation:

Λ(x) S(x) = Q(x) x^{d−1} + Γ(x). (1.31)

The goal, however, is not to find the greatest common divisor itself: the extended Euclidean algorithm produces a series of polynomials of the form

A_r(x) S(x) + B_r(x) x^{d−1} = R_r(x). (1.32)

The iteration runs as long as the degree of R_r is at least ⌊(d−1)/2⌋; at each iteration r, the polynomials are updated for the next iteration as follows:

Q = R_{r−2} / R_{r−1} (polynomial quotient),
R_r = R_{r−2} − Q R_{r−1},
A_r = A_{r−2} − Q A_{r−1}.

Once the degree of R_r(x) is smaller than ⌊(d−1)/2⌋, then A_r(x) = Λ(x) and R_r(x) = Γ(x).

As in the previous algorithm, an exhaustive method is applied to find the roots of Λ(x); these roots correspond to the error locations. The Forney algorithm is then used to calculate the error magnitudes from Γ(x), and the errors can be subtracted from the received word.
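The Euclidean iteration can be sketched on the same kind of made-up F_7 example (d = 5, syndromes from one error of magnitude 4 at position 2, so the expected locator is 1 + 5x and the evaluator is x). This is an illustrative sketch, up to a common scalar factor on Λ and Γ (here that scalar happens to be 1):

```python
# Extended-Euclidean (Sugiyama-style) solution of the key equation over F_7:
# iterate on x^(d-1) and S(x) until deg R_r < (d-1)/2.
p, d = 7, 5
S = [0, 1, 2, 4, 1]                   # S(x) = S_1 x + ... + S_4 x^4

def deg(a):
    return max((i for i, c in enumerate(a) if c), default=-1)

def poly_divmod(a, b):                # quotient and remainder in F_p[x]
    a, q = a[:], [0] * len(a)
    inv = pow(b[deg(b)], p - 2, p)
    while deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        coef = a[deg(a)] * inv % p
        q[shift] = coef
        for i, bi in enumerate(b[:deg(b) + 1]):
            a[i + shift] = (a[i + shift] - coef * bi) % p
    return q, a

def poly_submul(a, q, b):             # a - q*b
    out = a[:] + [0] * max(0, len(q) + len(b) - 1 - len(a))
    for i, qi in enumerate(q):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] - qi * bj) % p
    return out

Rprev, Rcur = [0] * (d - 1) + [1], S  # R_{-1} = x^(d-1), R_0 = S(x)
Aprev, Acur = [0], [1]
while deg(Rcur) >= (d - 1) // 2:
    Q, Rnew = poly_divmod(Rprev, Rcur)
    Anew = poly_submul(Aprev, Q, Acur)
    Rprev, Rcur, Aprev, Acur = Rcur, Rnew, Acur, Anew

# Acur plays the role of Lambda(x), Rcur of Gamma(x).
```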

The extended Euclidean algorithm tends to be more used in practice because it is easier to implement, but the Berlekamp–Massey algorithm tends to lead to more efficient hardware and software implementations.
