

UNIVERSITÉ MOHAMMED V-AGDAL

Faculté des Sciences

N° d'ordre : 2655

Doctoral Thesis

entitled

New Steganographic Schemes

Using Binary and Quaternary Codes

presented by

Houda JOUHARI

on July 1, 2013, before the jury:

Pr. El Hassan Saidi, Professor at Université Mohammed V Agdal - Rabat, Member of the Hassan II Academy of Sciences and Technology: President

Pr. Abdelmalek Azizi, Professor at Université Mohammed Ier Oujda, Member of the Hassan II Academy of Sciences and Technology: Reviewer

Pr. Mohamed Bouhdadi, Professeur Habilité at Université Mohammed V Agdal - Rabat: Examiner

Pr. M. I. Garcia-Planas, Professor at Universitat Politecnica de Catalunya, Barcelona: Reviewer

Pr. Said El Hajji, Professor at Université Mohammed V Agdal - Rabat: Reviewer

Pr. El Mamoun Souidi, Professor at Université Mohammed V Agdal - Rabat: Advisor


Foreword

The work presented in this thesis was carried out at the MIA laboratory (Mathematics, Computer Science and Applications) of the Department of Mathematics of the Faculty of Sciences of Rabat, under the supervision of Professor El Mamoun Souidi, to whom I express my deep gratitude for his advice, his expertise and the support he gave me throughout my thesis.

I first wish to express my warmest thanks to Mr. El Hassan Saidi, Professor at Université Mohammed V Agdal and member of the Hassan II Academy of Sciences and Technology, for the honor he does me in chairing this thesis jury.

I also thank the jury members serving as reviewers: Mr. Said El Hajji, Professor at Université Mohammed V Agdal; Mr. Abdelmalek Azizi, Professor at Université Mohammed Premier of Oujda and member of the Hassan II Academy of Sciences and Technology; and Mrs. M. I. Garcia-Planas, Professor at Universitat Politecnica de Catalunya, Barcelona, for the interest they showed in my work; as well as Mr. Mohamed Bouhdadi, Professeur Habilité at Université Mohammed V Agdal, for the honor he does me in taking part in this jury as examiner.

I also thank Mr. Patrice Parraud, Professor at the military school of Saint-Cyr, and Mr. Ayoub Otmani, Professor at Université de Rouen, for giving me so many valuable suggestions on my research work. Their advice saved me a great deal of time, and the many discussions we had enabled me to reach important milestones in my work.

In particular, I express my most sincere thanks to my father for his understanding and the enormous support he has given me throughout my life. Finally, I thank my friends for their encouragement, friendship and moral support.


Abstract

Alongside the use of cryptography to protect communications, there is a growing use of information-hiding techniques, known as "steganography". Steganography consists of hiding messages inside an innocuous digital medium, and it is this medium that is sent over a public transmission channel. Since the use of cryptographic means is still controlled, or even forbidden, in many countries, and since networks are sometimes monitored, steganography offers a way to communicate more freely and, above all, covertly.

Under the assumption that the attacker is passive, we present in this thesis, entitled "New steganographic schemes using binary and quaternary codes", new information-hiding techniques based on error-correcting codes, from two angles: improving existing schemes using binary codes, and constructing new methods using codes over the ring Z4. The research presented in this thesis has been published in international journals and conferences.

In the first contribution, we present a new steganographic scheme based on Boolean functions, which improves the embedding efficiency and achieves a significant improvement over the classical syndrome-coding approach.

The second contribution constructs a scheme based on first-order Reed-Muller codes, which allow us to reduce the embedding complexity in two ways. The first is based on the fast Walsh-Hadamard transform, which lets us find the distance from a coset member q to all 2^k codewords in O(n ln^2(n)) binary operations instead of O(n(n − 1)2^k). The second relies on a sum criterion in the list decoding of these codes, which allowed us to reconstruct all codewords lying in the ball of radius (1 − ε)d in O(n ln^2(min{ε^-2, n})) binary operations. In this contribution, we showed that first-order Reed-Muller codes are good candidates for designing steganographic schemes that are efficient in terms of complexity.

The third contribution concerns the use of cyclic codes over the ring Z4 in the construction of a new information-hiding scheme. Since certain nonlinear codes […], the experimental results show that by using these codes we can increase the embedding efficiency compared with existing methods.

The fourth describes an application of Preparata codes to steganography. We use grayscale images as the communication medium. The results obtained show that using the Z4-linearity of Preparata codes allows us to hide a large amount of information, compared with the syndrome method based on extended Hamming codes, while maintaining good visual quality of the image.

Finally, the last contribution describes another adaptive method for hiding a large amount of secret data in images. The experimental results show that the proposed method, based on the Z4-linear Goethals codes, leads to […]


Contents

1 Error Correcting Codes 7

1.1 Basic Definitions . . . 7

1.2 Linear Codes . . . 9

1.2.1 Encoding . . . 9

1.2.2 Extending codes . . . 11

1.3 Decoding . . . 11

1.3.1 Standard Array Decoding . . . 11

1.3.2 Syndrome Decoding . . . 13

1.4 Reed-Muller Codes . . . 14

1.4.1 Recursive definition of RM(r, m) . . . 14

1.4.2 Boolean functions and RM(r, m) . . . 15

1.4.3 First order Reed-Muller codes . . . 18

1.5 Cyclic Codes . . . 19

1.5.1 Polynomials Representation . . . 20

1.6 BCH codes . . . 22

1.6.1 Decoding of BCH codes . . . 23

1.7 Codes Over Z4 . . . 27

1.7.1 Z4-Linear Codes . . . 28

1.7.2 Binary images of quaternary codes . . . 29

1.7.3 Cyclic codes over Z4 . . . 31

2 Steganography 33

2.1 Characteristics of a Steganographic Scheme . . . 34

2.1.1 Principles of a Steganographic Algorithm . . . 37

2.2 Basics of Image Steganography . . . 38

2.2.1 Spatial Domain . . . 38

2.2.2 Frequency Domain . . . 39

2.3 Steganography and Coding Theory . . . 40

2.3.1 Link to Coding Theory . . . 41

2.3.2 The F5 Embedding Algorithm . . . 44

(8)

3.1.1 Boolean Functions . . . 48

3.1.2 New Steganographic Scheme . . . 50

3.1.3 Example . . . 52

3.2 Reducing Complexity of Embedding using RM(1, m) . . . 54

3.2.1 Hiding Using Fast Walsh-Hadamard transform . . . 56

3.2.2 Hiding Using List Decoding . . . 57

3.2.3 Discussion . . . 60

4 Cyclic Codes over Z4 in Steganography 61

4.1 Quaternary Codes . . . 61

4.1.1 Galois Extension Ring of Z4 . . . 62

4.1.2 Vector Space Structure of Z4 . . . 62

4.1.3 Cyclic Codes over Z4 . . . 62

4.1.4 Gray Map . . . 63

4.2 Our contributions . . . 64

4.2.1 Syndrome Decoding Over Z4 . . . 64

4.2.2 The proposed Scheme . . . 68

4.3 Experimental results . . . 69

4.4 Conclusion . . . 70

5 Improving Embedding Capacity by Preparata Codes 73

5.1 Preparata Codes . . . 73

5.1.1 Decoding the Quaternary Preparata Code in the Z4 Domain . . . 74

5.2 The Proposed Steganographic Scheme . . . 75

5.2.1 Embedding Process . . . 75

5.2.2 Extracting Process . . . 76

5.2.3 Evaluation of image quality . . . 76

5.3 Experimental results . . . 77

5.3.1 Design Details . . . 78

5.3.2 Embedding Capacity of the proposed scheme . . . 79

5.4 Conclusion . . . 81

6 Steganographic Scheme using The Goethals Codes 85

6.1 Goethals codes . . . 85

6.1.1 Decoding of Z4-Linear Goethals Codes . . . 86

6.2 The Proposed Steganographic Scheme . . . 87

6.2.1 Embedding Process . . . 87

6.2.2 Extracting Process . . . 88

6.3 Experimental results . . . 88


6.3.2 Embedding Capacity of the proposed scheme . . . 91

6.4 Conclusion . . . 92


List of Figures

1.1 Systematic block encoding for error correction . . . 8

1.2 Simplified model of a data transmission system . . . 9

1.3 Quadrature Phase-Shift Keying . . . 30

2.1 Prisoner's Problem Solved . . . 34

2.2 Pure steganography processes . . . 36

2.3 Public steganography processes . . . 36

2.4 Secret steganography processes . . . 36

2.5 Compromise to be made in steganography . . . 38

2.6 Binary entropy function . . . 44

4.1 The embedding efficiency comparison of steganographic methods based on nonlinear Goethals codes and on BCH(3, m) linear codes . . . 71

5.1 Embedding eect on Lena image . . . 79

5.2 Embedding eect on Baboon image . . . 80

5.3 Histogram of Lena for Our proposed method . . . 81

5.4 Histogram of Lena for Extended Hamming method . . . 82

5.5 Histogram of Baboon for Our proposed method . . . 82

5.6 Histogram of Baboon for Extended Hamming method . . . 83

6.1 Embedding eect on Lena image . . . 90

6.2 Embedding eect on Baboon image . . . 91

6.3 Histogram of Lena for Our proposed method . . . 93

6.4 Histogram of Lena for F5 method . . . 94

6.5 Histogram of Baboon for Our proposed method . . . 95

6.6 Histogram of Baboon for F5 method . . . 96


List of Tables

1.1 Standard array of the code C (Example 1) . . . 12

1.2 Syndrome table of the code C (Example 2) . . . 13

3.1 Truth table for Example 10 . . . 48

4.1 Performances of the Goethals codes Gm for m = 3, 5, 7, 9, 11 . . . 70

4.2 Performances of the BCH(3, m) codes for m = 3, 5, 7, 9, 11 . . . 70

5.1 Comparison of amount of embedded data between the proposed method and the Extended Hamming syndrome coding . . . 81

5.2 Comparison of PSNR values between the proposed method and the Extended Hamming syndrome coding method after embedding 3.460 bytes . . . 81

6.1 Comparison of amount of embedded data between the proposed method and the F5 method . . . 92

6.2 Comparison of PSNR values between the proposed method and the F5 method after embedding 2.957 bytes . . . 92


Introduction

When communication about highly confidential topics is needed, there are two ways to do it. The first method is to encipher the message in such a way that no one else can read it. In this case the encryption is obvious: when the message is intercepted, it is clear that the sender and the receiver are communicating secretly, and people may be able to tell that a secret message is being transmitted; they just cannot read it. This technique is called cryptography. The second method is to hide the fact that a message is being transmitted at all. This can be done by hiding the message in a nonsuspicious object; this technique is called steganography. Both techniques, cryptography and steganography, are useful in covert communication, but each serves a different purpose. Cryptography provides the means for secure communication; steganography provides the means for secret communication.

Steganography combined with cryptography is the most secure way to go: because the existence of an encrypted communication draws attention to itself, hiding it inside another file raises the security level substantially.

In contrast with cryptography, where the enemy is able to detect, intercept and modify the transmitted information, steganography is used primarily when the very fact of communicating needs to be kept secret. This is accomplished by embedding the secret messages within some communication support (called a cover), as said above. Today's typical covers are computer files, mainly images, video or audio files; but in fact, whenever an electronic document contains irrelevant or redundant information, it can be used as a cover for hiding secrets.

There are several countries where the use of cryptography is actually illegal or subject to strong restrictions. Moreover, given the ongoing debate over export controls on encryption, steganography may be a logical recourse for hiding the transmission of encrypted data in some countries.

So steganography is becoming more and more popular in network communication, and steganographic methods themselves are rapidly evolving and becoming increasingly sophisticated.

The need for reliable steganalytic tools capable of detecting hidden messages has recently increased due to anecdotal evidence that steganography is being used by terrorists. In October 2001, the New York Times published an article claiming that al-Qaeda had used steganography to hide messages in images, and then sent these via e-mail to prepare and execute the September 11, 2001 terrorist attack [25].

The design of a steganographic system has (at least) two facets: first, the choice of accurate covers and the search for strategies to modify them in an imperceptible way; this study relies on a variety of methods, including psycho-visual and statistical criteria. Second, the design of efficient algorithms for embedding and extracting the information.

Our goal in this thesis is to concentrate on this last problem, by constructing new and more efficient steganographic algorithms using coding-theory techniques. Recall that error-correcting codes are commonly used for detecting and correcting errors in data transmission. Their use in steganography is not new. The first data-hiding method that demonstrates a close relationship between steganographic protocols and error-correcting codes is the matrix embedding method, first introduced by Crandall [22] and analyzed by Bierbrauer [10]. Matrix embedding is applied to reduce the number of required changes to the cover by carefully selecting the positions used for embedding. The F5 algorithm proposed by Westfeld [85] is the first implementation of the matrix encoding concept to reduce modification of the quantized DCT coefficients.
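The principle can be illustrated with the binary [7, 4] Hamming code: its parity-check matrix lets one embed three message bits into seven cover bits while flipping at most one of them. The sketch below is a minimal illustration of matrix embedding in general, not of the F5 implementation itself; the helper names are mine.

```python
# Parity-check matrix H of the binary [7,4] Hamming code: column i (1-based)
# is the 3-bit binary representation of i, most significant bit first.
H = [[(i >> j) & 1 for i in range(1, 8)] for j in (2, 1, 0)]

def syndrome(bits):
    """Compute H * bits^T over GF(2)."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]

def embed(cover, message):
    """Flip at most one of the 7 cover bits so that syndrome(stego) == message."""
    s = [a ^ b for a, b in zip(syndrome(cover), message)]
    stego = list(cover)
    if any(s):
        pos = int("".join(map(str, s)), 2) - 1  # index of the column of H equal to s
        stego[pos] ^= 1
    return stego

def extract(stego):
    return syndrome(stego)

cover = [1, 0, 1, 1, 0, 0, 1]
message = [1, 1, 0]
stego = embed(cover, message)
print(extract(stego) == message)                       # True
print(sum(c != s for c, s in zip(cover, stego)) <= 1)  # True
```

Any 3-bit message is thus carried by one change in seven cover symbols, which is exactly the embedding-efficiency gain over plain LSB replacement that matrix embedding was introduced for.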

Depending on the choice of the linear code, different matrix embedding schemes can be constructed. Jessica and Soukal presented in [28] two new approaches to matrix embedding for large payloads, one based on a family of codes constructed from simplex codes and the second based on random linear codes of small dimension. They showed that random linear codes provide good embedding efficiency and that their relative embedding capacity densely covers the range of large payloads, which makes them suitable for practical applications, and that matrix embedding using simplex codes is more computationally efficient. They also introduced the new concept of the average distance to a code, as it is more relevant and directly related to embedding efficiency as currently used in steganography. They derived asymptotic bounds on the average distance to a code to better contrast the performance of the proposed codes with the theoretically achievable embedding efficiency.

In [9], Bierbrauer and Jessica used covering codes to improve the embedding efficiency and security of steganographic schemes. In their paper, they describe several families of covering codes constructed using the blockwise direct sum of factorizations.

To study how to design steganographic algorithms more efficiently, the problem of constructing linear steganographic codes (stego-codes) was converted into an algebraic problem by Zhang and Li in [87], who introduced the concept of the t-th dimension of a vector space and proposed a method of constructing linear stego-codes using the direct sum of vector subspaces. Then, some bounds on the length of stego-codes are obtained, from which the maximum length embeddable (MLE) code is brought up. It is shown that there is a corresponding relationship between MLE codes and perfect error-correcting codes. Furthermore, the classification of all MLE codes and a lower bound on the number of binary MLE codes are obtained based on the corresponding results on perfect codes. The hiding redundancy is defined to evaluate the performance of stego-codes.

[…] provide embedding efficiency higher than binary ones by converting the binary message into a ternary format. Each pixel of the image can carry a ternary message by choosing to add or subtract one to or from its gray value. They also proposed a novel method that improves the embedding efficiency of binary covering functions by fully exploiting the information contained in the choice of addition or subtraction in the embedding. The improved scheme can perform equally well with, or even outperform, ternary covering functions without ternary conversion of the message. For example, the covering functions COV(3, 31, 12), COV(3, 127, 18), COV(3, 511, 24) proposed by Bierbrauer and Fridrich [9], called "BF" schemes for brevity, can be improved by appending two pixels, the results being correspondingly called "BF + 2" schemes. For some covering functions COV(R, n, k), it is hard to calculate the average number of changes Ra. In this case, they substituted the largest number of changes R for Ra to calculate a lower bound on the embedding efficiency as k/R, and evaluated the embedding efficiency of the "BF" and "BF + 2" schemes with this lower bound. They showed that for the "BF" schemes, the performance parameters (embedding rate, lower bound on embedding efficiency) both increase when using "BF + 2".

They also proposed a new method to construct stego-codes, showing that not just one code but a family of stego-codes can be generated from one covering code by combining Hamming codes and wet paper codes [90]. This method can enormously expand the set of embedding schemes applicable in steganography. By using stego-code families of low-density generator-matrix (LDGM) codes, they obtain families of near-optimal embedding schemes for binary steganography and ±1 steganography, respectively, which can approach the upper bound on embedding efficiency for various chosen embedding rates.

With Matrix Embedding based on Hamming codes, coding theory entered the field of steganography. Even though this class of structured codes has been used successfully in practical systems to minimize the number of embedding changes, and thus maximize embedding efficiency, further developments, such as Wet Paper codes, were based on random codes instead. To redraw attention to structured codes, which are built according to deterministic rules, Schönfeld and Winkler [69] studied BCH codes for embedding with syndrome coding, using either a structured matrix H as in Matrix Embedding, or a generator polynomial g(x). They proposed different approaches for embedding without locked elements, which differ in the tradeoff reached between embedding complexity and efficiency. As some practical systems allow more secure steganography if embedding constraints, in terms of locked elements, are respected, they also demonstrated how BCH codes can be employed in a Wet Paper codes scenario. Based on a deduced analogy between code rate and the maximum number of lockable elements, they found appropriate code parameters for a given fraction of locked elements in the cover, complexity constraints, and desired probability of successful embedding.

Then, [70] deals with strategies to dramatically reduce the complexity of embedding based on syndrome coding. In contrast to existing approaches, their goal was to keep the embedding efficiency constant, i.e., to embed with less complexity without increasing the average number of embedding changes, compared to the classic Matrix Embedding scenario.


Generally, their considerations are based on structured codes, especially BCH codes; however, they are not limited to this class of codes. They proposed different approaches to reduce embedding complexity, concentrating both on syndrome coding based on a parity-check matrix and on syndrome coding based on the generator polynomial.

Binary BCH codes have been investigated and seem to be good candidates for designing efficient steganographic schemes [69]. The work [26], which studies a family of Reed-Solomon codes, shows that they are twice as good as BCH codes with respect to the number of locked positions which cannot be modified; in fact, they are optimal. The authors also considered a new and more general problem, mixing wet papers (locked positions) and simple syndrome coding (low number of changes) in order to face not only passive but also active wardens. They showed that Reed-Solomon codes improve the management of locked positions during embedding, hence ensuring better management of the distortion; they are able to lock twice as many positions as the previously studied codes, such as binary BCH codes.

In [6], some algebraic decoding algorithms up to the error-correcting capacity are transformed into a maximum-likelihood decoder by the use of a limited exhaustive search. This algorithm is directly inspired by those proposed in [19] in the context of electronic signatures. It remains exponential; however, it becomes practicable for some small BCH and Goppa codes (typically, with an error-correcting capacity up to 4).

The method proposed in [55] consists of using the majority-logic decoding algorithm for embedding the message in the cover image; the extraction function is, as usual, based on syndrome coding.

Munuera showed in [59] some relations between steganographic algorithms and error-correcting codes. Using these relations, he gives a method to construct good steganographic protocols and deduces their properties from those of the corresponding codes.

A new steganographic technique based on the convolution product of two codes, using an embedding process different from the usual embedding scheme, is evaluated in [65]. The performance of this new technique is computed and compared with the well-known theoretical upper bound, the Hamming bound. They also showed that this technique performs better than basic LSB steganography or the basic F5 algorithm.

In [23], an extension of Westfeld's F5 algorithm based on linear error-block (LEB) codes is introduced. The authors showed that with a good choice of the LEB code parameters, more bits from the cover image may be exploited, in such a way that the probability of each bit being flipped is related to its influence on the image quality.

To thwart the enemy's attempts, the statistical features of stego-images and cover images should be as similar as possible for better resistance to steganalysis. For secret communication, the resistance of steganography against steganalysis is very important for information security. To ensure security against the Regular-Singular attack and χ² detection, a new steganographic method based on the graph coloring problem (GCP) is presented in [24], which consists in locating the optimal positions of the pixels in the cover image so as to make the message difficult to detect.


Since error-correcting codes can be used to construct good steganographic protocols and to study their properties, in this thesis we discuss a way of using coding theory to establish new steganographic schemes. We started in [44] by presenting an improved data-hiding scheme using some properties of Boolean functions, where we showed that a construction using a nonlinear extracting function is of great interest because it can give a steganographic scheme with higher embedding efficiency (the average number of message bits carried by one embedding change in the cover data) than the linear extracting functions currently used.

Later on, we proposed in [43] to focus on a particular family of error-correcting codes: first-order binary Reed-Muller codes, which enable us to embed more rapidly than existing methods by reducing the complexity of their decoding algorithm in two ways. The first algorithm, based on the fast Walsh-Hadamard transform, allows us to find the Hamming distances between the coset member q and all 2^k codewords in O(n ln^2(n)) binary operations instead of O(n(n − 1)2^k). The second, based on the list decoding algorithm for RM(1, m) codes, allows us to reconstruct all codewords located within the ball of radius (1 − ε)d, where ε > 0, around the coset member in O(n ln^2(min{ε^-2, n})) binary operations.
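As a rough illustration of the first approach, the following sketch (the function names are mine, not from the thesis) uses the fast Walsh-Hadamard transform to obtain the distances from a word of length n = 2^m to all 2^(m+1) codewords of RM(1, m) at once: after mapping bits to signs, the transform coefficient W[a] equals n minus twice the distance to the codeword of the linear form x ↦ a·x, while −W[a] gives the same information for its complement.

```python
def fwht(v):
    """In-place fast Walsh-Hadamard transform of a length-2^m list."""
    h, n = 1, len(v)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = v[j], v[j + h]
                v[j], v[j + h] = x + y, x - y
        h *= 2
    return v

def rm1_distances(received):
    """Map each index a to the pair (distance to the codeword of the linear
    form x -> a.x, distance to its complement), for all a at once."""
    n = len(received)
    w = fwht([(-1) ** b for b in received])  # signs: bit 0 -> +1, bit 1 -> -1
    # W[a] = (#agreements) - (#disagreements), so distance = (n - W[a]) / 2.
    return {a: ((n - w[a]) // 2, (n + w[a]) // 2) for a in range(n)}

# Example with m = 3 (n = 8): the all-zero word is at distance 0 from the
# zero codeword (a = 0) and at distance 8 from the all-one codeword.
d = rm1_distances([0] * 8)
print(d[0])   # (0, 8)
```

The transform costs O(n log n) additions, versus comparing the word against each of the 2^(m+1) codewords separately.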

In [46], we extended the construction of new steganographic schemes to the quaternary case. We show that certain families of nonlinear codes can achieve better performance for steganographic applications than the simple linear codes currently in use, and that the theory of covering functions need not be restricted to the binary case. In order to build our steganographic schemes over the Galois ring GR(Z4, m), we extended the Peterson-Gorenstein-Zierler decoding algorithm for binary linear codes to decode cyclic codes over Z4. Then, we described a new embedding/extracting scheme using cyclic codes over Z4 that improves embedding efficiency and security compared to schemes based on linear binary codes.

Afterwards, we presented in [45] a new steganographic scheme over the Galois ring GR(Z4, m), based on the Z4-linearity of the nonlinear Preparata codes. We showed that the proposed scheme can conceal a large amount of data in a grayscale image while maintaining good image quality, compared to the syndrome-coding method with perfect linear Hamming codes of the same length as the Preparata codes. In [47], we described another steganographic scheme, based on the Z4-linear Goethals codes, that allows us to increase the embedding capacity without affecting the image quality, compared to the F5 method.

We restrict ourselves to (digital) image steganography only in order to demonstrate our results practically. The approach and final results of this thesis can easily be applied to steganography in other digital media, such as video, music, etc.

This thesis is divided into six chapters. Our contributions are presented in the last four chapters. The first chapter covers important definitions and properties of linear codes that will be used throughout the thesis. In this chapter we present examples of linear codes, such as Reed-Muller and BCH codes. The encoding and decoding algorithms for cyclic codes are explained in detail. Codes over Z4 are studied, and the binary image of these codes via the Gray map is considered.
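The Gray map in question sends 0 ↦ 00, 1 ↦ 01, 2 ↦ 11, 3 ↦ 10 and is applied componentwise; it is an isometry from Z4^n with the Lee distance to F2^(2n) with the Hamming distance. A minimal sketch (the helper names are mine):

```python
# Gray map from Z4 to pairs of bits: 0 -> 00, 1 -> 01, 2 -> 11, 3 -> 10.
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}
LEE_WEIGHT = {0: 0, 1: 1, 2: 2, 3: 1}

def gray_map(word):
    """Componentwise Gray map: a length-n word over Z4 -> a 2n-bit word."""
    return [bit for symbol in word for bit in GRAY[symbol]]

def lee_weight(word):
    return sum(LEE_WEIGHT[s] for s in word)

def hamming_weight(bits):
    return sum(bits)

# Isometry property: the Lee weight over Z4 equals the Hamming weight of
# the binary image.
w = [0, 1, 2, 3]
print(gray_map(w))                                   # [0, 0, 0, 1, 1, 1, 1, 0]
print(lee_weight(w) == hamming_weight(gray_map(w)))  # True
```

This weight-preserving property is what lets good quaternary codes yield good (generally nonlinear) binary codes, the phenomenon exploited in Chapters 4 to 6.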

Chapter 2 contains an introduction to steganography. We use the so-called prisoners' problem to describe the model of invisible communication. We describe some methods for hiding information in digital images. Then the connection between coding theory and steganography is formally established.

In Chapter 3, we describe our new embedding schemes for the binary case. We give two different schemes: the first enables us to increase the embedding efficiency, and the second allows us to embed the message more rapidly by reducing the embedding complexity via two proposed methods.

In Chapters 4, 5 and 6, we present our new steganographic schemes over Z4. We show that some cyclic codes over Z4 can improve the efficiency and security of embedding. Then we show that quaternary covering functions can provide higher embedding capacity than binary ones while maintaining good image quality, by using the Z4-linearity of the Preparata and Goethals codes. These last chapters contain implementation details of the proposed schemes. The experiments are carried out with various image sizes and various message sizes.


Chapter 1

Error Correcting Codes

The theory of error-correcting codes aims to detect and correct errors during the transmission of data. It enhances the quality of data transmission and provides better control over noisy channels.

This chapter is devoted to error-correcting codes, and particularly to block codes; we give basic definitions and known results about some block codes that will be used in the subsequent chapters. First, we define linear codes, which are the most-studied types of codes: thanks to their algebraic structure, they are easy to construct, encode and decode. Then we describe the Reed-Muller codes as one of the simplest and most important families of linear codes. Next, the family of BCH codes is defined. The study of codes over the ring Z4 attracted great interest through the work of Hammons and Kumar [36], showing how several well-known families of nonlinear binary codes are intimately related to linear codes over Z4. We therefore present the basic theory of linear codes over Z4, including the connection between these codes and binary codes via the Gray map. We also describe cyclic codes over Z4. For an extensive treatment of the theory of error-correcting codes, we refer the reader to the textbooks by MacWilliams and Sloane [52] and [40] on the topic.

Error-correcting codes are tools that improve the reliability of information exchange over a noisy channel. They are used in most modern communication technologies such as WiFi, ADSL, mobile phones, satellite communications, etc. They are also used to store information on hard drives, USB drives, CDs, DVDs, etc.

1.1 Basic Definitions

The principle of error-correcting codes is to introduce redundant symbols alongside the information symbols in order to detect or correct errors that may occur in the process of storage or transmission.

In a basic (and practical) form of error-correcting codes, redundant symbols are added to information symbols to obtain a coded sequence, or codeword. For the purpose of illustration, a codeword obtained by encoding with a block code is shown in Figure 1.1. Such an encoding is said to be systematic. This means that the information symbols always appear in the first k positions of a codeword. The remaining (n − k) symbols in a codeword are provided by some function applied to the information symbols, which brings redundancy that can be used for error correction/detection purposes. The set of all code sequences is called an error-correcting code, and will be denoted by C.

Figure 1.1: Systematic block encoding for error correction (the k information symbols are followed by n − k redundancy symbols, for n symbols in total)

Let X be a nonempty set. A distance on X is a map d : X × X → R+ such that for all x, y, z ∈ X:

d(x, y) = 0 ⇐⇒ x = y;

d(x, y) = d(y, x);

d(x, y) ≤ d(x, z) + d(z, y).

Definition 1 Let A ≠ ∅ be a finite set and A^n = A × A × · · · × A be the n-fold cartesian product. A code of length n over A is a nonempty subset C of A^n.

Let C be a code. The minimum distance of C is

d(C) = min{ d(x, y) | x, y ∈ C, x ≠ y }.   (1.1)

The error-correcting capability t of a code C is t = ⌊(d − 1)/2⌋. In this case, we say that C is t-error-correcting.

For a fixed length n, the Hamming distance between x, y ∈ A^n is defined as follows:

dH(x, y) = card{ i | xi ≠ yi, i = 1, · · · , n }.   (1.2)

The minimum distance d is a simple measure of the goodness of a code. For a given length and number of codewords, a fundamental problem in coding theory is to produce a code with the largest possible minimum distance d.
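Equations (1.1) and (1.2) can be computed directly. The following brute-force sketch, suitable only for small codes, is mine and not from the thesis:

```python
from itertools import combinations

def hamming_distance(x, y):
    """d_H(x, y): number of coordinates in which x and y differ (Eq. 1.2)."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    """d(C): minimum over all pairs of distinct codewords (Eq. 1.1)."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

# The binary repetition code of length 3 has d = 3, hence error-correcting
# capability t = (3 - 1) // 2 = 1.
C = [(0, 0, 0), (1, 1, 1)]
d = minimum_distance(C)
print(d, (d - 1) // 2)   # 3 1
```

The pairwise scan costs O(|C|²) distance evaluations, which is exactly why the later chapters care about faster, structure-exploiting decoders.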

The fact that the spheres of radius t around codewords are pairwise disjoint immediately implies the following elementary inequality, commonly referred to as the Sphere Packing Bound or the Hamming Bound.

Theorem 1 (Sphere Packing Bound) Let A_q(n, d) denote the maximum number of codewords in a code C over an alphabet A_q with q elements, of length n and minimum distance d. Then the Hamming bound is:

A_q(n, d) ≤ q^n / (Σ_{i=0}^{t} \binom{n}{i} (q − 1)^i)   (1.3)

where t = ⌊(d − 1)/2⌋.


We see that when we have equality in this bound, we actually fill the space A_q^n with disjoint spheres of radius t. In other words, every vector in A_q^n is contained in precisely one sphere of radius t centered on a codeword. A code for which this is true is called a perfect t-error-correcting code; that is, equality holds in (1.3).

1.2 Linear Codes

Definition 2 An [n, k] linear code C is a k-dimensional linear subspace of the vector space F_q^n. If the code has minimum distance d we shall write [n, k, d] linear code.

Definition 3 A k × n matrix whose rows form a basis of a linear [n, k] code C is called a generator matrix of C.

Proposition 1 If G is a generator matrix of an [n, k] code C, then

C = {xG | x ∈ F_q^k}   (1.4)

1.2.1 Encoding

An [n, k] linear code C over F_q contains q^k codewords, corresponding to q^k distinct messages. We identify each message with a k-tuple u = (u_1, u_2, ⋯, u_k), where the components u_i are elements of F_q.

We encode using the generator matrix G by mapping the message u to the codeword c = uG = (c_1, ⋯, c_n). Here uG is a codeword since, by matrix block multiplication, it is a linear combination of the rows of G. Figure 1.2 provides a rough idea of how information is encoded and transmitted over a communication system.

[Figure 1.2: Simplified model of a data transmission system — an encoder maps an information message of k symbols to a codeword of n symbols; the codeword passes through a communication channel subject to noise, and the decoder maps the received word to an estimated codeword]
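The mapping c = uG can be sketched in a few lines of Python over F_2 (the generator matrix below is a systematic generator of the [5, 2] code used in Example 1, chosen here for illustration):

```python
def encode(u, G):
    # c = uG over F2: each code symbol is a dot product of u with a column of G
    n = len(G[0])
    return [sum(ui * G[i][j] for i, ui in enumerate(u)) % 2 for j in range(n)]

G = [[1, 0, 1, 1, 0],   # systematic form [I_2 | A]
     [0, 1, 1, 0, 1]]

print(encode([1, 1], G))  # [1, 1, 0, 1, 1], the sum of the two rows of G
```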

Definition 4 The Hamming weight ω_H(x) of a word x is the number of nonzero components of x. The minimum Hamming weight of a code C is the minimum of ω_H(c) over all nonzero codewords c ∈ C.

Note that for a linear code the minimum distance is equal to the minimum nonzero weight. For u = (u_1, ⋯, u_n), v = (v_1, ⋯, v_n) ∈ F_q^n, the scalar product of u and v is defined by ⟨u, v⟩ = Σ_{i=1}^{n} u_i v_i. If ⟨u, v⟩ = 0, then u and v are called orthogonal.

Definition 5 Let C be a linear code. The dual code of C, denoted by C⊥, is the subspace of vectors which are orthogonal to all codewords of C:

C⊥ = {u ∈ F_q^n | ⟨u, c⟩ = 0 for all c ∈ C}   (1.5)

The dual code C⊥ has dimension n − k; in fact, it is an [n, n − k] linear code, and any generator matrix H of C⊥ satisfies GH^T = 0, where H^T denotes the transposed matrix of H.

If C ⊆ C⊥ then C is called a self-orthogonal code and, if C = C⊥, then C is called a self-dual code.

Definition 6 Let C be an [n, k] linear code. An (n − k) × n generator matrix H of C⊥ is called a parity-check matrix of C.

If G is a generator matrix of C, then G is a parity-check matrix of C⊥. Moreover, x ∈ F_q^n is a codeword of C if and only if xH^T = 0.

The minimum distance of an [n, k] linear code can be determined from a parity-check matrix as follows:

Theorem 2 The minimum distance of an [n, k] code with parity-check matrix H is d if and only if:

• every set of d − 1 columns of H is linearly independent;
• there exist d columns of H which are linearly dependent.

In general it is not difficult to calculate a parity-check matrix for a code, given a generator matrix G. In particular, if the generator matrix G has the special form

G = [ I_{k×k} | A_{k×(n−k)} ]

then a parity-check matrix is

H = [ −A^T_{(n−k)×k} | I_{(n−k)×(n−k)} ]
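Over F_2 this reduces to H = [A^T | I_{n−k}], since −A^T = A^T. A minimal sketch in Python (the matrix G below is the systematic generator of the [5, 2] code of Example 1, assumed for illustration):

```python
def parity_check_from_systematic(G):
    # Given G = [I_k | A] over F2, return H = [A^T | I_{n-k}]
    k, n = len(G), len(G[0])
    A = [row[k:] for row in G]                              # k x (n - k) block
    H = []
    for i in range(n - k):
        row = [A[j][i] for j in range(k)]                   # row i of A^T
        row += [1 if c == i else 0 for c in range(n - k)]   # identity block
        H.append(row)
    return H

G = [[1, 0, 1, 1, 0],
     [0, 1, 1, 0, 1]]
print(parity_check_from_systematic(G))
```

Each row of H is orthogonal to each row of G, so every codeword xG satisfies (xG)H^T = 0.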

Two linear codes C1 and C2 are permutation equivalent if there is a permutation of coordinates which transforms C1 into C2. This permutation can be described by a permutation matrix, a square matrix with exactly one 1 in each row and each column and 0 elsewhere. Thus C1 and C2 are permutation equivalent if there is a permutation matrix P such that G1 is a generator matrix of C1 if and only if G1P is a generator matrix of C2. The effect of applying P to a generator matrix is to rearrange its columns.

Proposition 2 Two linear codes are equivalent if one can be obtained from the other by a combination of

(i) permutation of the columns of the parity-check matrix of the code;
(ii) multiplication of a fixed column by a nonzero scalar.

Remark 1 A generator matrix for a vector space can always be brought to an equivalent reduced row echelon form spanning the same vector space by permutation of its rows, multiplication of a row by a nonzero scalar, or addition of one row to another. Note that any combination of these operations with (i) and (ii) of Proposition 2 generates equivalent linear codes.

1.2.2 Extending codes

We can create longer codes by adding a coordinate. There are many possible ways to extend a code, but the most common is to choose the extension so that, in the binary case, the new code contains only even-weight vectors. If C is an [n, k, d] code over F_q, the extended code Ĉ is defined to be the code

Ĉ = {(x_0, x_1, x_2, ⋯, x_n) ∈ F_q^{n+1} | (x_1, x_2, ⋯, x_n) ∈ C with x_0 + x_1 + ⋯ + x_n = 0}

In fact Ĉ is an [n + 1, k, d̂] code, where d̂ = d or d + 1.

Let G and H be generator and parity-check matrices, respectively, for C. A generator matrix Ĝ for Ĉ can be obtained from G by adding an extra column to G so that the sum of the coordinates of each row of Ĝ is 0. A parity-check matrix for Ĉ is the matrix

Ĥ = [ 0           ]
    [ ⋮     H     ]
    [ 0           ]
    [ 1  1  ⋯  1  ]

This construction is also referred to as adding an overall parity check [52].
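In the binary case the extension simply adds an overall parity bit. A small sketch (the code below is the [5, 2, 3] code of Example 1, which extends to a [6, 2, 4] code):

```python
def extend(code):
    # Prepend x0 = x1 + ... + xn (mod 2), so every extended word has even weight
    return [[sum(c) % 2] + c for c in code]

C = [[0, 0, 0, 0, 0], [0, 1, 1, 0, 1], [1, 0, 1, 1, 0], [1, 1, 0, 1, 1]]
C_ext = extend(C)
print(C_ext)
# every extended codeword has even weight; the minimum distance grows from 3 to 4
```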

1.3 Decoding

We now wish to use the structure of linear codes to aid in their decoding. If c is transmitted and y is received, then the channel noise has the effect of adding to c an error vector e, so that y = c + e, that is, e = y − c ∈ F_q^n. The decoding problem is then: given y, we have to estimate either c or e. The weight of e is the number of positions in which c and y differ.

1.3.1 Standard Array Decoding

Let C be an [n, k] code over F_q. Then the relation defined by

x R y ⟺ x − y ∈ C   (1.6)

is an equivalence relation over F_q^n. For a ∈ F_q^n, the set a + C = {a + c | c ∈ C} is called a coset of C.

Lemma 1 Suppose that a + C is a coset of a linear code C and b ∈ a + C; then b + C = a + C.

Thus F_q^n is just the union of q^{n−k} distinct cosets of a linear [n, k] code C, each containing q^k elements.

Definition 7 A standard array for an [n, k] code over F_q is a q^{n−k} × q^k array where:

• the first row lists all codewords (with the 0 codeword on the extreme left);
• each row is a coset with the coset leader in the first column;
• the entry in the i-th row and j-th column is the sum of the i-th coset leader and the j-th codeword.

From the definition of the error e, we learn that the received vector y and the error pattern e belong to the same coset. While we do not know e ahead of time, the cosets of C can be calculated in advance. In each coset we look for a vector of minimum weight, called a coset leader. Notice that while the minimum weight of a coset is well defined, there may be more than one vector of that weight in the coset, that is, there may be more than one possible coset leader; in that case we choose and fix one of them. The zero vector 0 is always the unique coset leader of the code C itself.

Example 1 For example, the [5, 2] code C = {00000, 01101, 10110, 11011} has the following standard array:

00000 01101 10110 11011
10000 11101 00110 01011
01000 00101 11110 10011
00100 01001 10010 11111
00010 01111 10100 11001
00001 01100 10111 11010
11000 10101 01110 00011
10001 11100 00111 01010

Table 1.1: Standard array of the code C (Example 1)

The leftmost column contains coset leaders of minimum weight, enumerating vectors of weight 1 first and then vectors of weight 2. Note also that each vector of the vector space appears exactly once.

In practice, decoding via a standard array requires large amounts of storage: a binary code of length 32 requires a standard array with 2^32 entries. Other forms of decoding, such as syndrome decoding, are more efficient.
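The standard array of Example 1 can be rebuilt programmatically. The sketch below enumerates F_2^5 in order of increasing weight, so each new uncovered vector becomes a coset leader (ties between equal-weight leaders are broken by enumeration order, so individual leaders may differ from Table 1.1):

```python
def standard_array(code, n):
    # Enumerate F2^n by increasing weight; each vector not yet covered
    # starts a new coset row: leader + all codewords
    vectors = sorted(([int(b) for b in format(i, f"0{n}b")] for i in range(2 ** n)),
                     key=sum)
    seen, rows = set(), []
    for v in vectors:
        if tuple(v) not in seen:
            row = [[(a + b) % 2 for a, b in zip(v, c)] for c in code]
            rows.append(row)
            seen.update(tuple(w) for w in row)
    return rows

C = [[0, 0, 0, 0, 0], [0, 1, 1, 0, 1], [1, 0, 1, 1, 0], [1, 1, 0, 1, 1]]
rows = standard_array(C, 5)
print(len(rows))  # 8 cosets of 4 vectors each, covering all 32 vectors
```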


1.3.2 Syndrome Decoding

In the syndrome decoding method, the dual code and the parity-check matrix are used.

Definition 8 Let C be an [n, k] code with parity-check matrix H. The syndrome of a vector x ∈ F_q^n is the vector s ∈ F_q^{n−k} defined by s = xH^T.

Let H be a parity-check matrix for the [n, k] linear code C. We have already mentioned that a vector c is in the code C if and only if

cH^T = 0   (1.7)

We interpret Equation 1.7 as saying that a received vector y and the corresponding error vector e introduced by the channel have the same syndrome, namely that of the coset to which they both belong. Instead of storing the entire standard array, we only need to store a syndrome dictionary (or syndrome table) containing all possible syndromes {s_1 = 0, ⋯, s_{q^{n−k}}} together with the coset leaders {e_1 = 0, ⋯, e_{q^{n−k}}} such that e_i H^T = s_i. The syndrome table has only q^{n−k} entries, which is smaller than the q^n entries of the standard array.

In decoding, when y is received, first calculate the syndrome s = yH^T. Next look up s in the syndrome dictionary as s = s_i. Finally decode y to ĉ = y − e_i.

Example 2 Consider the [5, 2, 3] code of Example 1, with parity-check matrix

H = [ 1 1 1 0 0 ]
    [ 1 0 0 1 0 ]
    [ 0 1 0 0 1 ]

For instance, the received vector y = (1, 0, 1, 1, 1) has syndrome yH^T = (0, 0, 1). To decode using syndromes we first write out our syndrome dictionary, the first column containing the transposes of all possible nonzero syndromes.

Syndrome   Coset leader
100        00100
010        00010
110        10000
001        00001
101        01000
011        11000
111        10001

Table 1.2: Syndrome table of the code C (Example 2)

We then look up this syndrome in our dictionary, presented in Table 1.2, and discover that the corresponding coset leader is e = (0, 0, 0, 0, 1). We therefore assume that this is the error that occurred and decode y to the codeword

ĉ = y − e = (1, 0, 1, 1, 1) − (0, 0, 0, 0, 1) = (1, 0, 1, 1, 0).
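The whole procedure of Example 2 can be sketched in Python, with the syndrome table stored as a dictionary mapping syndromes to the coset leaders of Table 1.2:

```python
H = [[1, 1, 1, 0, 0],
     [1, 0, 0, 1, 0],
     [0, 1, 0, 0, 1]]

# syndrome -> coset leader (Table 1.2, plus the zero syndrome)
table = {
    (0, 0, 0): [0, 0, 0, 0, 0],
    (1, 0, 0): [0, 0, 1, 0, 0],
    (0, 1, 0): [0, 0, 0, 1, 0],
    (1, 1, 0): [1, 0, 0, 0, 0],
    (0, 0, 1): [0, 0, 0, 0, 1],
    (1, 0, 1): [0, 1, 0, 0, 0],
    (0, 1, 1): [1, 1, 0, 0, 0],
    (1, 1, 1): [1, 0, 0, 0, 1],
}

def syndrome(y):
    # s = yH^T over F2
    return tuple(sum(a * b for a, b in zip(y, row)) % 2 for row in H)

def syndrome_decode(y):
    # subtract (= add, over F2) the coset leader of y's coset
    e = table[syndrome(y)]
    return [(yi + ei) % 2 for yi, ei in zip(y, e)]

print(syndrome_decode([1, 0, 1, 1, 1]))  # [1, 0, 1, 1, 0], as in Example 2
```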

1.4 Reed-Muller Codes

In this section, we introduce the binary Reed-Muller codes, first constructed and explored by Muller [58]; the majority logic decoding algorithm for these codes was described by Reed [64]. There are several ways to describe Reed-Muller codes, and a major advantage of these codes is their relative simplicity of encoding messages and decoding received transmissions [57].

A code of this family is identified by two parameters, usually denoted r and m, respectively called order and number of components. Historically, a Reed-Muller code of order 1 and 5 components, which has 64 words of length 32 and correction capability 7, was used by Mariner 9 to transmit black and white photographs of Mars; see [82].

1.4.1 Recursive definition of RM(r, m)

First we present a recursive definition of these codes:

Definition 9 The r-th order Reed-Muller code of length 2^m, for 0 ≤ r ≤ m, is denoted by RM(r, m) and defined by:

1. RM(0, m) = {00⋯0, 11⋯1} and RM(m, m) = {0, 1}^{2^m} = F_2^{2^m};
2. RM(r, m) = {(x, x + y) | x ∈ RM(r, m − 1), y ∈ RM(r − 1, m − 1)} for 0 < r < m.

So RM(0, m) consists of the all-ones word and the zero word, and RM(m, m) is the set of all words of length 2^m.

This recursive definition of the codes translates into a recursive definition of the generator matrix of RM(r, m), which we will denote by G(r, m), as follows:

G(0, m) = [1 1 ⋯ 1],   G(m, m) = [ G(m − 1, m) ]
                                 [ 0 ⋯ 0 1     ]

and, for 0 < r < m, G(r, m) is defined by

G(r, m) = [ G(r, m − 1)   G(r, m − 1)     ]
          [ 0             G(r − 1, m − 1) ]

An interesting property of the recursive definition is that it makes it easy to prove certain properties of Reed-Muller codes.
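The recursion translates directly into code. A sketch in Python building G(r, m) (rows are returned as lists over F_2; the row ordering follows the recursion, not any canonical form):

```python
def rm_generator(r, m):
    # Recursive generator matrix G(r, m) of the Reed-Muller code RM(r, m)
    if r == 0:
        return [[1] * (2 ** m)]                       # G(0, m) = [1 ... 1]
    if r == m:
        return rm_generator(m - 1, m) + [[0] * (2 ** m - 1) + [1]]
    G1 = rm_generator(r, m - 1)                       # G(r, m-1)
    G2 = rm_generator(r - 1, m - 1)                   # G(r-1, m-1)
    return [row + row for row in G1] + \
           [[0] * (2 ** (m - 1)) + row for row in G2]

G = rm_generator(1, 3)
print(len(G), len(G[0]))  # 4 8: RM(1, 3) is an [8, 4, 4] code
```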

Theorem 3 The Reed-Muller code RM(r, m) has the following properties:

1. length n = 2^m;
2. minimum distance d = 2^{m−r};
3. dimension k = Σ_{i=0}^{r} \binom{m}{i};
4. RM(r, m) ⊆ RM(r + 1, m) for 0 ≤ r < m;
5. RM(r, m)⊥ = RM(m − 1 − r, m) for r < m.

1.4.2 Boolean functions and RM(r, m)

As mentioned above, the Reed-Muller codes are linear [n, k, d] codes with n = 2^m, k = Σ_{i=0}^{r} \binom{m}{i}, and d = 2^{m−r}.

In this section, we give another definition of these codes, using Boolean functions (or binary polynomials), which is more suitable for decoding. With this definition, RM(r, m) codes become close relatives of BCH codes and RS codes, all members of the class of polynomial codes.

Boolean functions

Let m be a positive integer and n = 2^m. A Boolean function f of m variables is a function of the form f : F_2^m → F_2. Let B_m be the set of all Boolean functions from F_2^m to F_2. Every function f ∈ B_m can be identified with the vector (f(u_0), f(u_1), ⋯, f(u_{2^m−1})) ∈ F_2^{2^m}, where u_i is the binary representation of the integer i with digits in reverse order (low-order digit first). The sequence u_0, u_1, ⋯, u_{2^m−1} is the standard ordering of the vectors of F_2^m. Hereafter, we will denote the vector form (f(u_0), f(u_1), ⋯, f(u_{2^m−1})) by f̂.

Example 3 Let f : F_2^2 → F_2 be defined by f(x_0, x_1) = x_0 + x_1. The ordered 2-tuples are (0, 0), (1, 0), (0, 1) and (1, 1), and the vector of length 4 corresponding to the function f is f̂ = (0, 1, 1, 0).
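Under the standard ordering, the vector form f̂ is easy to compute. A sketch (the helper name boolean_to_vector is ours):

```python
def boolean_to_vector(f, m):
    # Evaluate f at u_0, ..., u_{2^m - 1}, where u_i is the binary
    # representation of i with the low-order digit first
    vec = []
    for i in range(2 ** m):
        u = [(i >> j) & 1 for j in range(m)]
        vec.append(f(*u) % 2)
    return vec

# f(x0, x1) = x0 + x1 from Example 3
print(boolean_to_vector(lambda x0, x1: x0 + x1, 2))  # [0, 1, 1, 0]
```

The same helper reproduces Example 4 below: evaluating (x_1 + 1)(x_2 + 1) for m = 3 gives (1, 1, 0, 0, 0, 0, 0, 0).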

Assume that we are given the vector space F_2^{2^m}, and consider the ring R_m = F_2[x_0, x_1, ⋯, x_{m−1}]. There exists a bijection between the elements of R_m and F_2^{2^m} (in fact, an isomorphism of rings between (R_m, +, ∗) and (F_2^{2^m}, +, ∗)).

Definition 10 A Boolean monomial is an element p ∈ R_m of the form

p = x_0^{a_0} x_1^{a_1} ⋯ x_{m−1}^{a_{m−1}}, where a_i ∈ N and i ∈ Z_m.

A Boolean polynomial is simply a linear combination (with coefficients in F_2) of Boolean monomials, and any element of R_m may be thought of as a Boolean polynomial.

Then, if f ∈ B_m, we can write

f(x_0, ⋯, x_{m−1}) = Σ_{(a_0,⋯,a_{m−1}) ∈ F_2^m} C_{(a_0,⋯,a_{m−1})} x_0^{a_0} ⋯ x_{m−1}^{a_{m−1}}

where C_{(a_0,⋯,a_{m−1})} ∈ F_2. The degree of f is defined as

deg(f) = max{ Σ_{i=0}^{m−1} a_i | C_{(a_0,⋯,a_{m−1})} ≠ 0 }.

Properties of Reed-Muller Codes

The Reed-Muller codes can be defined in a simple way in terms of Boolean functions.

Definition 11 The binary Reed-Muller code RM(r, m) of order r and length n = 2^m, for 0 ≤ r ≤ m, is the set of all vectors f̂ where f ∈ B_m has degree at most r.

We define a function f_I that will be used to encode and decode the Reed-Muller codes. Given a subset I ⊆ {0, 1, ⋯, m − 1}, let

f_I(x_0, ⋯, x_{m−1}) = Π_{i∈I} (x_i + 1) = 1 if x_i = 0 for all i ∈ I, and 0 otherwise

(f_I is a Boolean function from F_2^m to {0, 1}). We denote the corresponding vector form of f_I by f̂_I.

Example 4 Let m = 3, so n = 2^3. If I = {1, 2} then f_I(x_0, x_1, x_2) = (x_1 + 1)(x_2 + 1). The vector form of f_{1,2} is obtained by evaluating f_{1,2} at each element (x_0, x_1, x_2) ∈ F_2^3 (in the standard ordering): f_{1,2}(0, 0, 0) = 1, f_{1,2}(1, 0, 0) = 1, f_{1,2}(0, 1, 0) = 0, f_{1,2}(1, 1, 0) = 0, f_{1,2}(0, 0, 1) = 0, f_{1,2}(1, 0, 1) = 0, f_{1,2}(0, 1, 1) = 0 and f_{1,2}(1, 1, 1) = 0; therefore

f̂_{1,2} = (1, 1, 0, 0, 0, 0, 0, 0)

For each u_i ∈ F_2^m, f_I(u_i) f_J(u_i) = f_{I∪J}(u_i), and thus

f̂_I · f̂_J = Σ_{i=0}^{2^m−1} f_I(u_i) f_J(u_i) = Σ_{i=0}^{2^m−1} f_{I∪J}(u_i) = ω_H(f̂_{I∪J}) (mod 2)

where ω_H(x) is the Hamming weight of x.

We will denote the set {0, 1, ⋯, m − 1} by Z_m. The Reed-Muller code RM(r, m) can be defined as the linear code ⟨{f̂_I | I ⊆ Z_m, |I| ≤ r}⟩. We claim that for all r ≤ m, S = {f̂_I | I ⊆ Z_m, |I| ≤ r} is a linearly independent set, and thus a basis for RM(r, m). By counting the number of vectors f̂_I with I ⊆ Z_m and |I| ≤ r, we obtain for RM(r, m):

k = 1 + \binom{m}{1} + \binom{m}{2} + ⋯ + \binom{m}{r}.

Of course the vectors f̂_I can be arranged in any order to form a generator matrix G_{r,m} for RM(r, m). We can recursively construct an RM(r + 1, m + 1) code from an RM(r, m) code and an RM(r + 1, m) code.

Theorem 4 RM(r + 1, m + 1) = {(f̂, f̂ + ĝ) for all f̂ ∈ RM(r + 1, m) and ĝ ∈ RM(r, m)}

Encoding is done, as for any linear code, by multiplying a given message by G_{r,m}. Then any codeword c can be written as a sum

c = Σ_{I⊆Z_m, |I|≤r} v_I f̂_I

where the message digits v_I are labelled to correspond to the rows f̂_I of G_{r,m}.

Decoding Reed-Muller Codes

We shall decode the Reed-Muller codes using an easily implementable process known as majority logic decoding [35]. To understand this, we shall need some preliminary results. For I ⊆ Z_m we set I^c = Z_m \ I, and we let H_I = {u ∈ F_2^m | f_I(u) = 1}, a subspace of F_2^m. From the definition of H_I, we define another function f_{I,t} by f_{I,t}(u) = f_I(u + t), so that f_{I,t}(u) = 1 if and only if u + t ∈ H_I. The value of

f_{I,s}(u) f_{J^c,t}(u) = Π_{i∈I} (x_i + s_i + 1) Π_{j∈J^c} (x_j + t_j + 1)

remains the same for every choice of x_k ∈ {0, 1}, k ∈ Z_m \ (I ∪ J^c). Of course, finding the number of places where f_{I,s}(u) f_{J^c,t}(u) = 1 immediately gives f̂_{I,s} · f̂_{J^c,t}, so we have the following result.

Lemma 2 Let I and J be subsets of Z_m with |I| ≤ |J|. For any s ∈ H_{I^c} and for any t ∈ H_J,

f̂_{I,s} · f̂_{J^c,t} = 1 if and only if I = J.

The following result, which is the basis of the decoding scheme that we shall use, is easily obtained.

Corollary 1 If c is a codeword in RM(r, m) and if |J| = r, then v_J = c · f̂_{J^c,t} for any t ∈ H_J.

Lemma 3 Let J ⊆ Z_m. For any vector e of length 2^m, e · f̂_{J^c,t} = 1 for at most ω_H(e) values of t ∈ H_J.

We can now describe the decoding algorithm as follows. Let y = c + e be a received word, where c = Σ_{I⊆Z_m, |I|≤r} v_I f̂_I is a codeword in RM(r, m). Let J ⊆ Z_m be a set of size r. Then e · f̂_{J^c,t} = 0 for at least |H_J| − ω_H(e) values of t in H_J, and for such values of t we have (from Corollary 1):

y · f̂_{J^c,t} = c · f̂_{J^c,t} + e · f̂_{J^c,t} = c · f̂_{J^c,t} = v_J

So if 2ω_H(e) < |H_J|, as t ranges over the elements of H_J, more than half of the values y · f̂_{J^c,t} will equal v_J.

Once v_J has been calculated in this way for all J ⊆ Z_m with |J| = r, consider

y(r − 1) = y + Σ_{|J|=r} v_J f̂_J

Now y(r − 1) can be decoded by treating it as a received word that was encoded using RM(r − 1, m). This process is continued until v_J has been found for all J ⊆ Z_m with |J| ≤ r.

Before summarizing this algorithm, we note that it corrects all error patterns of weight less than |H_J|/2, where |J| ≤ r. However, |H_J| = ω_H(f̂_J) = 2^{m−|J|}, so all error patterns of weight less than 2^{m−r−1} are corrected, and therefore RM(r, m) has minimum distance at least 2^{m−r}. Moreover, if I ⊆ Z_m and |I| = r, then f̂_I is a codeword in RM(r, m) of weight 2^{m−r}, so the minimum distance is exactly 2^{m−r}.

Algorithm 1 [35] Majority logic decoding of RM(r, m)

Let y be a received word.

1. Let i = r and set y(r) = y.
2. For each J ⊆ Z_m with |J| = i, calculate y(i) · f̂_{J^c,t} for each t ∈ H_J until either 0 or 1 occurs more than 2^{m−i−1} times, and let v_J be 0 or 1, respectively. If both 0 and 1 occur more than 2^{m−r−1} − 1 times, then ask for retransmission.
3. If i > 0, let y(i − 1) = y(i) + Σ_{J⊆Z_m, |J|=i} v_J f̂_J. If y(i − 1) has weight at most 2^{m−r−1} − 1, then set v_J = 0 for all remaining J ⊆ Z_m with |J| ≤ i − 1 and stop. Otherwise replace i with i − 1 and return to step 2. (If i = 0 then v_J has been calculated for all J ⊆ Z_m with |J| ≤ r, so the most likely message has been found.)

1.4.3 First order Reed-Muller codes

The first-order Reed-Muller code RM(1, m) is the subspace of B_m consisting of affine functions. An affine function can be written as

f(x) = u_0 + u_1 x_1 + ⋯ + u_m x_m,   (1.8)

where u_i ∈ F_2 for i = 0, ⋯, m.

This code is used as follows: a word (u_0, u_1, ⋯, u_m) of length k = m + 1 corresponds to the affine function f ∈ RM(1, m) defined by Equation 1.8, and the encoded word is the evaluation vector f̂ of f over the 2^m points of F_2^m in the standard ordering.

Then RM(1, m) is a code of dimension k = m + 1 and length 2^m. The weights of the codewords are easily determined. Indeed, if the affine function f is zero, the weight of the corresponding codeword is 0; if f is the constant function 1, the weight of the corresponding codeword is 2^m.

In all other cases, the set of zeros of f is an affine hyperplane, which therefore has 2^{m−1} points, and the weight of the corresponding codeword is 2^m − 2^{m−1} = 2^{m−1}. The minimum distance of the code is thus d = 2^{m−1}. In conclusion, this code can correct t errors with

t = ⌊(d − 1)/2⌋ = ⌊(2^{m−1} − 1)/2⌋ = 2^{m−2} − 1.

Fast Decoding for RM(1, m)

We briefly present a very efficient decoding method for RM(1, m) codes. First we need to introduce the Kronecker product of matrices: define A ⊗ B = [a_{i,j} B]_{i,j}, which corresponds to replacing every entry a_{i,j} of A by the block a_{i,j} B.

Now we consider a series of matrices defined as

H_m^i = I_{2^{m−i}} ⊗ H ⊗ I_{2^{i−1}} for i = 1, 2, ⋯, m, where

H = [ 1  1 ]
    [ 1 −1 ]

The decoding algorithm for RM(1, m) is given as follows:

Algorithm [57] Fast RM(1, m) Decoding

Suppose y is received and G(1, m) is the generator matrix of the RM(1, m) code.

1. Replace every 0 in y by −1 to obtain ȳ.
2. Compute ȳ_1 = ȳ H_m^1 and ȳ_i = ȳ_{i−1} H_m^i for i = 2, 3, ⋯, m.
3. Find the position j of the component of ȳ_m with the largest absolute value. Let u_j ∈ F_2^m be the binary representation of j (low-order digits first). If the j-th component of ȳ_m is positive, the presumed message is (1, u_j); if it is negative, the presumed message is (0, u_j).

Example 5 Let m = 3, and let G(1, 3) be the generator matrix of RM(1, 3).

If y = (1, 0, 1, 0, 1, 0, 1, 1) is received, convert y to ȳ = (1, −1, 1, −1, 1, −1, 1, 1). Compute:

ȳ_1 = ȳ H_3^1 = (0, 2, 0, 2, 0, 2, 2, 0)
ȳ_2 = ȳ_1 H_3^2 = (0, 4, 0, 0, 2, 2, −2, 2)
ȳ_3 = ȳ_2 H_3^3 = (2, 6, −2, 2, −2, 2, 2, −2)

The component of ȳ_3 with largest absolute value is 6, occurring in position 1. Since u_1 = (1, 0, 0) and 6 > 0, the presumed message is m = (1, 1, 0, 0).
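Each multiplication by H_m^i acts as a butterfly on pairs of entries 2^{i−1} apart, so the whole transform runs in O(n log n) additions. A sketch reproducing Example 5 (the in-place butterfly replaces the explicit Kronecker-product matrices):

```python
def fht_decode_rm1(y, m):
    # Fast decoding of RM(1, m): map bits {0,1} to {-1,+1}, then apply the
    # butterflies equivalent to multiplying by H_m^1, ..., H_m^m
    v = [1 if bit else -1 for bit in y]
    for i in range(m):
        step = 2 ** i
        w = list(v)
        for j in range(len(v)):
            if (j // step) % 2 == 0:
                w[j] = v[j] + v[j + step]   # "sum" slot of the pair
            else:
                w[j] = v[j - step] - v[j]   # "difference" slot of the pair
        v = w
    j = max(range(len(v)), key=lambda k: abs(v[k]))
    u = [(j >> b) & 1 for b in range(m)]    # binary rep. of j, low-order first
    return ([1] if v[j] > 0 else [0]) + u

print(fht_decode_rm1([1, 0, 1, 0, 1, 0, 1, 1], 3))  # [1, 1, 0, 0], as in Example 5
```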

1.5 Cyclic Codes

Cyclic codes are a class of linear error-correcting codes that can be efficiently encoded and decoded through their polynomial representation, which makes them suitable for hardware implementation using linear feedback shift registers (LFSR).


1.5.1 Polynomial Representation

Let C be an [n, k] linear code, and let u and c denote a message and the corresponding codeword in C, respectively. To each codeword c a polynomial c(x) is associated:

c = (c_0, c_1, ⋯, c_{n−1}) ↦ c(x) = c_0 + c_1 x + ⋯ + c_{n−1} x^{n−1}.

The power of the indeterminate x serves to indicate the relative position of an element c_i of c as the term c_i x^i of c(x), for 0 ≤ i < n.

A linear code C is cyclic if and only if every cyclic shift of a codeword is another codeword, that is,

c = (c_0, c_1, ⋯, c_{n−1}) ∈ C ⟺ τ(c) = (c_{n−1}, c_0, ⋯, c_{n−2}) ∈ C

In terms of polynomials, a cyclic shift by one position, denoted τ(c)(x), is accomplished by a multiplication by x modulo (x^n − 1):

c(x) ∈ C ⟺ τ(c)(x) = x c(x) mod (x^n − 1) ∈ C

The generator polynomial

An important property of a cyclic code is that all polynomial codewords c(x) are multiples of a unique polynomial g(x), called the generator polynomial of the code. This polynomial is specified by its roots, which are called the zeros of the code. It can be shown that the generator polynomial g(x) divides (x^n − 1) (a(x) divides b(x) whenever b(x) = q(x)a(x)). Therefore, to find a generator polynomial, the polynomial (x^n − 1) must be factored into its irreducible factors φ_j(x), j = 1, 2, ⋯, l:

(x^n − 1) = φ_1(x) φ_2(x) ⋯ φ_l(x)   (1.9)

As a consequence of the above, the polynomial g(x) is given by

g(x) = Π_{j∈J⊆{1,2,⋯,l}} φ_j(x)   (1.10)

Theorem 5 The dimension of a binary cyclic code of length n with generator polynomial g(x) is given by

k = n − deg(g(x))   (1.11)

where deg(·) denotes the degree of the polynomial.

Theorem 6 Let g(x) = g_0 + g_1 x + ⋯ + g_r x^r be a generator polynomial of a cyclic code C. A generator matrix of C is given by the k × n matrix whose rows are cyclic shifts of the coefficient vector of g(x):

G = [ g_0  g_1  ⋯  g_r  0    ⋯   0   ]
    [ 0    g_0  g_1  ⋯  g_r  ⋯   0   ]
    [ ⋮         ⋱        ⋱        ⋮  ]
    [ 0    ⋯   0    g_0  g_1  ⋯  g_r ]

The parity-check polynomial

Another polynomial h(x), called the parity-check polynomial, can be associated with the parity-check matrix. The generator polynomial and the parity-check polynomial are related by

g(x) h(x) = x^n − 1

The parity-check polynomial can be computed from the generator polynomial as h(x) = (x^n − 1)/g(x) = h_0 + h_1 x + ⋯ + h_k x^k. Then a parity-check matrix for C is obtained by using as rows the first n − k cyclic shifts of the coefficient vector of the reciprocal polynomial h̃(x) = x^k h(1/x) = h_k + h_{k−1} x + ⋯ + h_0 x^k:

H = [ h_k  h_{k−1}  ⋯  h_0  0    ⋯    0   ]
    [ 0    h_k      ⋯  h_1  h_0  ⋯    0   ]
    [ ⋮            ⋱            ⋱     ⋮   ]
    [ 0    ⋯   0    h_k  ⋯    h_1  h_0    ]

Example 6 The parity-check polynomial of the Hamming [7, 4, 3] cyclic code with generator polynomial g(x) = x^3 + x + 1 is h(x) = (x^7 − 1)/(x^3 + x + 1) = x^4 + x^2 + x + 1. A parity-check matrix for this code is

H = [ 1 0 1 1 1 0 0 ]
    [ 0 1 0 1 1 1 0 ]
    [ 0 0 1 0 1 1 1 ]

The dual code of a cyclic code C with generator polynomial g(x) is the cyclic code C⊥ generated by the reciprocal polynomial of h(x).

Encoding of binary cyclic codes

Let u(x) = u_0 + u_1 x + ⋯ + u_{k−1} x^{k−1} denote a polynomial of degree at most k − 1 whose coefficients u_i, i = 0, 1, ⋯, k − 1, are the message bits to be encoded, and let c(x) denote the polynomial codeword in C corresponding to the information polynomial u(x).

The encoding of a binary cyclic code can be expressed in terms of its generator polynomial g(x) as follows:

c(x) = u(x) g(x)   (1.12)

which is a linear combination of the k rows of G.

Since a codeword c(x) must satisfy c(x) h(x) ≡ 0 (mod x^n − 1), a codeword can also be obtained by solving the n − k equations of the matrix equation Hc^T = 0.


Decoding cyclic codes

Let r(x) = c(x) + e(x), where e(x) is the error polynomial associated with an error vector produced after transmission over a BSC channel. Then the syndrome polynomial is defined as

s(x) = r(x) mod g(x) = e(x) mod g(x)   (1.13)

The syndrome polynomial s(x) is used to determine the error polynomial e(x): the decoding problem amounts to finding e(x) from s(x), which is the basis of syndrome decoding.
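A sketch of the syndrome computation (the helper name polymod_gf2 is ours): flipping the x^6 bit of the codeword 1 + x + x^4 + x^6 of the Hamming [7, 4] code leaves the syndrome x^6 mod g(x) = 1 + x^2, exactly as Equation 1.13 predicts:

```python
def polymod_gf2(r, g):
    # Remainder of r(x) divided by g(x) over F2 (low-order-first lists)
    r = list(r)
    dg = len(g) - 1
    for i in range(len(r) - 1, dg - 1, -1):
        if r[i]:                 # cancel the leading term with x^(i-dg) * g(x)
            for j in range(len(g)):
                r[i - dg + j] ^= g[j]
    return r[:dg]

g = [1, 1, 0, 1]                 # g(x) = 1 + x + x^3
r = [1, 1, 0, 0, 1, 0, 0]        # codeword 1 + x + x^4 + x^6 with the x^6 bit flipped
print(polymod_gf2(r, g))         # [1, 0, 1] = 1 + x^2, the syndrome of e = x^6
```

The received word and the error pattern e(x) = x^6 give the same remainder, while any codeword reduces to zero.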

1.6 BCH codes

In this section, we discuss an important and widely used class of cyclic codes that can correct multiple errors, developed by R. C. Bose and D. K. Ray-Chaudhuri [15] and independently by A. Hocquenghem [39], known as Bose-Chaudhuri-Hocquenghem (BCH) codes. The BCH codes are presented as a family of cyclic codes, which gives them an algebraic structure that is useful to simplify their encoding and decoding procedures.

To describe a BCH code, we look at the structure of the finite fields F_q (see [41, 49]).

Definition 12 (Primitive element) Let F_q be a finite field. An element α ∈ F_q is called a primitive element of F_q if every element of F_q^* can be written as a power of α.

Theorem 7 Every finite field has at least one primitive element.

Definition 13 (Minimal polynomial) Given an element α of a field F_{q^m}, the monic polynomial m(x) in F_q[x] of lowest degree with α as a root is called the minimal polynomial of α.

Theorem 8 If f(x) ∈ F_q[x] has α as a root, then f(x) is divisible by the minimal polynomial of α.

Proof: If f(α) = 0, then expressing f(x) = q(x) m(x) + r(x) with deg r(x) < deg m(x), we see that r(α) = 0. By the minimality of deg m(x), r(x) is identically zero.

Definition 14 Let α be a primitive element of F_{q^m}. An [n, k] BCH code of designed distance δ is a cyclic code of length n generated by a polynomial g(x) in F_q[x] of degree n − k that has roots α^b, α^{b+1}, ⋯, α^{b+δ−2}, where b is a fixed integer.

For any positive integer i, let φ_i(x) be the minimal polynomial of α^i. The generator polynomial of the BCH code is defined as the least common multiple

g(x) = lcm{φ_b(x), φ_{b+1}(x), ⋯, φ_{b+δ−2}(x)}   (1.14)

Theorem 9 Let C be a binary BCH code of length n, with designed minimum distance δ = 2t_δ + 1, where t_δ is the correcting capability of C. Then C has dimension k ≥ n − m t_δ and minimum distance d_min ≥ δ.

Example 7 Consider GF(2^4) built with p(x) = x^4 + x + 1, and take δ = 7 and b = 1. Then

g(x) = lcm{φ_1(x), φ_3(x), φ_5(x)}
     = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1)(x^2 + x + 1)
     = x^10 + x^8 + x^5 + x^4 + x^2 + x + 1

generates a triple-error-correcting binary [15, 5, 7] BCH code.
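The product of minimal polynomials in Example 7 only involves F_2[x] arithmetic, so it is easy to verify. A sketch (coefficient lists are low-order first; the helper name polymul_gf2 is ours):

```python
def polymul_gf2(a, b):
    # F2[x] product; coefficient lists are low-order first
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

phi1 = [1, 1, 0, 0, 1]      # x^4 + x + 1
phi3 = [1, 1, 1, 1, 1]      # x^4 + x^3 + x^2 + x + 1
phi5 = [1, 1, 1]            # x^2 + x + 1
g = polymul_gf2(polymul_gf2(phi1, phi3), phi5)
print(g)  # [1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1] = x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
```

Since the three minimal polynomials are distinct and irreducible, their lcm is just their product; deg g = 10, so k = 15 − 10 = 5 as stated.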

Remark 2 A polynomial c(x) belongs to an [n, k] BCH code of designed distance δ if and only if

c(α^i) = 0, b ≤ i ≤ b + δ − 2.   (1.15)

It follows that a codeword c = (c_0, c_1, ⋯, c_{n−1}) satisfies a set of δ − 1 equations, expressed in matrix form as

(c_0, c_1, c_2, ⋯, c_{n−1}) · [ 1            1              ⋯  1                  ]
                              [ α^b          α^{b+1}        ⋯  α^{b+δ−2}          ]
                              [ (α^b)^2      (α^{b+1})^2    ⋯  (α^{b+δ−2})^2      ]
                              [ ⋮            ⋮                  ⋮                  ]
                              [ (α^b)^{n−1}  (α^{b+1})^{n−1} ⋯ (α^{b+δ−2})^{n−1}  ] = 0

Consequently, a quasi parity-check matrix of a binary cyclic BCH code is given by

H = [ 1  α^b        (α^b)^2        ⋯  (α^b)^{n−1}        ]
    [ 1  α^{b+1}    (α^{b+1})^2    ⋯  (α^{b+1})^{n−1}    ]
    [ ⋮  ⋮          ⋮                  ⋮                  ]
    [ 1  α^{b+δ−2}  (α^{b+δ−2})^2  ⋯  (α^{b+δ−2})^{n−1}  ]

This matrix H has the characteristic that every (δ − 1) × (δ − 1) submatrix (formed by an arbitrary set of δ − 1 columns of H) is a Vandermonde matrix. Therefore, any δ − 1 columns of H are linearly independent, from which it follows that the minimum distance of the code is d_min ≥ δ.

1.6.1 Decoding of BCH codes

We now describe the decoding procedure for BCH codes. Suppose that r(x) is received rather than c(x) and that t (≤ t_δ) errors have occurred. Then the error polynomial e(x) = r(x) − c(x) can be written as

e(x) = e_{j_1} x^{j_1} + e_{j_2} x^{j_2} + ⋯ + e_{j_t} x^{j_t},

where the sets {e_{j_1}, e_{j_2}, ⋯, e_{j_t}} and {α^{j_1}, α^{j_2}, ⋯, α^{j_t}} are known as the error values and the error positions, respectively, with e_j ∈ {0, 1} and α ∈ F_{2^m}.

The syndromes are the evaluations of r(x) at each of the zeros of the code, and are equivalent to S = rH^T:

S_1 = r(α^b) = e_{j_1} α^{b j_1} + ⋯ + e_{j_t} α^{b j_t}
S_2 = r(α^{b+1}) = e_{j_1} α^{(b+1) j_1} + ⋯ + e_{j_t} α^{(b+1) j_t}
⋮
S_{δ−1} = r(α^{b+δ−2}) = e_{j_1} α^{(b+δ−2) j_1} + ⋯ + e_{j_t} α^{(b+δ−2) j_t}   (1.16)

Notice from (1.16) that the syndromes satisfy

S_k = r(α^{b+k−1}) = Σ_{i=1}^{t} e_{j_i} (α^{b+k−1})^{j_i} = Σ_{i=1}^{t} e_{j_i} (α^{j_i})^{b+k−1},   (1.17)

for 1 ≤ k ≤ δ − 1. To simplify the notation, for 1 ≤ i ≤ t, let E_i = e_{j_i} and X_i = α^{j_i}. With this notation, (1.16) becomes

S_k = Σ_{i=1}^{t} E_i X_i^{b+k−1}, for 1 ≤ k ≤ δ − 1   (1.18)

which in turn leads to the system of equations:

S_1 = E_1 X_1^b + E_2 X_2^b + ⋯ + E_t X_t^b
S_2 = E_1 X_1^{b+1} + E_2 X_2^{b+1} + ⋯ + E_t X_t^{b+1}
S_3 = E_1 X_1^{b+2} + E_2 X_2^{b+2} + ⋯ + E_t X_t^{b+2}
⋮
S_{δ−1} = E_1 X_1^{b+δ−2} + E_2 X_2^{b+δ−2} + ⋯ + E_t X_t^{b+δ−2}   (1.19)

This system is nonlinear in the X_i, with unknown coefficients E_i. The strategy is to use (1.18) to set up a linear system, involving new variables σ_1, σ_2, ⋯, σ_t, that leads directly to the error position numbers.

Let the error locator polynomial be defined as

σ(x) = Π_{i=1}^{t} (α^{j_i} x − 1) = 1 + σ_1 x + σ_2 x^2 + ⋯ + σ_t x^t   (1.20)

with roots equal to the inverses of the error positions, and thus

σ(X_i^{−1}) = 1 + σ_1 X_i^{−1} + σ_2 X_i^{−2} + ⋯ + σ_t X_i^{−t} = 0, for 1 ≤ i ≤ t.   (1.21)

Multiplying by E_i X_i^{b+k+t−1} produces

E_i X_i^{b+k+t−1} + σ_1 E_i X_i^{b+k+t−2} + ⋯ + σ_t E_i X_i^{b+k−1} = 0, for all 1 ≤ k ≤ t.

Summing over i, for 1 ≤ i ≤ t, yields

Σ_{i=1}^{t} E_i X_i^{b+k+t−1} + σ_1 Σ_{i=1}^{t} E_i X_i^{b+k+t−2} + ⋯ + σ_t Σ_{i=1}^{t} E_i X_i^{b+k−1} = 0,   (1.22)

where these summations are the syndromes obtained in (1.18).

Because t ≤ t_δ, Equation (1.22) becomes

S_{k+t} + σ_1 S_{k+t−1} + ⋯ + σ_t S_k = 0, for 1 ≤ k ≤ t.   (1.23)

Then the following relation between the coefficients of σ(x) and the syndromes holds:

[ S_{t+1} ]   [ S_1  S_2      ⋯  S_t      ] [ σ_t     ]
[ S_{t+2} ] = [ S_2  S_3      ⋯  S_{t+1}  ] [ σ_{t−1} ]
[ ⋮       ]   [ ⋮    ⋮            ⋮       ] [ ⋮       ]
[ S_{2t}  ]   [ S_t  S_{t+1}  ⋯  S_{2t−1} ] [ σ_1     ]   (1.24)

Solving the key equation (1.24) constitutes the most computationally intensive operation in decoding BCH codes. Common methods to solve the key equation are:

• The Peterson-Gorenstein-Zierler (PGZ) algorithm: proposed first by Peterson [60], this method directly finds the coefficients of σ(x) by solving the key equation as a set of linear equations.
• The Berlekamp-Massey algorithm (BMA): invented by Berlekamp [7] and Massey [54], this is a computationally efficient method to solve the key equation in terms of the number of operations in GF(2^m). The BMA is a popular choice to simulate or implement BCH decoders in software.
• The Euclidean algorithm (EA): this method solves the key equation in polynomial form; it was introduced in [77] and further studied in [53]. Due to its regular structure, the EA is widely used in hardware implementations of BCH and RS decoders.

The Peterson-Gorenstein-Zierler Decoding Algorithm

Once the key equation (1.24) has been solved, σ(x) can been determined. However, deter-mining σ(x) is complicated by the fact that we do not know the actual number of errors t, and hence we do not know the size of the system involved. Therefore, a guess has to be made as to the actual number of errors t, in the received word.

The decoder assumes that the maximum number of errors has occurred, tmax = tδ, and computes the determinant ∆i for i = tmax = tδ:
\[
\Delta_i = \det
\begin{pmatrix}
S_1 & S_2 & \cdots & S_i \\
S_2 & S_3 & \cdots & S_{i+1} \\
\vdots & \vdots & \ddots & \vdots \\
S_i & S_{i+1} & \cdots & S_{2i-1}
\end{pmatrix}
\]

It then checks whether ∆i = 0; if so, a smaller number of errors must have occurred, and the value of i is decreased by one before the determinant is recomputed.


Otherwise (∆i ≠ 0), the inverse of the syndrome matrix is computed and the values of σ_1, σ_2, · · · , σ_t are found, where t = i. In the event that ∆i = 0 for all i = 1, 2, · · · , tδ, decoding is unsuccessful and an uncorrectable error pattern has been detected.

Algorithm [57] (The PGZ decoding algorithm for BCH codes)

1. Compute the syndromes S_k = r(α^{b+k−1}) for 1 ≤ k ≤ δ − 1.

2. In the order i = tδ, i = tδ − 1, · · · , decide whether ∆i = 0, stopping at the first value of i where ∆i ≠ 0. Set t = i and solve (1.16) to determine σ(x).

3. Find the roots of σ(x) by computing σ(α^j) for 1 ≤ j < n. Invert the roots to get the error positions X_i.

4. Solve the first t equations of (1.19) to obtain the error values E_i.
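The steps above can be sketched for the special case of a code correcting at most two errors over GF(2^4) with primitive polynomial x^4 + x + 1 (our assumptions, matching the running example). The 2×2 key equation is solved by Cramer's rule, and the roots of σ(x) are found by trying every field element, as in step 3; all names are illustrative, not the thesis's implementation:

```python
# GF(2^4) tables, built from x^4 + x + 1.
EXP, LOG = [0] * 30, [0] * 16
x = 1
for i in range(15):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x10:
        x ^= 0x13
for i in range(15, 30):
    EXP[i] = EXP[i - 15]

def mul(a, b):
    return EXP[LOG[a] + LOG[b]] if a and b else 0

def div(a, b):
    return EXP[LOG[a] - LOG[b] + 15] if a else 0

def pgz_two_errors(S):
    """S = [S_1, S_2, S_3, S_4] as field elements.  Returns (sigma_1, sigma_2)
    of sigma(x) = 1 + sigma_1 x + sigma_2 x^2, assuming exactly two errors."""
    S1, S2, S3, S4 = S
    delta2 = mul(S1, S3) ^ mul(S2, S2)        # det of the 2x2 syndrome matrix
    if delta2 == 0:
        raise ValueError("fewer than two errors; reduce the system size")
    # Cramer's rule on (1.24) for t = 2; in characteristic 2, + is XOR.
    sigma2 = div(mul(S3, S3) ^ mul(S2, S4), delta2)
    sigma1 = div(mul(S1, S4) ^ mul(S2, S3), delta2)
    return sigma1, sigma2

def error_positions(sigma1, sigma2, n=15):
    """Try sigma(alpha^j) for 0 <= j < n; each root alpha^j is inverted
    into the error position number n - j (mod n)."""
    roots = [j for j in range(n)
             if 1 ^ mul(sigma1, EXP[j]) ^ mul(sigma2, EXP[2 * j]) == 0]
    return sorted((n - j) % n for j in roots)
```

Running it on the syndromes of Example 8, S = [α^14, α^13, α^14, α^11], returns σ_1 = α^14, σ_2 = α^6 and the error positions {10, 11}, in agreement with the worked example.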

Example 8 Let C be the triple-error-correcting [15, 5] BCH code of designed distance δ = 7, which has generator polynomial g(x) = x^{10} + x^8 + x^5 + x^4 + x^2 + x + 1. Let r(x) = x + x^4 + x^5 + x^7 + x^9 + x^{12} be the polynomial associated with the vector r = c + e received after transmission of a codeword c over a BSC channel. The syndromes are: S_1 = α^{14}, S_2 = α^{13}, S_3 = α^{14}, S_4 = α^{11}, S_5 = 1 and S_6 = α^{13}.

To find the error locator polynomial, first assume that i = tδ = 3 errors occurred. The determinant ∆_3 turns out to be zero, so i is decreased to 2; the first nonzero determinant, ∆_2, is computed as follows:
\[
\Delta_2 = \det
\begin{pmatrix}
\alpha^{14} & \alpha^{13} \\
\alpha^{13} & \alpha^{14}
\end{pmatrix}
= \alpha^{13} + \alpha^{11} = \alpha^{4}.
\]

Therefore ∆_2 ≠ 0 and two errors are assumed to have occurred. Substituting the syndromes into the key equation (1.24),
\[
\begin{pmatrix}
\alpha^{14} & \alpha^{13} \\
\alpha^{13} & \alpha^{14}
\end{pmatrix}
\begin{pmatrix}
\sigma_2 \\ \sigma_1
\end{pmatrix}
=
\begin{pmatrix}
\alpha^{14} \\ \alpha^{11}
\end{pmatrix} \tag{1.25}
\]
Note that
\[
M_2 =
\begin{pmatrix}
\alpha^{14} & \alpha^{13} \\
\alpha^{13} & \alpha^{14}
\end{pmatrix}.
\]
Then the solution to (1.25) is found to be:
\[
\begin{pmatrix}
\sigma_2 \\ \sigma_1
\end{pmatrix}
= M_2^{-1}
\begin{pmatrix}
\alpha^{14} \\ \alpha^{11}
\end{pmatrix}
=
\begin{pmatrix}
\alpha^{6} \\ \alpha^{14}
\end{pmatrix}
\]

It follows that σ(x) = 1 + α^{14}x + α^6x^2. Step (3) yields the roots α^5 and α^4 of σ(x), and hence the error position numbers X_1 = α^{10} and X_2 = α^{11}. The error vector is e(x) = x^{10} + x^{11}, and the transmitted codeword is c(x) = x + x^4 + x^5 + x^7 + x^9 + x^{10} + x^{11} + x^{12}, that is, (x + x^2)g(x).
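The syndromes quoted in Example 8 can be re-derived directly from r(x) by evaluating it at the powers of α. The short check below is ours; it assumes GF(2^4) is built from the primitive polynomial x^4 + x + 1, which is consistent with the arithmetic shown in the example:

```python
# Antilog table for GF(2^4) built from x^4 + x + 1.
EXP = [0] * 15            # EXP[i] = alpha^i as a 4-bit integer
x = 1
for i in range(15):
    EXP[i] = x
    x <<= 1
    if x & 0x10:          # reduce modulo x^4 + x + 1
        x ^= 0x13

def poly_eval(exponents, k):
    """Evaluate r(alpha^k), where `exponents` lists the monomials of r(x)."""
    acc = 0
    for e in exponents:
        acc ^= EXP[(e * k) % 15]   # addition in GF(2^4) is XOR
    return acc

r = [1, 4, 5, 7, 9, 12]   # r(x) = x + x^4 + x^5 + x^7 + x^9 + x^12
syndromes = [poly_eval(r, k) for k in range(1, 7)]
# syndromes == [EXP[14], EXP[13], EXP[14], EXP[11], 1, EXP[13]],
# i.e. S_1 = alpha^14, ..., S_5 = 1, S_6 = alpha^13, as in Example 8.
```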

Figures and tables referenced in this chapter:
Figure 1.1: Systematic block encoding for error correction
Figure 1.2: Simplified model of a data transmission system
Table 1.2: Syndrome table of the code C (Example 2)
Figure 1.3: Quadrature Phase-Shift Keying
