Performance analysis of modulation with diversity - A combinatorial approach II : Bijective methods


HAL Id: hal-00018552

https://hal.archives-ouvertes.fr/hal-00018552

Submitted on 7 Feb 2006



Daniel Krob, Ekaterina A. Vassilieva

To cite this version:

Daniel Krob, Ekaterina A. Vassilieva. Performance analysis of modulation with diversity - A combinatorial approach II : Bijective methods. Discrete Applied Mathematics, Elsevier, 2005, 145 (3), pp.403-421. ⟨hal-00018552⟩


Performance evaluation of demodulation with diversity – A combinatorial approach II: Bijective methods

D. Krob∗, E.A. Vassilieva†

Abstract

This paper is devoted to the presentation of a combinatorial approach for analyzing the performance of a generic family of demodulation methods used in mobile telecommunications. We show that a fundamental formula in this context is in fact highly connected with a slight modification of a very classical bijection of Knuth between pairs of Young tableaux of conjugate shapes and {0,1}-matrices. These considerations allowed us to obtain the first explicit expressions for several important specializations of the performance evaluation formula that we studied.

1 Introduction

Modulating a numeric signal means transforming it into a waveform. Modulation is therefore a technique of major interest in a number of engineering domains such as computer networks, mobile communications and satellite transmissions. Due to their practical importance, modulation methods have been widely studied in signal processing. The classical textbook of Proakis devotes for instance a full chapter to this subject (cf. Chapter 5 of [13]). One should also point out that one of the most important problems in this area is to evaluate the performance characteristics of the optimum receivers associated with a given modulation method, which reduces to the computation of various error probabilities (see again Chapter 5 of [13]).

Among the different families of modulation protocols used in practice, an important class consists of methods where the modulation reference (i.e. a fixed digital sequence) is also modulated and transmitted. In this kind of situation, the demodulation decision needs to take into account several pieces of noisy information (the transmitted signal, the transmitted reference, but also copies of these two signals). It turns out that the error probability appearing in such contexts very often leads to the computation of the following type of probability:

\[
P(U < V) = P\Big( U = \sum_{i=1}^{N} |u_i|^2 \;<\; V = \sum_{i=1}^{N} |v_i|^2 \Big), \tag{1}
\]

where the u_i's and v_i's stand for independent centered complex Gaussian random variables with variances denoted by E[|u_i|²] = χ_i and E[|v_i|²] = δ_i for every i ∈ [1, N] (see also Section 3.2).

∗ Corresponding author: LIX (CNRS) - Ecole Polytechnique - Route de Saclay - 91128 Palaiseau Cedex – France – e-mail: dk@lix.polytechnique.fr

† LIX (CNRS) - Ecole Polytechnique - Route de Saclay - 91128 Palaiseau Cedex – France – e-mail: katya@lix.polytechnique.fr


The problem of explicitly computing this last probability was hence studied by several researchers from signal processing (cf. [2, 9, 13, 16]). The most interesting result in this direction was obtained by Barrett (cf. [2]), who proved that the probability defined by (1) is equal to

\[
P(U < V) = \sum_{k=1}^{N} \Bigg( \prod_{j \neq k} \frac{1}{1 - \delta_k^{-1}\delta_j} \; \prod_{j=1}^{N} \frac{1}{1 + \delta_k^{-1}\chi_j} \Bigg). \tag{2}
\]

This last formula can also be described in a purely combinatorial way, using Young tableaux (cf. Section 3.2). This new approach has already led to the first practical method for computing the probability P(U < V) that is both algorithmically efficient and numerically stable (cf. Section 3.2 or [5, 6]). In this paper, we continue the combinatorial study of Barrett’s formula by connecting it with a very classical bijection of Knuth (cf. Section A.4.3 of [7] or [10]) between pairs of Young tableaux of conjugate shapes and {0,1}-matrices. These considerations allowed us in particular to get the first explicit expressions for several specializations of formula (2) (cf. Section 6).

2 Background

2.1 Partitions and Young tableaux

A partition is a finite nondecreasing sequence λ = (λ₁, λ₂, . . . , λ_m) of positive integers. The number m of elements of λ is called the length of the partition λ. One can represent each such partition λ by a Ferrers diagram of shape λ, that is to say by a diagram of λ₁ + . . . + λ_m boxes whose i-th row contains exactly λ_i boxes for every 1 ≤ i ≤ m. The Ferrers diagram associated with the partition λ = (2,2,4) is for instance given below.

□ □
□ □
□ □ □ □

The conjugate partition λ̃ of a given partition λ is the partition obtained by reading the heights of the columns of the Ferrers diagram associated with λ. For instance, for the partition λ = (2,2,4) of the above figure, we have λ̃ = (1,1,3,3).

When λ is a partition whose Ferrers diagram is contained in the square represented by the partition N^N = (N, . . . , N) with N rows of length N, one can also define the complementary partition λ̄ of λ, which is the conjugate of the partition ν whose Ferrers diagram is the complement (read from bottom to top) of the Ferrers diagram of λ in the square (N^N). Note that this definition is relative to a given size N and that the square does not have to be the smallest one containing λ. For instance, for N = 6 and λ = (1,1,2,3), we have ν = (3,4,5,5,6,6) and λ̄ = (2,4,5,6,6,6) (see Figure 1).

Let A be a totally ordered alphabet. A tabloid of shape λ over A is a filling of the boxes of a Ferrers diagram of shape λ with letters of A. A tabloid is called a Young tableau when its rows and its columns consist respectively of non decreasing and strictly increasing sequences of letters of A. One can see below a Young tableau of shape (2,2,4) over A = {a₁ < . . . < a₅}.

a₃ a₅
a₂ a₂
a₁ a₁ a₁ a₄

Figure 1: Two complementary partitions: λ = (1,1,2,3) and λ̄ = (2,4,5,6,6,6).

One associates with any Young tableau T over A the monomial A^T which is the product of all letters of A that occur in the different boxes of T. One has for instance A^T = a₁³a₂²a₃a₄a₅ for T the Young tableau of the last example. The Schur function s_λ(A) associated with the partition λ is then defined as the sum of all monomials A^T for T running over all Young tableaux of shape λ. We recall that the Schur functions are symmetric polynomials that form a linear basis of the algebra of symmetric polynomials over A (cf. Section 1.3 of [12]).

2.2 Knuth’s bijection

Knuth’s bijection is a famous one-to-one correspondence between {0,1}-matrices and pairs of Young tableaux of conjugate shapes (cf. [10]). It is based on the column insertion process, which is a classical combinatorial construction that we present now. Let A be a totally ordered alphabet. The fundamental step of the column insertion process associates with a letter a ∈ A and a Young tableau T over A a new Young tableau T(a) over A defined as follows.

1. If a is strictly larger than all the entries of the first column of T, the tableau T(a) is obtained by putting a in a new box at the top of the first column of T.

2. Otherwise one can consider the smallest entry b of the first column of T which is greater than or equal to a. The tableau T(a) is then obtained by replacing b by a and by applying recursively our insertion scheme, starting now by trying to insert b in the second column of T. Our process continues until a replaced entry can go at the top of the next column or until it becomes the only entry of a new column.

One can easily check that T(a) is always a Young tableau. Moreover our process can be reverted if one knows which new box it created. Let now w = a₁ . . . a_N be a word over A. The result of the column insertion process applied to w is the Young tableau obtained by column inserting successively a₁, . . . , a_N as described above, starting from the empty Young tableau.
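The two insertion rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the storage convention (a tableau as a list of columns, each column a strictly increasing list read from bottom to top) and the names `column_insert` and `insert_word` are assumptions made here, not notation from the paper.

```python
from bisect import bisect_left

def column_insert(columns, a):
    """Insert the letter a into a tableau stored as a list of columns.

    Each column is a strictly increasing list read from bottom to top.
    The tableau is modified in place; the function returns the position
    (column index, row index) of the box that was created.
    """
    for c, column in enumerate(columns):
        k = bisect_left(column, a)      # index of the smallest entry >= a
        if k == len(column):            # rule 1: a exceeds every entry
            column.append(a)
            return c, k
        column[k], a = a, column[k]     # rule 2: a replaces b, b moves on
    columns.append([a])                 # the bumped letter opens a new column
    return len(columns) - 1, 0

def insert_word(word):
    """Column-insert the letters of word one after the other."""
    columns = []
    for letter in word:
        column_insert(columns, letter)
    return columns

print(insert_word([3, 1, 2, 3]))   # [[1, 2, 3], [3]]
```

Returning the position of the created box is a design choice of this sketch: it is the only extra information needed to revert an insertion, as noted in the text.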

Note 2.1 The Young tableau which is obtained by applying the column insertion process to a word w = a₁ . . . a_N over A is the same as the tableau obtained by applying the row insertion process (i.e. Schensted’s algorithm) to its mirror image w̃ = a_N . . . a₁ (see [7] for more details).

We are now in a position to present Knuth’s construction. Let M be a matrix from the set M_{N×N}({0,1}) of square {0,1}-matrices of order N. Knuth’s bijection associates with M a pair (P, Q) of Young tableaux with conjugate shapes over the alphabet [1, N] as described below.

1. Construct the 2-row array A_N which results from listing the N² pairs (i, j) of [1, N]×[1, N] in lexicographic order, i.e.

\[
A_N = \begin{pmatrix} 1 & \dots & 1 & 2 & \dots & 2 & \dots & N & \dots & N \\ 1 & \dots & N & 1 & \dots & N & \dots & 1 & \dots & N \end{pmatrix}.
\]

2. Take in this array all the entries corresponding to the 1’s of M in order to get an array

\[
A(M) = \begin{pmatrix} u_1 & u_2 & \dots & u_r \\ v_1 & v_2 & \dots & v_r \end{pmatrix}.
\]

3. Form the word w₁(M) = v₁ . . . v_r obtained by reading from left to right the bottom entries (the entries of the second row) of A(M). The column insertion process applied to w₁(M) gives the Young tableau P.

4. Form finally the second Young tableau Q by placing for every i ∈ [1, r] the i-th element u_i of the first row of A(M) in the box which is conjugate to the i-th box created during the column insertion process that led to P.

By reversing the steps of the described construction, we can recover the array A(M) (and hence our matrix M) from the pair (P, Q). We find the box in which Q has the largest entry; if there are several equal entries, the box that is farthest to the right is selected. Then we perform the reverse column insertion on P starting with the conjugate of the selected box and remove the selected box from Q. We obtain a new pair of Young tableaux with conjugate shapes and repeat the same procedure until we get two empty Young tableaux.
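The forward direction of this construction (steps 1 to 4) can be sketched as follows. The function names and the list-of-columns storage (columns read from bottom to top) are assumptions of this sketch; the matrix used in the demonstration is the one of Example 2.2.

```python
from bisect import bisect_left

def column_insert(columns, a):
    """Column-insert a; return (column index, row index) of the new box."""
    for c, column in enumerate(columns):
        k = bisect_left(column, a)          # smallest entry >= a
        if k == len(column):
            column.append(a)
            return c, k
        column[k], a = a, column[k]         # bump the replaced entry
    columns.append([a])
    return len(columns) - 1, 0

def knuth(matrix):
    """Return the pair (P, Q) attached to a square {0,1}-matrix."""
    n = len(matrix)
    # Array A(M): the pairs (i, j) with M[i][j] = 1, in lexicographic order.
    pairs = [(i + 1, j + 1) for i in range(n) for j in range(n) if matrix[i][j]]
    P, Q = [], []
    for u, v in pairs:
        col, row = column_insert(P, v)      # bottom entry goes into P
        # The conjugate box lies in column `row` of Q, at height `col`.
        if row == len(Q):
            Q.append([])
        Q[row].append(u)                    # top entry is recorded in Q
    return P, Q

M = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 1]]
P, Q = knuth(M)
print(P)   # [[1, 2, 3], [3]]
print(Q)   # [[1, 2], [3], [3]]
```

Read row by row from the top, these column lists give the tableaux 3 / 2 / 1 3 for P and 2 / 1 3 3 for Q.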

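Example 2.2 Let us consider the matrix

\[
M = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}.
\]

Then the arrays A₃ and A(M) are respectively equal to

\[
A_3 = \begin{pmatrix} 1 & 1 & \boxed{1} & \boxed{2} & 2 & 2 & 3 & \boxed{3} & \boxed{3} \\ 1 & 2 & \boxed{3} & \boxed{1} & 2 & 3 & 1 & \boxed{2} & \boxed{3} \end{pmatrix}
\quad\text{and}\quad
A(M) = \begin{pmatrix} 1 & 2 & 3 & 3 \\ 3 & 1 & 2 & 3 \end{pmatrix}
\]

where in A₃ we boxed the entries corresponding to the 1’s of M. Thus w₁(M) = (3,1,2,3).

Knuth’s bijection associates with M the following pair of Young tableaux of conjugate shapes:

\[
(P, Q) = \left( \begin{matrix} 3 & \\ 2 & \\ 1 & 3 \end{matrix} \;,\; \begin{matrix} 2 & & \\ 1 & 3 & 3 \end{matrix} \right).
\]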

We now present a variant of Knuth’s bijection that we will need in the sequel (see also Section A.4.3 of [7]). Let M be again a matrix of M_{N×N}({0,1}). One can associate with M a new pair (R, S) of Young tableaux with conjugate shapes over the alphabet [1, N] which is constructed as follows.

1. Construct first the 2-row array Ã_N which is equal to the sequence of the N² pairs (i, j) of [1, N]×[1, N] taken in the following order:

\[
\tilde{A}_N = \begin{pmatrix} N & \dots & N & \dots & 2 & \dots & 2 & 1 & \dots & 1 \\ 1 & \dots & N & \dots & 1 & \dots & N & 1 & \dots & N \end{pmatrix}.
\]

2. Take in this array all the entries corresponding to the 1’s of M. We get an array Ã(M).

3. Form the word w̃₁(M) obtained by reading from left to right the bottom entries of Ã(M). The column insertion process applied to w̃₁(M) gives the Young tableau R.

4. Form finally the second Young tableau S from R and Ã(M) using the conjugate sliding process (see Section A.4.3 of [7] for more details) applied to w̃₁(M). We now briefly describe this process. To this purpose, let us first set

\[
\tilde{A}(M) = \begin{pmatrix} \tilde{u}_1 & \dots & \tilde{u}_r \\ \tilde{v}_1 & \dots & \tilde{v}_r \end{pmatrix}.
\]

To construct the second Young tableau S, we apply the following procedure.

• Start with one single box containing ũ₁.

• For each i in [2, r], apply successively the following rules.

– Add an empty box conjugate to the i-th box that appears during the construction of the Young tableau R and slide there the greater of its two neighbours to the left or below. If the two neighbours have the same entry, the one below is chosen. If there is only one neighbour, it is chosen by abuse of terminology. This creates a new empty box. This sliding process continues until the empty box is the first one of the first column.

– Put in this box the following entry ũ_i.

At the end of this procedure, we get the Young tableau S. The tableaux R and S are then of conjugate shape.

The following symmetry result then connects the two previous constructions (cf. Section A.4.3 of [7] for more details).

Theorem 2.3 (Knuth; [10]) Let M be a matrix of M_{N×N}({0,1}) and let ᵗM be the transpose of M. Let (P, Q) be the result of Knuth’s bijection applied to M and let (R, S) be the result of the process described above applied to ᵗM. Then one has (P, Q) = (S, R).

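Example 2.4 Take again the matrix M of Example 2.2. The transpose of M is

\[
{}^tM = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix}.
\]

Then the arrays Ã₃ and Ã(ᵗM) are respectively given by

\[
\tilde{A}_3 = \begin{pmatrix} \boxed{3} & 3 & \boxed{3} & 2 & 2 & \boxed{2} & 1 & \boxed{1} & 1 \\ \boxed{1} & 2 & \boxed{3} & 1 & 2 & \boxed{3} & 1 & \boxed{2} & 3 \end{pmatrix}
\quad\text{and}\quad
\tilde{A}({}^tM) = \begin{pmatrix} 3 & 3 & 2 & 1 \\ 1 & 3 & 3 & 2 \end{pmatrix}
\]

where in Ã₃ we boxed the entries corresponding to the 1’s of ᵗM. Thus w̃₁(ᵗM) = (1,3,3,2). The above construction associates with ᵗM the following pair (R, S) of Young tableaux:

\[
(R, S) = \left( \begin{matrix} 2 & & \\ 1 & 3 & 3 \end{matrix} \;,\; \begin{matrix} 3 & \\ 2 & \\ 1 & 3 \end{matrix} \right) = (Q, P).
\]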


2.3 Plactic relations

The column insertion process can also be described algebraically by the plactic formalism developed by Lascoux and Schützenberger (cf. [11]) that we will now present. Let A be a totally ordered alphabet. The plactic monoid is the monoid constructed over A and subject to the following relations (discovered by Knuth (cf. [10])):

\[
\begin{cases}
aba \equiv baa, \quad bba \equiv bab, & \text{for every } a < b \in A, \\
acb \equiv cab, \quad bca \equiv bac, & \text{for every } a < b < c \in A.
\end{cases}
\]

Two words over A are identified under the plactic relations if and only if the Young tableaux obtained by applying the column insertion process to their mirror images are equal (cf. [7, 11]).

We now present an important property of the plactic monoid that we will use in the sequel.

Let T be a Young tableau over A. One can associate with T a word w(T) over A by reading the columns of T from top to bottom and left to right. The words associated with Young tableaux in such a way are called tableau words. For instance the tableau word associated with the Young tableau T at the end of Section 2.1 is w(T) = a₃a₂a₁a₅a₂a₁a₁a₄. Note that applying the column insertion to the mirror image of a tableau word w(T) yields the tableau T. Observe also (see Section 2.1 of [7] or [11]) that a word w over A is equivalent with respect to the plactic relations to a unique tableau word (which is therefore associated with the Young tableau given by the column insertion process applied to the mirror image of w).
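The statements of this section lend themselves to a quick machine check. The sketch below encodes the letter a_i as the integer i and stores a tableau as a list of columns read from bottom to top; these conventions and the names `tableau_word`, `tableau_of` and `plactic_equivalent` are assumptions of the sketch. It reads the tableau word of the tableau of Section 2.1, recovers the tableau from the mirror image of that word, and tests plactic equivalence by comparing tableaux.

```python
from bisect import bisect_left

def column_insert(columns, a):
    """Column-insert the letter a into a tableau (list of columns)."""
    for column in columns:
        k = bisect_left(column, a)          # smallest entry >= a
        if k == len(column):
            column.append(a)
            return
        column[k], a = a, column[k]         # bump the replaced entry
    columns.append([a])

def tableau_word(columns):
    """Read the columns from top to bottom and from left to right."""
    return [x for column in columns for x in reversed(column)]

def tableau_of(word):
    """Column-insert the mirror image of word."""
    columns = []
    for letter in reversed(word):
        column_insert(columns, letter)
    return columns

def plactic_equivalent(u, v):
    """Two words are equivalent when their tableaux coincide."""
    return tableau_of(u) == tableau_of(v)

# Tableau of Section 2.1, columns listed from bottom to top.
T = [[1, 2, 3], [1, 2, 5], [1], [4]]
w = tableau_word(T)
print(w)                      # [3, 2, 1, 5, 2, 1, 1, 4]
print(tableau_of(w) == T)     # True

# The four plactic relations with a = 1, b = 2, c = 3.
print(plactic_equivalent([1, 2, 1], [2, 1, 1]))   # True
print(plactic_equivalent([1, 3, 2], [3, 1, 2]))   # True
```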

3 Performance analysis of demodulation protocols

3.1 Demodulation with diversity

Our initial motivation for studying Barrett’s formula came from mobile communications. The probability P(U < V) given by formula (1) indeed appears naturally in the performance analysis of demodulation methods based on diversity, which are standard in such a context. In order to motivate our paper more strongly, we first present this situation in detail.

We consider a model where one transmits a piece of information b ∈ {−1,+1} on a noisy channel¹. A reference r = 1 is also sent on the noisy channel at the same time as b. We assume that we receive N pairs (x_i(b), r_i)_{1≤i≤N} ∈ (ℂ×ℂ)^N of data (the x_i(b)’s) and references (the r_i’s)² that have the following form

\[
\begin{cases}
x_i(b) = a_i\, b + \nu_i & \text{for every } 1 \le i \le N, \\
r_i = a_i \sqrt{\beta_i} + \nu'_i & \text{for every } 1 \le i \le N,
\end{cases}
\]

where a_i ∈ ℂ is a complex number that models the channel fading associated with x_i(b)³, where β_i ∈ ℝ⁺ is a positive real number that represents the excess of signal to noise ratio (SNR) which is available for the reference r_i⁴ and where ν_i ∈ ℂ and ν′_i ∈ ℂ finally denote two independent complex white Gaussian noises. We also assume that every a_i is a complex random variable distributed according to a centered Gaussian density of variance α_i for every i ∈ [1, N].

1 This situation corresponds to Binary Phase Shift Keying (BPSK).

2 This situation corresponds to spatial diversity, i.e. when more than one antenna is available, but also to multipath reflection contexts. These two types of situations typically occur in mobile communications.

3 Fading is typically the result of the absorption of the signal by buildings. Its complex nature comes from the fact that it models both an attenuation (its modulus) and a dephasing (its argument).

4 This number β_i is usually greater than or equal to 1. In practice however, one often takes β_i = 1.


According to these assumptions, all observables of our model, i.e. the pairs (x_i(b), r_i)_{1≤i≤N}, are complex Gaussian random variables. We finally also assume that these N observables are N independent random variables which take their values in ℂ². Under these hypotheses it is proved in [4] that

\[
\log \frac{P(b = +1 \mid X)}{P(b = -1 \mid X)} = \sum_{i=1}^{N} \frac{4\alpha_i\sqrt{\beta_i}}{1 + \alpha_i(\beta_i + 1)}\, (x_i(b) \mid r_i) \tag{3}
\]

with X = (x_i(b), r_i)_{1≤i≤N} and where (·|·) denotes the Hermitian scalar product. The demodulation decision is based on the associated Bayesian criterion. One indeed decides that b was equal to 1 (resp. to −1) when the right-hand side of Formula (3) is positive (resp. negative).

Intuitively this means that one decides that the value b = 1 was sent when the x_i(b)’s are more or less globally in the same direction as the r_i’s. Figure 2 illustrates the case N = 1 and one can see that a noisy reference r has a positive (resp. negative) Hermitian scalar product with a noisy information x when x corresponds to a small perturbation of 1 (resp. −1).
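The decision rule built on Formula (3) can be written down directly. The sketch below is illustrative only: it assumes that the scalar product (x|r) of two complex numbers is taken as the real part of x times the conjugate of r (the reading under which the sum in (3) is real), and the names `decision_statistic` and `decide` are hypothetical.

```python
import math

def decision_statistic(xs, rs, alphas, betas):
    """Right-hand side of Formula (3) for received data xs and references rs."""
    total = 0.0
    for x, r, alpha, beta in zip(xs, rs, alphas, betas):
        weight = 4 * alpha * math.sqrt(beta) / (1 + alpha * (beta + 1))
        # Assumption: (x | r) is read as Re(x * conj(r)).
        total += weight * (x * r.conjugate()).real
    return total

def decide(xs, rs, alphas, betas):
    """Return +1 when the statistic is positive and -1 otherwise."""
    return 1 if decision_statistic(xs, rs, alphas, betas) > 0 else -1

# One branch (N = 1): data close to +1 or to -1, reference close to 1.
print(decide([0.9 + 0.2j], [1.1 - 0.1j], [1.0], [1.0]))    # 1
print(decide([-0.8 + 0.1j], [1.0 + 0.2j], [1.0], [1.0]))   # -1
```

A statistic exactly equal to zero is mapped to −1 here, which is an arbitrary tie-breaking choice of the sketch.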

Figure 2: Two possible noisy bits x(1) and x(−1) and a noisy reference r in the case N = 1.

The bit error probability (BER) of our model is the probability that the value b = 1 was decoded as −1, i.e. the probability that one had

\[
\sum_{i=1}^{N} \frac{4\alpha_i\sqrt{\beta_i}}{1 + \alpha_i(\beta_i + 1)}\, (x_i(1) \mid r_i) < 0 .
\]

Using the parallelogram identity, it is now easy to rewrite this last probability as

\[
P\Big( \sum_{i=1}^{N} |u_i|^2 - \sum_{i=1}^{N} |v_i|^2 < 0 \Big)
\]

where u_i and v_i denote for every i ∈ [1, N] the two variables defined by setting

\[
u_i = \Big( \frac{\alpha_i\sqrt{\beta_i}}{1 + \alpha_i(\beta_i + 1)} \Big)^{1/2} (x_i(1) + r_i)
\quad\text{and}\quad
v_i = \Big( \frac{\alpha_i\sqrt{\beta_i}}{1 + \alpha_i(\beta_i + 1)} \Big)^{1/2} (x_i(1) - r_i) .
\]

Our various hypotheses then immediately imply that the u_i’s and the v_i’s are independent complex Gaussian random variables. Hence the performance analysis of our model relies exactly on Barrett’s Formula (2) as already indicated in the introduction of our paper (cf. Formula (1)).


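It is also interesting to point out the explicit relation between the values of χ_i and δ_i appearing in Barrett’s formula and the α_i and β_i, which is the following:

\[
\chi_i = \frac{2\,\alpha_i(\beta_i + 1)}{1 + \alpha_i(\beta_i + 1)} \Big( \sqrt{\Delta_i} + \alpha_i\sqrt{\beta_i} \Big)
\quad\text{and}\quad
\delta_i = \frac{2\,\alpha_i(\beta_i + 1)}{1 + \alpha_i(\beta_i + 1)} \Big( \sqrt{\Delta_i} - \alpha_i\sqrt{\beta_i} \Big)
\]

with ∆_i = (α_i + 1)(α_iβ_i + 1).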

3.2 Barrett’s formula

As we saw in the last section, Barrett’s formula is connected with the performance analysis of demodulation methods based on diversity. More generally, the performance analysis of many other practical digital transmission systems is based on the computation of the probability that a given Hermitian quadratic form q in complex centered Gaussian variables is negative. Numerous examples of such situations can be found for instance in Proakis’s standard textbook (cf. [13]).

The problem of computing such a probability was therefore addressed by several researchers from signal processing. A first formula for this probability was derived by Turin (cf. [16]) and used later by Barrett (cf. [2]) who expressed it as a rational function of the eigenvalues of the covariance matrix associated with q. Formula (2) appears then as a special case of this more general result of Barrett. Alternate methods based either on contour integration or on algebraic manipulations (as in [9] or in annex B of [13]) provide other approaches that involve numerical quadrature of trigonometric functions.

However, all these methods lead to algorithms that are not numerically stable due to the presence of artificial singularities such as the situation δ_i = δ_j in Barrett’s formula (2)⁵. The first efficient and stable method for computing the probability P(U < V) defined by (1) was obtained by Dornstetter, Krob and Thibon (cf. [5]) using techniques from the theory of symmetric functions (cf. [6]). For the sake of completeness, we recall below their algorithm.

• Step 1. Consider the two polynomials X(z) and ∆(z) of ℝ[z] defined by setting

\[
X(z) = \prod_{i=1}^{N} (1 - \chi_i z) \quad\text{and}\quad \Delta(z) = \prod_{i=1}^{N} (1 + \delta_i z) .
\]

• Step 2. Compute the unique polynomial π(z) of ℝ[z] of degree d(π) ≤ N−1 such that

\[
\pi(z)\, X(z) + \mu(z)\, \Delta(z) = 1
\]

where µ(z) stands for some polynomial of ℝ[z] of degree d(µ) ≤ N−1.

• Step 3. Evaluate π(0) = P(U < V).

The efficiency and the numerical stability of this algorithm come from the fact that the second step of the above method can be realized by the classical generalized Euclidean algorithm which has the two above mentioned properties.
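The three steps can be tried out on small inputs with exact rational arithmetic. The sketch below is not the implementation of [5]: instead of the generalized Euclidean algorithm named in the text, it solves the Bézout identity of Step 2 as a linear system in the 2N unknown coefficients of π and µ, which is simpler to write but less efficient. All function names are assumptions of this sketch, and the χ_i and δ_i are assumed positive so that X(z) and ∆(z) are coprime. Barrett's formula (2) is included for comparison.

```python
from fractions import Fraction

def poly_from_factors(coeffs):
    """Coefficients (lowest degree first) of the product of the (1 + c z)."""
    poly = [Fraction(1)]
    for c in coeffs:
        nxt = poly + [Fraction(0)]
        for k, p in enumerate(poly):
            nxt[k + 1] += c * p
        poly = nxt
    return poly

def solve(A, b):
    """Gauss-Jordan elimination over the rationals (A square, invertible)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        pivot = M[col][col]
        M[col] = [x / pivot for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def prob_bezout(chi, delta):
    """Steps 1 to 3: return pi(0), where pi X + mu Delta = 1."""
    n = len(chi)
    X = poly_from_factors([-Fraction(c) for c in chi])
    D = poly_from_factors([Fraction(d) for d in delta])
    size = 2 * n
    A = [[Fraction(0)] * size for _ in range(size)]
    for j in range(n):               # column j: z^j X(z); column n + j: z^j Delta(z)
        for k in range(n + 1):
            A[j + k][j] = X[k]
            A[j + k][n + j] = D[k]
    rhs = [Fraction(1)] + [Fraction(0)] * (size - 1)
    return solve(A, rhs)[0]          # constant coefficient of pi

def prob_barrett(chi, delta):
    """Formula (2); it divides by zero when two delta's coincide."""
    total = Fraction(0)
    for k, dk in enumerate(delta):
        term = Fraction(1)
        for j, dj in enumerate(delta):
            if j != k:
                term /= 1 - Fraction(dj) / dk
        for cj in chi:
            term /= 1 + Fraction(cj) / dk
        total += term
    return total

print(prob_bezout([1, 2], [3, 5]))    # 683/840
print(prob_barrett([1, 2], [3, 5]))   # 683/840
print(prob_bezout([1, 2], [3, 3]))    # 297/400
```

The last call uses two equal δ's: the Bézout identity is still solvable there, whereas `prob_barrett` would raise a `ZeroDivisionError`, which illustrates the artificial singularity mentioned in the text.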

5 These last singularities typically create numerical problems in the context of demodulation with diversity described in Section 3.1 where one must deal both with χ_i’s and δ_i’s that are very close to each other.


3.3 The combinatorial version of Barrett’s formula

Using Barrett’s formula, it was proved (cf. [6]) that Formula (1) reduces to

\[
P(U < V) = \frac{F(\chi, \delta)}{\prod_{1 \le i,j \le N} (\chi_i + \delta_j)} \tag{4}
\]

where F(χ, δ) denotes the symmetric (with respect to the χ_i’s and the δ_j’s) polynomial

\[
F(\chi, \delta) = \sum_{\lambda \subseteq (N^{N-1})} s_{\overline{(\lambda, N)}}(\{\chi_1, \dots, \chi_N\})\; s_{(\lambda, N)}(\{\delta_1, \dots, \delta_N\}) , \tag{5}
\]

$\overline{(\lambda, N)}$ representing the complement of the partition (λ, N) in the square N^N (cf. [6]). Recall that s_λ(X) denotes the Schur function associated with the partition λ over the alphabet X, which is equal by definition to the sum of all monomials X^T, defined as the product of all variables of X involved in T, for T running over all possible Young tableaux of shape λ.

From this combinatorial interpretation of a Schur function, it follows that the monomials involved in the right hand side of equation (5) are exactly the monomials obtained by taking the product of the elements of all square tableaux of shape (N^N) consisting of two Young tableaux of complementary shapes (cf. Figure 1 of Section 2.1) that respect the two constraints:

• Condition B1: the first Young tableau is only filled by variables that belong to the ordered alphabet δ = {δ₁ < . . . < δ_N} and the length of its first row is equal to N,

• Condition B2: the second Young tableau is only filled by variables that belong to the ordered alphabet χ = {χ₁ < . . . < χ_N}.

A typical example of such a combinatorial structure is given in Figure 3. The first tableau is written there in the usual way. On the other hand, the second tableau is organized differently: its rows (resp. its columns) are placed from top to bottom (resp. from right to left) in the space corresponding to the complement of the first tableau within the square (N^N). Note finally that the tableaux formed in such a way are examples of the so-called (k, l)-semi-standard tableaux in the sense of Remmel (see [14] or [15]).

χ₆ χ₅ χ₄ χ₃ χ₂ χ₁
δ₅ χ₆ χ₅ χ₄ χ₂ χ₁
δ₄ δ₅ δ₆ χ₄ χ₃ χ₂
δ₃ δ₃ δ₅ χ₅ χ₄ χ₃
δ₂ δ₂ δ₃ δ₄ χ₄ χ₃
δ₁ δ₁ δ₂ δ₂ δ₂ δ₃

Figure 3: A typical example of complementary fillings of a square tableau.

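Example 3.1 For N = 2, Barrett’s formula reduces to

\[
P(U < V) = \frac{\chi_1\chi_2\,(\delta_1^2 + \delta_1\delta_2 + \delta_2^2) + (\chi_1 + \chi_2)\,(\delta_1^2\delta_2 + \delta_1\delta_2^2) + \delta_1^2\delta_2^2}{(\chi_1 + \delta_1)\,(\chi_1 + \delta_2)\,(\chi_2 + \delta_1)\,(\chi_2 + \delta_2)}
\]

and one can check that the eight monomials occurring in the numerator of this last expression are exactly the products of the entries of the following eight combinatorial structures:

\[
\begin{matrix} \chi_1 & \chi_2 \\ \delta_1 & \delta_1 \end{matrix}\;,\quad
\begin{matrix} \chi_1 & \chi_2 \\ \delta_1 & \delta_2 \end{matrix}\;,\quad
\begin{matrix} \chi_1 & \chi_2 \\ \delta_2 & \delta_2 \end{matrix}\;,\quad
\begin{matrix} \delta_2 & \chi_1 \\ \delta_1 & \delta_1 \end{matrix}\;,\quad
\begin{matrix} \delta_2 & \chi_2 \\ \delta_1 & \delta_1 \end{matrix}\;,\quad
\begin{matrix} \delta_2 & \chi_1 \\ \delta_1 & \delta_2 \end{matrix}\;,\quad
\begin{matrix} \delta_2 & \chi_2 \\ \delta_1 & \delta_2 \end{matrix}\;,\quad
\begin{matrix} \delta_2 & \delta_2 \\ \delta_1 & \delta_1 \end{matrix}\;.
\]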

The complexity of Formula (4) is O(N²γ_N) where γ_N denotes the number of monomials involved in its numerator or equivalently the number of square tableaux of shape (N^N) filled as in the typical example of Figure 3. Unfortunately γ_N = 2^{N²−1} (see below), from which it follows that Formula (4) is impractical when N grows. This combinatorial formula is however not useless since it leads to efficient expressions for several interesting specializations of Barrett’s formula (cf. Section 6). Formula (4) can also be reformulated in terms of the algorithm given at the end of Section 3.2, which is both practically efficient (its complexity is quadratic, as for Barrett’s formula) and numerically stable as already stated (cf. [5, 6]).

Proposition 3.2 The number γ_N of square tableaux of shape (N^N) filled by two complementary Young tableaux satisfying conditions B1 and B2 is equal to γ_N = 2^{N²−1}.

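Proof – Let us denote by P(t) the polynomial of ℕ[t] that results from the substitution in the numerator F(χ, δ) (of the right-hand side of (4)) of χ_i and δ_i by tⁱ for every i ∈ [1, N]. Then a combination of Formulas (2) and (4) yields:

\[
P(t) = \Bigg( \prod_{1 \le i,j \le N} (t^i + t^j) \Bigg) \sum_{k=1}^{N} \Bigg( \prod_{1 \le j \neq k \le N} \frac{1}{1 - t^{j-k}} \; \prod_{j=1}^{N} \frac{1}{1 + t^{j-k}} \Bigg) . \tag{6}
\]

Note now that the special case z = 0 in the following partial fraction expansion

\[
\frac{1}{\prod_{i=1}^{N} (1 - t^i z) \, \prod_{i=1}^{N} (1 + t^i z)}
= \sum_{k=1}^{N} \frac{1}{1 - t^k z} \Bigg( \prod_{\substack{j=1 \\ j \neq k}}^{N} \frac{1}{1 - t^{j-k}} \; \prod_{j=1}^{N} \frac{1}{1 + t^{j-k}} \Bigg)
+ \sum_{k=1}^{N} \frac{1}{1 + t^k z} \Bigg( \prod_{j=1}^{N} \frac{1}{1 + t^{j-k}} \; \prod_{\substack{j=1 \\ j \neq k}}^{N} \frac{1}{1 - t^{j-k}} \Bigg)
\]

leads immediately to the identity

\[
\sum_{k=1}^{N} \Bigg( \prod_{1 \le j \neq k \le N} \frac{1}{1 - t^{j-k}} \; \prod_{j=1}^{N} \frac{1}{1 + t^{j-k}} \Bigg) = \frac{1}{2} \tag{7}
\]

from which we deduce that one has

\[
P(t) = \frac{1}{2} \prod_{1 \le i,j \le N} (t^i + t^j) .
\]

One can now immediately conclude that γ_N = P(1) = 2^{N²−1}, which completes our proof.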

Let us finally add that the proof of Proposition 3.2 is purely analytic. A bijective proof that better explains this result will be given later (see Section 5).
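Proposition 3.2 can be checked numerically for small N by counting the fillings shape by shape, following Formula (5). The sketch below makes two assumptions that are not part of the paper: partitions are written in nonincreasing order (the opposite of the convention of Section 2.1, which does not change the counts), and the number of Young tableaux of a given shape over N letters is obtained from the classical product formula for Schur functions evaluated at 1. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def tableau_count(shape, n):
    """Number of Young tableaux of the given shape with entries in [1, n].

    shape is a nonincreasing list with at most n parts; the count is the
    product over i < j of (mu_i - mu_j + j - i) / (j - i).
    """
    mu = list(shape) + [0] * (n - len(shape))
    result = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            result *= Fraction(mu[i] - mu[j] + j - i, j - i)
    return int(result)

def conjugate(shape, n):
    """Column heights of a Ferrers diagram contained in the n x n square."""
    return [sum(1 for part in shape if part >= j) for j in range(1, n + 1)]

def gamma(n):
    """Count the fillings satisfying conditions B1 and B2."""
    total = 0
    # lam runs over the partitions contained in (N^(N-1)).
    for lam in combinations_with_replacement(range(n, -1, -1), n - 1):
        mu = [n] + list(lam)                        # shape (lambda, N), filled with delta's
        nu = [n - part for part in reversed(mu)]    # complement inside the square
        comp = conjugate(nu, n)                     # complementary shape, filled with chi's
        total += tableau_count(mu, n) * tableau_count(comp, n)
    return total

print([gamma(n) for n in range(1, 5)])   # [1, 8, 256, 32768]
```

For N = 2 the three shapes contribute 3, 4 and 1 fillings, which are the eight structures of Example 3.1.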


4 Column words and their complements

This section is devoted to the presentation of several combinatorial results of independent interest concerning column words that will be used in the sequel in our paper.

4.1 Column words

Let A be a totally ordered alphabet. We recall that a column (of length k) over A is just a Young tableau of shape 1^k = (1, . . . ,1) over A. If P = {p₁ < . . . < p_r} is a subset of A, we denote by [P] or by [p_r, . . . , p₁] the unique column of length r filled with all letters of P. We also denote by C(A) = {[P], P ⊂ A} the set of all columns over A (including the empty column denoted by [ ]). A word over the alphabet C(A) is then said to be a column word.

It is important to note that one can associate with every Young tableau T over A a column word [T] which is just the concatenation of the columns of T (considered now as letters of C(A)) read from left to right. The column word associated with the Young tableau given at the end of Section 2.1 is for instance equal to [a₃a₂a₁] [a₅a₂a₁] [a₁] [a₄].

The column words associated with Young tableaux can be characterized using the partial order ⪯ on the alphabet C(A) which is defined as follows. Let P = {p₁ < . . . < p_r} and Q = {q₁ < . . . < q_s} be two subsets of A. Then one says that [P] ⪯ [Q] if one has s ≤ r and p_i ≤ q_i for every 1 ≤ i ≤ s. In other words, the column [P] is less than or equal to the column [Q] if and only if one gets a Young tableau when putting [Q] at the right of [P]. It is then easy to see that each column word associated with a Young tableau is a non decreasing column word (with respect to ⪯) and that conversely each non decreasing column word encodes a Young tableau⁶.

Example 4.1 Let A = {a < b < c}. Then one has C(A) = {[ ], [a], [b], [c], [ba], [ca], [cb], [cba]} and the associated partial order is given by the Hasse diagram below, where each edge joins a column to a column that covers it (smaller columns are on the left of horizontal edges and at the top of vertical edges).

[cba] - [ba] - [ca] - [cb]
                |      |
               [a]  - [b] - [c] - [ ]

We need one further notation. If P = {p₁ < . . . < p_r} and Q = {q₁ < . . . < q_s} are two subsets of A such that p_r < q₁, then we denote by [Q, P] the unique column of length r+s which is filled by all the elements of P and Q.
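The partial order on columns defined in this section is easy to explore by machine. In the sketch below a column is represented by the set of its letters; the function names `leq`, `all_columns` and `covers` are assumptions of this sketch. It recomputes the covering relations of the order for A = {a < b < c}, i.e. the edges of the Hasse diagram of Example 4.1.

```python
from itertools import combinations

def leq(P, Q):
    """Column order: [P] <= [Q] iff |Q| <= |P| and p_i <= q_i for all i <= |Q|."""
    p, q = sorted(P), sorted(Q)
    return len(q) <= len(p) and all(a <= b for a, b in zip(p, q))

def all_columns(alphabet):
    """All columns over the alphabet, the empty column included."""
    return [frozenset(c)
            for r in range(len(alphabet) + 1)
            for c in combinations(alphabet, r)]

def covers(alphabet):
    """Pairs (x, y) with x < y and no column strictly in between."""
    cols = all_columns(alphabet)

    def lt(x, y):
        return x != y and leq(x, y)

    return {(x, y) for x in cols for y in cols
            if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in cols)}

def show(column):
    """Print a column with its letters in decreasing order, e.g. [cba]."""
    return "[" + "".join(sorted(column, reverse=True)) + "]"

for x, y in sorted(covers("abc"), key=lambda e: (show(e[0]), show(e[1]))):
    print(show(x), "<", show(y))
```

The script prints eight covering relations: the two horizontal chains of three edges each, and the two vertical edges [ca] < [a] and [cb] < [b].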

4.2 Complement of a column word

Let P and Q be two subsets of a totally ordered alphabet A. The column [P] is said to be the complement (within A) of the column [Q], denoted by $\overline{[Q]}$, if and only if one has P = A\Q.

More generally the complement of a column word [w] = [P₁] . . . [P_n] ∈ C(A)^∗ is the column word $\overline{[w]}$ defined by setting $\overline{[w]} = \overline{[P_n]} \dots \overline{[P_1]}$.

We can now give the following important result that will be used in the next section when we will deal with complementation of Young tableaux.

Proposition 4.2 The complement of a non decreasing column word is a non decreasing column word.

6 This last encoding is however not one-to-one due to the fact that the empty column is allowed in the context of column words.


Proof – The proof of our result is based on the two following lemmas, whose proofs can be easily made by suitable inductions and are left to the reader.

Lemma 4.3 Let P and Q be two subsets of A such that each element of P is strictly less than each element of Q. Then one has $\overline{[P]} \preceq \overline{[Q, P]}$.

Lemma 4.4 Let P and Q be two subsets of A of the same cardinality such that [P] ⪯ [Q]. Then for the complements of the columns constructed on P and Q, one has $\overline{[Q]} \preceq \overline{[P]}$.

Let P and Q be two subsets of A such that each element of P is strictly less than each element of Q. Let also R be another subset of A of the same cardinality as P. Suppose finally that the inequality [Q, P] ⪯ [R] holds. Then as an immediate consequence of the two last lemmas, one has $\overline{[R]} \preceq \overline{[Q, P]}$. This ends the proof of our result.

Note 4.5 Observe that Proposition 4.2 shows that the complement $\overline{[T]}$ of the column word [T] associated with a Young tableau T over A also naturally encodes a Young tableau called the complement (within A) of T (see [11] p. 140).
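Proposition 4.2 can be verified exhaustively on a small alphabet. The sketch below represents a column by the set of its letters and a column word by a tuple of such sets; the function names are assumptions of this sketch. It checks that every non decreasing column word of length at most 3 over A = {1, 2, 3} has a non decreasing complement.

```python
from itertools import combinations, product

def leq(P, Q):
    """Column order: [P] <= [Q] iff |Q| <= |P| and p_i <= q_i for all i <= |Q|."""
    p, q = sorted(P), sorted(Q)
    return len(q) <= len(p) and all(a <= b for a, b in zip(p, q))

def is_nondecreasing(word):
    """A column word is non decreasing when consecutive columns are ordered."""
    return all(leq(x, y) for x, y in zip(word, word[1:]))

def complement(word, alphabet):
    """Complement every column within the alphabet and reverse their order."""
    full = frozenset(alphabet)
    return tuple(full - column for column in reversed(word))

def check_proposition(alphabet, max_length):
    """Test Proposition 4.2 on all column words of length <= max_length."""
    cols = [frozenset(c)
            for r in range(len(alphabet) + 1)
            for c in combinations(alphabet, r)]
    for length in range(max_length + 1):
        for word in product(cols, repeat=length):
            if is_nondecreasing(word) and not is_nondecreasing(
                    complement(word, alphabet)):
                return False
    return True

print(check_proposition((1, 2, 3), 3))   # True
```

Such a check does not replace the proof, but it is a quick safeguard against misreading the direction of the order or of the reversal in the definition of the complement.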

4.3 Complementation and plactic equivalence

Let A be a totally ordered alphabet. Let us then denote by π the natural projection of C(A)^∗ onto A^∗, i.e. the morphism defined by setting π([P]) = p_r . . . p₁ for every subset P = {p₁ < . . . < p_r} of A. The following result (that can be seen as an extension of Property 3.4 of [11]) gives a simple condition for the plactic equivalence ≡ to be preserved under complementation. Recall that the length of a column word [u] is its number of columns (not the number of letters in the word π([u])!).

Theorem 4.6 Let [u] and [v] be two column words of C(A)^∗ that have the same length. Then one has π([u]) ≡ π([v]) if and only if one has $\pi(\overline{[u]}) \equiv \pi(\overline{[v]})$.

Proof – Let us first prove two lemmas that correspond to two special cases of our theorem (i.e. the situations where [u] = [ ] [Q, p] and [v] = [Q] [p] for the first lemma, and where all columns involved in [u] and [v] are reduced to letters of A for the second lemma).

Lemma 4.7 Let p be an element of A and let Q be a subset of A such that p is strictly less than each element of Q. Then one has $\pi(\overline{[p]}\,\overline{[Q]}) \equiv \pi(\overline{[Q, p]}\,[A])$.

Proof – Let ω(A) be the maximal element of the totally ordered alphabet A and let q⁺ denote the immediate successor in A of any element q ∈ A distinct from ω(A). The interpretation of the plactic equivalence in terms of the column insertion process then allows us to show that one has $q'\,\pi(\overline{[q]}) \equiv \pi(\overline{[q]})\,q'$ when q′ ≠ q and $q\,\pi(\overline{[q]}) \equiv \pi(\overline{[q^+]})\,q^+$ when q ≠ ω(A). We are now in a position to prove that one has

\[
\pi(\overline{[p]}\,\overline{[Q]}) \equiv \pi(\overline{[Q, p]}\,[A]) .
\]

We will start from its left-hand side and use the first of the two identities established above to bring all the elements $\{q \in \overline{[Q]},\ q > p\}$ to the left with respect to $\overline{[p]}$. In a similar way, we will use the second identity to move all the elements $\{q \in \overline{[Q]},\ q \le p\}$ to the left with respect to $\overline{[p]}$. This leads to the desired relation and ends the proof.


Lemma 4.8 Let a₁ . . . a_n ∈ A^∗ and b₁ . . . b_n ∈ A^∗ be two equivalent (with respect to the plactic relations) words over A. Then one has $\pi(\overline{[a_n]} \dots \overline{[a_1]}) \equiv \pi(\overline{[b_n]} \dots \overline{[b_1]})$.

Proof – This lemma follows immediately from the fact that the plactic relations are stable under complementation as shown in the proof of Property 3.4 of [11].

Let now [u] = [P₁] . . . [P_N] and [v] = [Q₁] . . . [Q_N] be two column words of C(A)^∗ of the same length N such that π([u]) = a₁ . . . a_n and π([v]) = b₁ . . . b_n are plactically equivalent words of A^∗. Let us denote by p_i the number of letters of A involved in the column P_i for every 1 ≤ i ≤ N. Using the fact that the identity $\pi([\,]^{p_1-1}[P_1] \cdots [\,]^{p_N-1}[P_N]) \equiv \pi([a_1] \cdots [a_n])$ can be obtained by using only relations of the type [ ] [Q, p] ≡ [Q] [p], one can now deduce from Lemma 4.7 and from the fact that π([A]) commutes with every letter of A that one has

$$\pi(\overline{[u]}\,[A]^{n-N}) \equiv \pi(\overline{[P_N]}\,[A]^{p_N-1} \cdots \overline{[P_1]}\,[A]^{p_1-1}) \equiv \pi(\overline{[a_n]} \cdots \overline{[a_1]}).$$

Symmetrically one can also prove that $\pi(\overline{[v]}\,[A]^{n-N}) \equiv \pi(\overline{[b_n]} \cdots \overline{[b_1]})$. Using Lemma 4.8, one deduces from these relations that $\pi(\overline{[u]}\,[A]^{n-N}) \equiv \pi(\overline{[v]}\,[A]^{n-N})$, from which it is immediate to obtain that $\pi(\overline{[u]}) \equiv \pi(\overline{[v]})$ according to the interpretation of the plactic equivalence in terms of the column insertion process.

Let us denote by $\tilde{\pi}$ the mirror projection of C(A)^∗ onto A^∗, i.e. the anti-morphism⁷ defined by setting $\tilde{\pi}([P]) = p_1 \dots p_r$ for every subset P = {p₁ < . . . < p_r} of A. We are now in a position to show how the column insertion process acts with respect to complementation of column words.
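The two projections can be illustrated by a minimal Python sketch. The representation of a column word as a list of sets and the function names `pi` and `pi_mirror` are assumptions made only for this illustration.

```python
def pi(column_word):
    """Natural projection: each column is read in decreasing order and
    the columns are concatenated from left to right (morphism)."""
    word = []
    for column in column_word:
        word.extend(sorted(column, reverse=True))
    return word


def pi_mirror(column_word):
    """Mirror projection: each column is read in increasing order and
    the columns are concatenated from right to left (anti-morphism)."""
    word = []
    for column in reversed(column_word):
        word.extend(sorted(column))
    return word


w = [{3}, {1, 2}, {3}]   # the column word [3][21][3]
print(pi(w))             # [3, 2, 1, 3]
print(pi_mirror(w))      # [3, 1, 2, 3]
```

With these definitions the mirror projection of a column word is simply the mirror image of its natural projection.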

Corollary 4.9 Let [w] be a column word of C(A)^∗ of length n, let T be the Young tableau obtained by applying the column insertion process to $\tilde{\pi}([w])$ and let m be the number of columns of T. Then the Young tableau obtained by applying the column insertion process to $\tilde{\pi}(\overline{[w]})$ is the Young tableau naturally associated with the non-decreasing column word $[A]^{n-m}\,\overline{[T]}$.

Proof – Observe first that one must have m ≤ n due to the structure of the column insertion process. Our result follows immediately from Theorem 4.6 applied to the two column words [w] and $[T]\,[\,]^{n-m}$ and from the basic properties of the plactic equivalence (cf. Section 2.3).

Example 4.10 Let us take A = {1, 2, 3} and [w] = [3][21][3]. Then we get π([w]) = 3213 and $\tilde{\pi}([w]) = 3123$. Applying the column insertion process to $\tilde{\pi}([w])$, we get the Young tableau

$$T = \begin{array}{cc} 3 & \\ 2 & \\ 1 & 3 \end{array}.$$

Hence we have m = 2 and n − m = 1. Observe that $\overline{[w]} = [21][3][21]$ and $\tilde{\pi}(\overline{[w]}) = 12312$. Applying the column bumping process to $\tilde{\pi}(\overline{[w]})$, we get the Young tableau

$$[A]\,\overline{[T]} = \begin{array}{cc} 3 & \\ 2 & 2 \\ 1 & 1 \end{array}.$$
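Example 4.10 can be replayed with a short Python sketch. Two conventions are assumptions here, since both are defined outside this section: the column insertion process is taken to be the usual one (letters read from left to right, a letter bumps the smallest entry greater than or equal to it), and complementation is taken to complement each column and reverse the order of the columns. Both choices reproduce the tableaux of the example. All function names are hypothetical.

```python
import bisect


def column_insert(word):
    """Column insertion of the letters of `word`, read from left to right.
    A tableau is a list of columns, each strictly increasing from bottom
    to top. A letter x bumps the smallest entry >= x (assumed rule)."""
    columns = []
    for x in word:
        for column in columns:
            k = bisect.bisect_left(column, x)   # index of first entry >= x
            if k == len(column):
                column.append(x)                # x sits on top of the column
                x = None
                break
            column[k], x = x, column[k]         # bump, carry to next column
        if x is not None:
            columns.append([x])                 # open a new column
    return columns


def complement(column_word, alphabet):
    """Assumed convention: complement every column, reverse their order."""
    return [sorted(set(alphabet) - set(c)) for c in reversed(column_word)]


def pi_mirror(column_word):
    """Mirror projection: increasing columns, read from right to left."""
    return [x for c in reversed(column_word) for x in sorted(c)]


A = [1, 2, 3]
w = [[3], [1, 2], [3]]                         # [3][21][3], length n = 3
n = len(w)
T = column_insert(pi_mirror(w))                # columns of T
m = len(T)
padded = T + [[] for _ in range(n - m)]        # [T] followed by n - m empty columns
expected = [c for c in complement(padded, A) if c]
result = column_insert(pi_mirror(complement(w, A)))
print(T, expected, result)
```

Here `expected` is the column word predicted by Corollary 4.9 and `result` is the tableau computed directly from the complemented word; on this example the two coincide.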

7 That is to say a mapping $\tilde{\pi}$ satisfying $\tilde{\pi}([P_1] \cdots [P_n]) = \tilde{\pi}([P_n]) \cdots \tilde{\pi}([P_1])$ for every [P₁] . . . [P_n] ∈ C(A)^∗.

5 A bijective proof of Proposition 3.2

Proposition 3.2 gave us the number γ_N of monomials involved in F(χ, δ) in a purely analytic way. In particular, its proof did not provide any insight, either into the structure of F(χ, δ) or into the remarkable simplicity of the identity $\gamma_N = 2^{N^2-1}$. This section is devoted to the construction of a bijective proof that explains this result more deeply. This bijection will also help us study a number of specializations of Barrett’s formula of practical interest (see Section 6).

5.1 A more general combinatorial structure

Let us first introduce a natural generalization of the combinatorial structures that appeared in Section 3.3, that is to say the set T_N of all square tableaux of shape (N^N) divided, as in that section, into two complementary Young tableaux (but without any constraint on them) filled by elements of the alphabets δ and χ, respectively. The two Young tableaux that form an element of T_N will again be organized as already depicted in Section 3.3. The following picture shows two typical examples of elements of T_6.

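$$\begin{array}{cccccc}
\chi_6 & \chi_5 & \chi_4 & \chi_3 & \chi_2 & \chi_1 \\
\delta_5 & \chi_6 & \chi_5 & \chi_4 & \chi_2 & \chi_1 \\
\delta_4 & \delta_5 & \delta_6 & \chi_5 & \chi_2 & \chi_1 \\
\delta_3 & \delta_3 & \delta_4 & \chi_6 & \chi_2 & \chi_1 \\
\delta_2 & \delta_2 & \delta_2 & \delta_2 & \chi_2 & \chi_1 \\
\delta_1 & \delta_1 & \delta_1 & \delta_1 & \delta_1 & \delta_1
\end{array}
\qquad
\begin{array}{cccccc}
\delta_6 & \chi_5 & \chi_4 & \chi_3 & \chi_2 & \chi_1 \\
\delta_5 & \chi_6 & \chi_5 & \chi_4 & \chi_2 & \chi_1 \\
\delta_4 & \delta_5 & \delta_6 & \chi_4 & \chi_3 & \chi_2 \\
\delta_3 & \delta_3 & \delta_5 & \chi_5 & \chi_4 & \chi_3 \\
\delta_2 & \delta_2 & \delta_3 & \delta_4 & \chi_4 & \chi_3 \\
\delta_1 & \delta_1 & \delta_2 & \delta_2 & \delta_2 & \chi_4
\end{array}$$

Figure 4: Two typical elements of T_6.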

As we will see in the sequel, it is in fact possible to construct a bijection between T_N and the set M_{N×N}({0,1}) of all square {0,1}-matrices of size N, which implies that the cardinality of T_N is equal to $2^{N^2}$. It then follows that $\gamma_N = 2^{N^2-1}$, because the number of elements of T_N whose first tableau has a first row of length N is equal to the number of elements of T_N whose second tableau has a first row of length N (which means equivalently that the first tableau has a first row of length strictly less than N). To pass from one case to the other, it suffices to use the symmetry with respect to the main diagonal of the square (N^N) and to exchange the roles of the alphabets χ and δ.
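The counting argument above can be summarized in one line. The names $T_N^{(1)}$ and $T_N^{(2)}$ are introduced only for this sketch: they stand for the subsets of T_N whose first (resp. second) tableau has a first row of length N, and γ_N is taken to be the cardinality of the first subset, as the text indicates.

```latex
\[
  T_N = T_N^{(1)} \sqcup T_N^{(2)}, \qquad
  \bigl|T_N^{(1)}\bigr| = \bigl|T_N^{(2)}\bigr|
  \quad\Longrightarrow\quad
  \gamma_N = \bigl|T_N^{(1)}\bigr| = \frac{|T_N|}{2} = \frac{2^{N^2}}{2} = 2^{N^2-1}.
\]
```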

5.2 Description of the bijection

We now present our bijection between M_{N×N}({0,1}) and T_N. Our construction is based on a slight variation of the well-known Knuth correspondence (cf. Section 2.2) that has an interesting symmetry property, which is used to derive some practically important specializations of Barrett’s formula.

Let M be a matrix of M_{N×N}({0,1}). We first apply Knuth’s bijection (as described in Section 2.2) to M in order to get a pair (P, Q) of Young tableaux of conjugate shapes $\lambda$ and $\tilde{\lambda}$.

We then associate with Q a new Young tableau $\overline{Q}$ of shape $\overline{\lambda}$ (the complementary partition of λ within the square (N^N)), which is defined as follows.

• We first denote the length of λ by m (or equivalently the number of columns of Q). We then decide (by abuse of terminology) that Q also has columns indexed by integers strictly greater than m, which are all empty.

• We can now define a unique tabloid $\overline{Q}$ of shape $\overline{\lambda}$ by requiring that for every i ∈ [1, N] the i-th column of $\overline{Q}$ consists exactly of all the letters of the alphabet {1, . . . , N}, sorted in increasing order from bottom to top, that do not appear in the (N−i+1)-th column of Q.

Observe that the column word obtained by reading from left to right the columns of $\overline{Q}$ (considered here as letters of C({1, . . . , N})) is equal to $[A]^{N-m}\,\overline{[Q]}$. It then follows immediately from Proposition 4.2 that the tabloid $\overline{Q}$ is also a Young tableau.
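The two rules above translate directly into a few lines of Python. This is only a sketch: the function name `complement_tableau` and the storage of a tableau as a list of columns (each listed from bottom to top) are assumptions. The tableau Q used in the usage line, with columns [21], [3], [3], is not given in this section; it is inferred by inverting the construction on the tableau shown in Example 5.1.

```python
def complement_tableau(Q_columns, N):
    """Columns of the complementary tabloid: the i-th column holds the
    letters of {1, ..., N} missing from column N - i + 1 of Q, where the
    columns of Q beyond its last one are treated as empty."""
    padded = list(Q_columns) + [[] for _ in range(N - len(Q_columns))]
    alphabet = set(range(1, N + 1))
    # Column N - i + 1 (1-based) of Q is padded[N - i] (0-based).
    columns = [sorted(alphabet - set(padded[N - i])) for i in range(1, N + 1)]
    return [c for c in columns if c]   # drop the empty columns on the right


Q = [[1, 2], [3], [3]]                 # assumed tableau Q, shape (3, 1)
print(complement_tableau(Q, 3))        # [[1, 2], [1, 2], [3]]
```

The output lists the columns of the complementary tableau, i.e. rows 1 1 3 and 2 2 when read from the bottom.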

Hence $\Psi(M) = (P, \overline{Q})$ is a pair of complementary Young tableaux within the square (N^N).

To obtain from it an element of T_N, it suffices to associate with each entry i of P (resp. $\overline{Q}$) the letter δ_i (resp. χ_i) of the alphabet δ (resp. χ). We denote by Φ(M) the element of T_N that corresponds in such a way to the initial matrix M. Since the mapping $Q \to \overline{Q}$ is one to one, Ψ is clearly a bijection between M_{N×N}({0,1}) and pairs of Young tableaux of complementary shapes over the alphabet [1, N], while Φ is a bijection between M_{N×N}({0,1}) and T_N.

Example 5.1 Let us continue Example 2.2. Knuth’s bijection applied to the matrix M introduced in this example gives a pair (P, Q) of tableaux of conjugate shapes λ = (1,1,2) and $\tilde{\lambda}$ = (1,3). The shape $\overline{\lambda}$ = (2,3), complementary to the shape λ within the square (3³), provides the shape of the tableau $\overline{Q}$. Filling in its entries by taking (in the reverse order) the complements in {1,2,3} of the entries of the columns of Q, we obtain the tableau

$$\overline{Q} = \begin{array}{ccc} 2 & 2 & \\ 1 & 1 & 3 \end{array}.$$

The element Φ(M) of T_3 associated with M is then the following rewriting of the pair $(P, \overline{Q})$:

$$\Phi(M) = \begin{array}{ccc} \delta_3 & \chi_2 & \chi_1 \\ \delta_2 & \chi_2 & \chi_1 \\ \delta_1 & \delta_3 & \chi_3 \end{array}.$$

5.3 Symmetry properties of the bijection

In this section we present a strong symmetry property of the bijection Φ. We start by giving a new method for constructing the second Young tableau $\overline{Q}$ associated by Φ with a given {0,1}-matrix M.

1. Construct the 2-row array B_N which results from listing the N² pairs (i, j) of [1, N]×[1, N] in lexicographic order with respect to the second entry, i.e.

$$B_N = \begin{pmatrix} 1 & \dots & N & 1 & \dots & N & \dots & 1 & \dots & N \\ 1 & \dots & 1 & 2 & \dots & 2 & \dots & N & \dots & N \end{pmatrix}.$$

2. Select in this array all the entries corresponding to the 0’s of M. We then obtain a word w₂(M) by reading the top components of the selected entries. The result of the column insertion process applied to w₂(M) is a Young tableau Q′.
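The two steps above can be sketched in Python as follows. The sketch assumes that the pair (i, j) of the array refers to the entry in row i and column j of M, and that the column insertion process is the usual one (a letter bumps the smallest entry greater than or equal to it, letters being inserted from left to right); both assumptions reproduce the word and the tableau of Example 5.3. The function names are hypothetical.

```python
import bisect


def w2(M):
    """Scan the pairs (i, j) column by column (j = 1..N, then i = 1..N)
    and record the top component i for every 0 of M."""
    N = len(M)
    return [i + 1 for j in range(N) for i in range(N) if M[i][j] == 0]


def column_insert(word):
    """Assumed column insertion: tableau stored as a list of columns,
    each strictly increasing from bottom to top."""
    columns = []
    for x in word:
        for column in columns:
            k = bisect.bisect_left(column, x)   # first entry >= x
            if k == len(column):
                column.append(x)
                x = None
                break
            column[k], x = x, column[k]         # bump and continue
        if x is not None:
            columns.append([x])
    return columns


M = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 1]]                 # the matrix M of Example 2.2
print(w2(M))                    # [1, 3, 1, 2, 2]
print(column_insert(w2(M)))     # [[1, 2], [1, 2], [3]]
```

The second output lists the columns of the resulting tableau, i.e. rows 1 1 3 and 2 2 when read from the bottom.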

It turns out that the Young tableau Q′ obtained in this way is exactly the second Young tableau $\overline{Q}$ constructed by the bijection Ψ, presented in Section 5.2, when applied to the matrix M.

Proposition 5.2 Let M be a matrix of M_{N×N}({0,1}), let $\overline{Q}$ be the second Young tableau constructed by the bijection Ψ applied to M and let Q′ be the Young tableau constructed as above. Then one has Q′ = $\overline{Q}$.

Proof – Let M be a matrix of M_{N×N}({0,1}) and let ^tM be its transpose matrix. Let $\tilde{A}_N$ be the 2-row array associated with ^tM as defined in Section 2.2 and let B_N be the 2-row array associated with M as defined above. Let us then associate with these two 2-row arrays the following two column words [u(M)] and [v(M)] of length N defined by setting:

• [u(M)] = [I₁] . . . [I_N] where I_i denotes the sequence (possibly empty) of the entries, written from right to left, of the second row of $\tilde{A}_N$ corresponding to the 1’s of the i-th row of ^tM;

• [v(M)] = [J_N] . . . [J₁] where J_i denotes the sequence (possibly empty) of the entries, written from right to left, of the first row of B_N corresponding to the 0’s of the i-th column of M.

For instance, if we take the matrix M of Example 2.2, we have [u(M)] = [2] [3] [31] (cf. Example 2.4) and [v(M)] = [2] [21] [31] (cf. Example 5.3 that follows).

The reader can now check that one always has $[v(M)] = \overline{[u(M)]}$ (as can be observed in the previous example). Our proposition then follows from Corollary 4.9 due to the fact that Q is the result of the column insertion process applied to $\tilde{w}_1(M) = \tilde{\pi}([u(M)])$ according to Theorem 2.3 and that Q′ is the result of the column insertion process applied to $w_2(M) = \tilde{\pi}([v(M)])$ according to the construction presented above.

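Example 5.3 This example continues Example 2.2 and Example 5.1. In this case, we have:

$$B_3 = \begin{pmatrix} \boxed{1} & 2 & \boxed{3} & \boxed{1} & \boxed{2} & 3 & 1 & \boxed{2} & 3 \\ \boxed{1} & 1 & \boxed{1} & \boxed{2} & \boxed{2} & 2 & 3 & \boxed{3} & 3 \end{pmatrix}$$

where we boxed the entries that correspond to the 0’s of the associated matrix M. Hence w₂(M) = (1,3,1,2,2). Then the column insertion process applied to w₂(M) gives the Young tableau:

$$Q' = \begin{array}{ccc} 2 & 2 & \\ 1 & 1 & 3 \end{array} = \overline{Q}.$$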

The following symmetry result is now an immediate consequence of the new interpretation of the bijection Ψ that follows from the construction given above and Theorem 2.3.

Corollary 5.4 Let M be a matrix of M_{N×N}({0,1}) and let (P, Q′) be the result of the bijection Ψ applied to M. Then the result of the bijection Ψ applied to the matrix $s_{0,1}({}^tM)$ obtained by exchanging the 0’s and the 1’s in the transpose matrix ^tM of M is equal to (Q′, P).

Example 5.5 Let us consider again the matrix M of Example 2.2. Then one has:

$$s_{0,1}({}^tM) = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$

The reader can then easily check that $w_1(s_{0,1}({}^tM)) = (1,3,1,2,2)$ and $w_2(s_{0,1}({}^tM)) = (3,1,2,3)$, from which it follows that (taking here again all the notations of the previous examples):

$$\Psi(s_{0,1}({}^tM)) = \left( \begin{array}{ccc} 2 & 2 & \\ 1 & 1 & 3 \end{array}\ ,\ \begin{array}{cc} 3 & \\ 2 & \\ 1 & 3 \end{array} \right) = (Q', P).$$
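The matrix operation used in Corollary 5.4 is easy to reproduce. The sketch below, with hypothetical function names, builds $s_{0,1}({}^tM)$ from the matrix M of Example 2.2 and recomputes the word w₂ of the result, assuming that the pair (i, j) of the array B_N refers to the entry in row i and column j.

```python
def s01_transpose(M):
    """Transpose M, then exchange its 0's and 1's."""
    N = len(M)
    return [[1 - M[j][i] for j in range(N)] for i in range(N)]


def w2(M):
    """Row indices of the 0's of M, read column by column."""
    N = len(M)
    return [i + 1 for j in range(N) for i in range(N) if M[i][j] == 0]


M = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 1]]                      # the matrix M of Example 2.2
S = s01_transpose(M)
print(S)                             # [[1, 0, 1], [1, 1, 0], [0, 1, 0]]
print(w2(S))                         # [3, 1, 2, 3]
```

The operation is an involution, so applying `s01_transpose` twice returns the initial matrix, which is consistent with the symmetric roles played by the two tableaux in the corollary.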

6 Some specializations of Barrett’s formula

In this section we show how the bijection constructed in Section 5 can be effectively used to find explicit expressions for several specializations of Barrett’s formula.

6.1 Matrices involved in the combinatorial version of Barrett’s formula

Let us denote by N_N the set of all square matrices M of M_{N×N}({0,1}) such that the length of the first row of the first Young tableau P associated with M by the bijection Ψ (constructed in Section 5.2) is exactly equal to N. Furthermore, let µ(t) stand for the monomial obtained by taking the product of all entries of an element t of T_N. According to the results of Section 3.3, the symmetric polynomial F(χ, δ) defined by relation (5), i.e. the numerator of the combinatorial expression (4) of the probability of error (1), can be expressed as

$$F(\chi, \delta) = \sum_{M \in N_N} \mu(\Phi(M)), \tag{8}$$

where Φ stands for the second bijection constructed in Section 5.2.

In order to better understand the combinatorial version of Barrett’s formula, we will explore the fine structure of N_N. Let again M be a matrix of M_{N×N}({0,1}). Observe that the length of the first row of the Young tableau P associated by Ψ with M is exactly the length of the longest non-increasing subsequence in w₁(M) according to Greene’s theorem (cf. [8] or Chapter 3 of [7]) and to the construction of P (cf. Section 2.2). Since a non-increasing subsequence in w₁(M) corresponds to a strictly increasing subsequence, for the North-East order ⁸, in the set of the entries of M associated with 1’s, we get the following characterization of N_N.

Proposition 6.1 A matrix M ∈ M_{N×N}({0,1}) belongs to N_N if and only if there exists a sequence of 1’s of length N in M such that the corresponding entries form a strictly increasing sequence (of length N) in the North-East order.
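The criterion of Proposition 6.1 can be tested mechanically. The following Python sketch, whose function names are hypothetical, computes the length of the longest sequence of 1’s that is strictly increasing for the North-East order, where (i, j) precedes (k, l) if and only if i > k and j ≤ l, by a simple dynamic programming over the rows taken from bottom to top.

```python
def longest_ne_chain(M):
    """Length of the longest sequence of 1's of M that is strictly
    increasing for the North-East order."""
    N = len(M)
    ones = [(i, j) for i in range(N) for j in range(N) if M[i][j] == 1]
    # Process the bottom rows first, so that all predecessors of a cell
    # are already known when the cell is reached.
    ones.sort(key=lambda cell: -cell[0])
    best = {}
    for (i, j) in ones:
        best[(i, j)] = 1 + max(
            (best[(k, l)] for (k, l) in best if k > i and l <= j),
            default=0,
        )
    return max(best.values(), default=0)


def in_NN(M):
    """Membership test suggested by Proposition 6.1."""
    return longest_ne_chain(M) == len(M)


M_prime = [[1, 0, 1], [1, 1, 0], [0, 1, 0]]   # matrix of Example 5.5
M = [[0, 0, 1], [1, 0, 0], [0, 1, 1]]         # matrix of Example 2.2
print(in_NN(M_prime), in_NN(M))               # True False
```

For the matrix of Example 2.2 the longest chain has length 2, which agrees with the value L₁(M, 1) = 2 reported in Note 6.3.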

Example 6.2 Let us consider again the matrix of Example 5.5, denoted here by M′, i.e.

$$M' = \begin{pmatrix} 1 & 0 & \boxed{1} \\ 1 & \boxed{1} & 0 \\ 0 & \boxed{1} & 0 \end{pmatrix}.$$

The entries associated with the three 1’s of M′ boxed in the above picture correspond to the strictly increasing sequence $(3,2) \prec_{NE} (2,2) \prec_{NE} (1,3)$ in the North-East order. According to Proposition 6.1, M′ therefore belongs to N_3, which just means that the length of the first row of the first tableau associated by Ψ with M′ is equal to 3, as can be directly checked in Example 5.5.

Note 6.3 Let M be a matrix of M_{N×N}({0,1}). Then for every k ∈ [1, N], let us consider:

• the largest number L₀(M, k) that can be realized as the sum of the lengths of k disjoint sequences (possibly empty) of 0’s in M such that the corresponding sequences of entries are strictly increasing for the South-East order; ⁹

• the largest number L₁(M, k) that can be realized as the sum of the lengths of k disjoint sequences (possibly empty) of 1’s in M such that the corresponding sequences of entries are strictly increasing for the North-East order.

8 We define the North-East order $\prec_{NE}$ over [1, N]² by setting $(i, j) \prec_{NE} (k, l)$ if and only if i > k and j ≤ l.

9 We define the South-East order $\prec_{SE}$ over [1, N]² by setting $(i, j) \prec_{SE} (k, l)$ if and only if i < k and j ≤ l.

We also define by convention L₀(M, 0) = L₁(M, 0) = 0. Greene’s theorem (cf. [8] or Chapter 3 of [7]) used in connection with the constructions of Sections 2.2 and 5.3 shows that

• L₀(M, k) − L₀(M, k−1) is equal to the length of the k-th column of Q′,

• L₁(M, k) − L₁(M, k−1) is equal to the length of the k-th row of P,

for every k ∈ [1, N], if we set Ψ(M) = (P, Q′). Proposition 5.2 then implies that the following simple, but surprising, identity always holds for every k ∈ [1, N]:

$$L_0(M, k) - L_0(M, k-1) + L_1(M, N-k+1) - L_1(M, N-k) = N.$$

As an illustration of these results, let us again consider the matrix M of Example 2.2, i.e.

$$M = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}.$$

Then one has L₀(M,1) = 2, L₀(M,2) = 4, L₀(M,3) = 5, L₁(M,1) = 2, L₁(M,2) = 3, L₁(M,3) = 4, from which it is easy to check all the results of this note.
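For the reader’s convenience, the identity of this note can be checked term by term on these values, with N = 3 and the convention L₀(M, 0) = L₁(M, 0) = 0:

```latex
\begin{align*}
k = 1:&\quad \bigl(L_0(M,1) - L_0(M,0)\bigr) + \bigl(L_1(M,3) - L_1(M,2)\bigr) = (2 - 0) + (4 - 3) = 3,\\
k = 2:&\quad \bigl(L_0(M,2) - L_0(M,1)\bigr) + \bigl(L_1(M,2) - L_1(M,1)\bigr) = (4 - 2) + (3 - 2) = 3,\\
k = 3:&\quad \bigl(L_0(M,3) - L_0(M,2)\bigr) + \bigl(L_1(M,1) - L_1(M,0)\bigr) = (5 - 4) + (2 - 0) = 3.
\end{align*}
```

The successive differences 2, 2, 1 of L₀ and 2, 1, 1 of L₁ are also the column lengths of the tableau of Example 5.3 and the row lengths of the tableau P of Example 5.5, respectively.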

Let M be a matrix of N_N. According to Proposition 6.1 and to the definition of the North-East order, there exists a sequence σ of length N of 1’s in M such that the corresponding sequence of entries has the form $\sigma' = ((N-k+1, j_k))_{1 \le k \le N}$ where $(j_k)_{1 \le k \le N}$ stands for an increasing sequence of integers of [1, N]. One can obviously encode such a sequence of 1’s by the pseudo-composition ¹⁰ $p(\sigma) = (p_k)_{1 \le k \le N}$ of N defined by letting p_k be the number (possibly equal to zero) of 1’s of σ that belong to the k-th column of M ¹¹. We denote by p(M) the greatest (in the lexicographic order on $\mathbb{N}^N$) pseudo-composition that can be associated in such a way with M. The set N_N can then be partitioned as

$$N_N = \bigcup_{p \in P_N} N_{p,N} \tag{9}$$

where P_N denotes the set of all pseudo-compositions of length N of N and where N_{p,N} stands for the set of all matrices M ∈ N_N whose associated pseudo-composition p(M) is equal to p.

Let us now associate with every pseudo-composition p = (p₁, . . . , p_N) of P_N the integer µ(p) defined as the smallest element µ of [1, N] such that p₁ + . . . + p_µ = N. The following result gives a fine characterization of the matrices of N_{p,N}.
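The quantities p(M) and µ(p) can be computed by brute force on small matrices. The Python sketch below is only an illustration under the definitions given above: it enumerates one 1 per row, from the bottom row to the top row, keeps the choices whose column indices are non-decreasing, and retains the lexicographically greatest pseudo-composition. The function names are hypothetical and the enumeration is exponential in N.

```python
from itertools import product


def pseudo_composition(M):
    """Greatest pseudo-composition p(sigma) over all admissible sequences
    sigma of N ones; returns None when M does not belong to N_N."""
    N = len(M)
    # Candidate columns (1-based) of the 1's in rows N, N-1, ..., 1.
    choices = [[j + 1 for j in range(N) if M[r][j] == 1]
               for r in reversed(range(N))]
    best = None
    for js in product(*choices):
        if all(a <= b for a, b in zip(js, js[1:])):
            # p_k = number of selected 1's lying in column k.
            p = tuple(js.count(k) for k in range(1, N + 1))
            if best is None or p > best:   # tuples compare lexicographically
                best = p
    return best


def mu(p):
    """Smallest 1-based index mu such that p_1 + ... + p_mu = N."""
    total, N = 0, sum(p)
    for index, part in enumerate(p, start=1):
        total += part
        if total == N:
            return index


M_prime = [[1, 0, 1], [1, 1, 0], [0, 1, 0]]   # matrix of Example 6.2
p = pseudo_composition(M_prime)
print(p, mu(p))                                # (0, 2, 1) 3
```

For the matrix of Example 6.2 the only admissible sequence is the one boxed in that example, with column indices (2, 2, 3), which gives the pseudo-composition (0, 2, 1).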

Proposition 6.4 Let p = (p₁, . . . , p_N) be a pseudo-composition of P_N. Furthermore, let $(j_k)_{1 \le k \le N}$ denote the unique increasing sequence of integers defined by requiring every k in [1, N] to be repeated p_k times. A matrix M belongs to N_{p,N} if and only if it satisfies the following two properties:

10 A pseudo-composition of an integer N is a sequence of nonnegative integers (including 0) whose sum is N.

11 The sequence $(j_k)_{1 \le k \le N}$ that characterizes σ′ (or equivalently σ) as described above is indeed the unique increasing sequence of N elements of [1, N] obtained by repeating each integer k ∈ [1, N] exactly p_k times.