On the achievable throughput of a multi-antenna Gaussian broadcast channel

(1)

Gaussian Broadcast Channel

Giuseppe Caire and Shlomo Shamai (Shitz)

July 23, 2001

Abstract

AGaussianbroadcastchannelwithr single-antennareceiversandtantennasat

the transmitter is considered. Both transmitter and receivers have perfect knowl-

edge of the channel. Despite apparent simplicity, this model is in general a non-

degraded broadcast channel,for which the capacity regionis not fully known. We

proposeanoveltransmissionscheme basedon \ranked knowninterference"(RKI).

In brief,thetransmitterdecomposes thechannelintoan ordered(orranked)setof

interferencechannelsforwhichtheinterferencesignalinthei-thchannelis alinear

combination of the signals transmitted in channels j < i. Since all transmitted

signalsaregeneratedbythetransmitter,interference ineachchannelisknownnon-

causally. Hence, known techniques of coding for non-causally known interference

can be applied to make the interference in each channel harmless without further

powerpenalty. Weshowthattheproposedschemeisthroughputwiseasymptotically

optimalforbothlowand highSNR,and wecomparethethroughput achievable by

RKIwiththethroughputachievablebymoreconventionalzero-forcingbeamforming

(space-division multiple access) and with the throughput of a single-user multiple

antenna system obtained by allowing the receivers to cooperate. Also, we provide

a modicationof the basicRKI scheme which achieves optimalthroughput for all

SNRs in the special case of two users. For independent Rayleigh fading, closed-

form throughputexpressionsareobtainedforthebasicRKIstrategyand arbitrary

t and r. Numerical resultsare shown forniter;t and inthe large-systemlimitof

r;t!1 withxedratio =r=tuserspertransmitantenna.

Keywords: Gaussianbroadcast channel,multiple-antennasystems.

1 Introduction

We consider a wirelesscommunication system with one transmitter and r receivers. The

transmitter has t antennas, while the receivers have one antenna each. The transmitter

has todelivertoeachreceiverindependentinformation,asinthe downlinkof asingle-cell

system where the base-station is equipped with anantenna array of t elements.

(2)

Weconsider thecasewherethe channeltransferfunctionbetween thetransmitterand

all receivers is perfectlyknown toall terminals. Our main results are independent of the

statistics of the channel and they can be directly applied to the more general multi-cell

downlink scenario with full cooperation between all base-station antennas. However, in

thisworkwedonotconsiderexplicitlytheeectofthespatialdistributionofthetransmit

and receive antennas, asfor example inthe cellularmodelof [1, 2].

This channel is referred to inthe following as the t1: r GBC (to be read \ttimes

1 to r Gaussian Broadcast Channel "), in order to stress the fact that the r receivers

must process their signals separately, as opposed to a t r multi-antenna single-user

system where the signals at the r receive antennas can be processed jointly [3]. Despite

apparentsimplicity,thismodelisingeneralanon-degraded broadcast channel,forwhich

the capacity regionis not fully known.

Conventional approaches to the t1 : r GBC have been proposed by many authors

under dierent assumptions on the channel knowledge at the transmitter. They are all

based on some form of linear precoding of the user signals. In [4], the multipath channel

knowledge atthe transmitterisused topre-lterthe signalof eachuser by itsown space-

time matched lter, in order to induce at the receiver the Maximal-Ratio Combining

(MRC) [5] of the multipath without having to implementthe matched lter in the user

terminal. We refer to this approach as MRC beamforming.

1

In [6, 7, 8, 9, 10, 11] joint

linear precoders are foundinorder tomaximizethe signal-to-interferenceplus noise ratio

attheuserterminalssubjecttovariousconstraints. Theseschemes,usuallyproposedina

CDMAmultiple-antennasingle-celldownlinkscenario,arebasedonlineartransformations

of the individual user signals which essentially null-out interference while combining the

multipath of the useful signal for each user. In our frequency-at unspread channel

model, this reduces to inverting the channel at the transmitter by using the Moore-

Penrose pseudoinverse [12] of the channel matrix. This approach shall be referred to in

the following asZero-Forcing(ZF)beamforming.

An upper bound on the achievable throughput of the t 1 : r GBC is obtained by

letting the receivers cooperate. In this case, the channelreduces to asingle-user multiple

antenna situation, for which the capacity with perfect channel knowledge at both ends

is well-known [3, 13]: the channel can be decomposed into a set of parallel channels

corresponding to its eigenmodes without loss of information. It is worthwhile to notice

that for both ZF beamforming and the cooperative system the optimal coding strategy

reduces tostandard single-user Gaussiancoding, and the multiple-inputmultiple-output

natureofthesystemiscapturedbysimplelinearsignalprocessingatthetransmitteronly

(for ZF) orat both transmitter and receiver(for cooperative).

A quitedierent coding strategy is proposed here for the t1:r GBC.Our scheme,

referred to as Ranked Known Interference (RKI), makes use of linear precoding in order

to decompose the channel into a ranked set of interference channels for which the inter-

ference signal in the i-th channel is a linear combination of the signals transmitted in

channels j < i. Since all signals are generated by the same transmitter, interference in

1

IntheCDMAliteraturethisis alsoknownas\pre-rake".

(3)

each channel is known non-causally. Then, known techniques of coding for non-causally

known interference [14, 15, 16, 17, 18] can be applied to make the interference in each

channel harmless without further power penalty. In particular, the recently proposed

scheme of [17] based on modulo-lattice precoding can be applied here. Interestingly, a

version ofour RKIscheme with suboptimal one-dimensionallatticeprecoding,analogous

to Tomlinson-Harashima precoding (see [19] and references therein), has been recently

and independently proposed in [20, 21] for the transmitter-based cancellation of far-end

cross-talk inDSL.

We show that the RKI strategy is throughputwise asymptotically optimal for both

low and high SNR, and it is equivalent to MRC beamforming to the best user only for

low SNR and to the cooperative scheme for high SNR. Also, we provide a modication

of the basic RKI scheme which achieves optimal throughput for all SNRs in the special

case of twousers. In the case of Rayleighfading, closed-formthroughput expressions are

obtained for the basic RKI strategy for both nite r;t and in the large-system limit of

r;t!1 with xed ratio =r=t users pertransmitantenna.

The paper is organized as follows. In Section 2 the channel model is dened. In

Section 3 the basic RKI strategy and its properties are illustrated. Section 4 presents

upperand lowerbounds onthe achievable throughputof the t1:r GBCand Section5

shows that a modication of the basic RKI strategy is actually optimal for the t1: 2

GBC. In Section 6 results for the basic RKI scheme in independent Rayleighfading are

given. Finally,Section 7summarizes our conclusions.

Notation.

LetAbeamatrix. Thei-throw,j-thcolumnand(i;j)-thelementofAaredenoted

by a i

, a

j

and a

i;j

or equivalently by [A]

i;j

, respectively.

The submatrixobtained by the rows of A numbered by i 2S, where S isan index

set, is denoted by A[S].

Superscripts T

and H

denote transpose and Hermitiantranspose, respectively.

zN

C

(;) indicatesthat z isacomplex circularly-symmetricGaussianrandom

vector with mean and covariancematrix .

[x]

+

=maxf0;xg.

2 System model

Weconsider a discrete-timecomplexbaseband channel modeland assumethat the prop-

agation channel between each transmit-receive antenna pair is frequency non-selective.

Extension to the time-invariantnite-memory frequency-selective case is easilyobtained

in the frequency domain(see [22] andreferences therein) and willbe briey addressed in

Appendix A. The t1:r GBC isdescribed by

y =Hx +z (1)

(4)

where x

i 2 C

t

is the transmitted symbol vector at time i, y

i

and z

i

N

C

(0;I) are the

corresponding vectors of received signal and noise and H 2 C rt

is the channel matrix,

where h

k;`

is the complex channel gain from transmit antenna ` to the receiver antenna

of user k.

The input isconstrained tosatisfy

1

n n

X

i=1 jx

i j

2

A (2)

whereAisthemaximumallowedtotaltransmitenergyperchanneluse. Sinceweconsider

normalized unit noise variance, A takes onthe meaning of totaltransmitSNR.

Foragivenblocklengthn, acode C

n

forthe input-constrainedt1:rGBCisdened

by a codebook of exp (n P

r

k=1 R

k

) code arrays of the form X= [x

1

;:::;x

n ]2 C

tn

, such

that (2)is satisedforeach array,byanencodingfunction mappingr-tuplesofindexes

(w

1

;:::;w

r

) (where w

k

2 f1;2;:::;exp (nR

k

)g) onto the code words, and by r decoding

functions

1

;:::;

r

, such that

k : C

n

! f1;2;:::;exp (nR

k

)g. The received signal at

the k-th receiver antenna is given by the k-th row y k

of the array Y = HX+Z. The

error probability for the k-th user is given by

k

= Pr(

k (y

k

) 6= w

k

). A r-tuple of rates

R=(R

1

;:::;R

r

)isachievableifthereexistasequenceofcodesC

n

withratesapproaching

R and vanishing

k

(for all k = 1;:::;r), as n increases. The system throughput R ,

measured in bit/channeluse (or bit/s/Hz),is dened as the rate sum

R= r

X

k=1 R

k

(3)

We assume perfect Channel State Information at the transmitter (CSIT) and at all re-

ceivers (CSIR), i.e., the channel matrix H is known to everybody, and consider the fol-

lowingscenarios:

1. H isdeterministic and xed.

2. H is xed during the transmission of each code array, but it is randomly and in-

dependently selectedaccording toagiven probabilitydistribution(compositechan-

nel). In thiscase [23, 24], wecan consider both the short-termconstraint(2)orthe

long-term constraint

E

H

"

1

n n

X

i=1 jx

i j

2

#

A (4)

where E

H

denotes expectationwith respect to H.

3. H is generated by an ergodic matrix random process and varies during the trans-

mission of eachcode word so that the channel is informationstable [24, 25].

Since R depends on H, the instantaneous throughput of the composite channel (case 2)

is a random variable. In this case we consider the average throughput

R =E [R ]. This

(5)

is achievable by a variable-rate coding scheme which adjusts its instantaneous through-

put according to the channel realization H. By the law of large numbers, the average

throughputachievableoveralongsequence of channelrealizationsisgiven by

R . More in

general,the sameresult appliestothe case ofablock-fadingchannelwhere Hisconstant

over blocks of n consecutive channel uses and changes from block to block according to

an ergodic matrix randomprocess [24].

Withperfect CSIT and CSIR, it turns out that the information stable channel (case

3) has the same averagethroughput of the composite channel with long-termpowercon-

straint(4)(althoughcodinganddecodingstrategiesanderrorexponentsforthesechannel

are generallydierent[24]). Therefore, weshall focus on cases 1 and 2only.

The 11:r GBC coincides with the classical degraded Gaussianbroadcast channel,

whose capacity region is well-known (see [26] in the deterministic case and [27, 28] in

the composite or ergodic cases). However, the t1 : r GBC for t > 1 is in general a

non-degraded broadcast channel, for which the capacity region is not fully known, and

cannot be reduced to anequivalent set of parallel degraded broadcast channels (studied

in [29, 30, 31, 27, 22, 28]). Obviously,an inner bound tothe capacity region is provided

by exhibiting an explicit coding scheme for which the error probabilities vanish as the

block lengthincreases.

It is interesting to notice here that perfect CSIT is a key fact in our model. For

example, under the assumption of no CSIT and of symmetric ergodicchannel, where H

is random with independently and identically distributed rows, the marginal transition

pdfs p(y

k;i

;Hjx

i

) are identical for all k = 1;:::;r, and the resulting t1 : r GBC is

stochastically degraded [26] also for t > 1. In this case, the capacity region is trivially

obtained by time-sharingand the optimalthroughput coincides with the t1\transmit

diversity" capacity obtained in[3], explicitlygiven by

R tx div

=E[log(1+jh 1

j 2

A=t)] (5)

Wehastento saythat the cases ofperfectCSIT and of symmetricchannelwithoutCSIT

are not the only possibilities. Other settings might also yield new and interesting prob-

lems.

3 The RKI coding strategy

Let H=GQ be a QR-type decomposition[12] obtained by applyingthe Gram-Schmidt

orthogonalization procedure [12] to the rows of H. Letm =rank(H), then G2 C rm

is

lowertriangular(i.e.,ithaszerosaboveitsmaindiagonal)andQ2C mt

hasorthonormal

rows. By letting the transmitted i-th signal vector be given by x

i

= Q H

u

i

, the original

channelis turnedintothe set of m interference channels

y

k;i

=g

k;k u

k;i +

X

j<k g

k;j u

j;i +z

k;i

; k =1;:::;m (6)

(6)

whilenoinformationissenttousersm+1;:::;r. Theinputconstrainttranslates directly

to the new input signal u

i

, infact,

1

n n

X

i=1 jx

i j

2

= 1

n n

X

i=1 u

H

i QQ

H

u

i

= 1

n n

X

i=1 ju

i j

2

(7)

The interference signal s

k;i

= P

j<k g

k;j u

j;i

in the k-th interference channel (6) is given

by a linear combination of the transmitted signals u

j;i

in channels j < k. Since all

these signals are generated at the transmitter, and the coeÆcients g

k;j

are known, the

interference signal ineach channel isknown non-causally by the encoder (if the channels

are considered inthe order1;:::;m dictated by the QRdecomposition). Forthis reason,

we refer to this scheme as Ranked Known Interference strategy. We have the following

result:

Theorem 1. Forgiven H=GQ, with d

k

=jg

k;k j

2

,the maximumthroughput achieved

by the RKI strategy isgiven by

R rki

= m

X

k=1

[log(d

k )]

+

(8)

where is the solution of the waterllingequation [32]

m

X

k=1

[ 1=d

k ]

+

=A (9)

Proof. See Appendix B.

Corollary 1. For the composite channel we have

R rki

= E

H [R

rki

] where solves the

short-terminputconstraint(9),orthelong-terminputconstraintE

H [

P

m

k=1

[ 1=d

k ]

+ ]=

A.

Remark: on the capacity ofchannelswith interference knownto the transmit-

ter. InAppendixB,weoutlineacodingschemebasedonmodulo-latticeprecoding,rst

introducedby Erez Shamai and Zamir[17], whichis able toachieve the RKI throughput

of Theorem 1 and Corollary 1.

Also,inAppendix B we investigate the relationbetween the RKIscheme and Costa's

\writingondirtypaper"capacityresult[15],andweshowthatthethroughputofTheorem

1andCorollary1canbealsoachieved byasimplegeneralizationofCosta'sscheme,which

is quite dierent from the lattice precoding scheme of [17]. The main dierence between

the two schemes is that Costa's relies on the fact that the known interference signal is

Gaussian, i.i.d., with given (known) variance, while lattice precoding is universal in the

sense that it works for any arbitrary interference signal sequence, provided that it is

(7)

interference realization, but need not know the interference statistic). The link between

Costa's \writing on dirty paper" capacity and the lattice precoding scheme is provided

in Appendix B by combining a well-known result of Ahlswede on the AVC with random

parameters [16] non-causally known atthe transmitter witha recent resultof Cohenand

Lapidoth [18].

Forthe sake of comparison,we review here the ZF beamformingand the cooperative

schemes. ZF beamforming consists of inverting the channelmatrix atthe transmitter in

order to create orthogonal channels between the transmitter and the receivers without

receivers cooperation. Let S f1;:::;rg be a subset of cardinality m for which the

corresponding submatrixH[S] isfull row-rank, i.e., rank(H[S])=jSj. Let

H +

S

=H[S]

H

H[S]H[S]

H

1

(10)

betheMoore-Penrosepseudoinverse [12]ofH[S]. InZFbeamforming,thetransmitsignal

is obtained as x

i

=H +

S u

i

and yieldsthe set of parallelchannels

y

k;i

=u

k;i +z

k;i

; k 2S (11)

while no information is sent to users k 2= S. Clearly, the maximum ZF throughput is

obtained by u

i

Gaussianwith independent components.

The (short-term)input constraint isgiven by

1

n n

X

i=1

trace H +

S u

i u

H

i (H

+

S )

H

= trace (H +

S )

H

H +

S 1

n n

X

i=1 u

i u

H

i

!

! jSj

X

k=1

2

u;k

b

k

(12)

where we letE[u

i u

H

i

]=diag(

2

u;1

;:::; 2

u;jSj

) and where we dene

b

k

=

1

(H[S]H[S]

H

) 1

k;k

(13)

The maximum ZF throughput is obtained asthe solutionof the optimization problem

(

max P

jSj

k=1

log(1+ 2

u;k )

subjectto P

jSj

k=1

2

u;k

b

k

=A

which can be conveniently reparameterizedby lettinga

k

=

2

u;k

b

k

and yields

R zf

= jSj

X

k=1

[log(b

k )]

+

(14)

where is the solution of P

jSj

[ 1=b

k ]

+

=A.

(8)

The above ZF maximum throughput can be further optimized with respect to the

choice of the active user set S. In particular, by optimizing over the sets of cardinality

one, we obtaina system basedon MRC beamformingtothe user with highestindividual

channel capacity, whose throughputis given by

R mrc

=log 1+jh max

j 2

A

(15)

where h max

is the row of H with largest 2-norm. Interestingly, for t>1 this choice does

not yield necessarilythe largest throughput, in sharp contrast to the standard degraded

broadcast channel(t=1) forwhich the throughput ismaximized by transmitting tothe

best user only [27].

Now, we consider a single-user multiple antenna system which is obtained from our

systembyallowingtherreceivers tocooperate[3]. LetH=USV H

betheSingularValue

Decomposition[12]ofH,whereU2C rr

andV 2C tt

areunitaryandSisdiagonalwith

non-negative diagonalelements,the rst m of which are strictly positive and denoted by

p

c

1

p

c

m

. Bylettingx

i

=Vu

i

and v

i

=U H

y

i

the channel(1)is diagonalizedas

v

i

=Su

i +z

0

i

(16)

where z 0

i

= U H

z

i

has the same statistics of z

i

. The cooperative throughput is also

maximized by u

i

Gaussian with independent components and itis given by [3]

R coop

= m

X

k=1

[log(c

k )]

+

(17)

jSj

k=1

[ 1=c

k ]

+

=A.

The ZF and cooperative throughputs for the composite channel (with short or long-

term power constraints)are immediatelyobtained from(14) and(17) by taking expecta-

tion with respect to H.

Remark: on theuserordering problem. SinceforanyunitarymatrixQthematrix

QHhasthesamesingularvaluesofH,R coop

isobviouslyindependentoftheuserordering

(permutation matrices are unitary). On the contrary, R zf

depends on the choice of the

unorderedactive userset S, andR rki

depend onthe orderedactive userset S(whoserows

are considered in order to perform Gram-Schmidt orthogonalization). In order to stress

this dependence, weintroducethe following notation

R rki max

= max

S R

rki

R zf max

= max

S R

zf

(18)

where in the rst lineS ranges over the ordered user sets with cardinality m, and inthe

second lineS ranges over the unordered user sets with cardinalityjSjm.

Ifrank(H)=m thenR rki max

isachieved by anorderedset of musers, forevery SNR

(9)

set S 0

of cardinalityk <m,such that H[S 0

]=G 0

Q 0

. Then, there exists an ordered set of

users S =S 0

[fi

1

;:::;i

m k

g such that H[S] =GQ where jg

i;i j

2

= jg 0

i;i j

2

for i = 1;:::;k

and jg

i;i j

2

>0 for i=k+1;:::;m. Therefore, the RKI strategy applied to H[S 0

]and to

H[S] yields the same throughput.

Also,theuserorderingisirrelevantfortheRKIstrategyforasymptoticallylargeSNR

ifHisfullrow-rank,i.e.,ifm=r. Infact,letfd

1

;:::;d

r

gandfd 0

1

;:::;d 0

r

gbethetwosets

ofordered squareddiagonalelementsofthe matricesGandG 0

intheQRdecompositions

H = GQ and H = G 0

Q 0

, respectively, where is a rr permutation matrix. We

have that

r

Y

i=1 d

i

=jdet(G)j 2

=det (HH H

)=det (HH H

H

)=jdet(G 0

)j 2

= r

Y

i=1 d

0

i

Dene the following arithmetic means

M

a

= 1

r r

X

i=1 1

d

i

; M

0

a

= 1

r r

X

i=1 1

d 0

i

and the geometric mean M

g

=( Q

r

i=1 1=d

i )

1=r

=( Q

r

i=1 1=d

0

i )

1=r

. There existA

0

<1 such

that the equation

r

X

i=1

[ 1=d

i ]

+

=A

0

has solution

0

=A

0

=r+M

a

and the equation

r

X

i=1

[ 1=d 0

i ]

+

=A

0

has solution 0

0

=A

0

=r+M 0

a

. Then, for allA A

0

, the RKI throughputs corresponding

to the original and permuted row orders are given by rlog

A=r+Ma

Mg

and by rlog A=r+M

0

a

Mg ,

respectively, implying that their dierence vanish asA !1.

WithZFbeamforming,foragivenSNRAthemaximumthroughputR zf max

mightbe

achieved by auser subset Sof cardinality strictly less than m=rank(H). However, it is

easytoseefromthepropertiesofthewaterllingpowerallocationin(14)thatthereexists

a nite value A

0

(whichdepends onH) forwhich, forall AA

0 , R

zf max

isachieved by

a subset of cardinality m.

Sincebyconstrainingthereceivers toprocesstheirsignalsindependentlythethrough-

put cannotbeincreased,R coop

upperboundsboth R zf max

andR rki max

. The RKIscheme

yields generally a larger maximal throughput than ZF beamforming, as stated in the

following:

Theorem 2. Forany channel matrix H,R rki max

R zf max

.

(10)

Proof. Assume that,afterasuitablerowpermutation,therst krows ofHarelinearly

independent, choose the user subset S=f1;:::;kg. The columns v

i of H

+

S

satisfy

h j

v

i

=Æ

i;j

; j =1;:::;k

Therefore, v H

i

must lie in the orthogonal complement of the subspace V

i

= spanfh j

:

j = 1;:::;k;j 6= ig. Let P

?

i

be the orthogonal projector [12] on V

?

i

. From the above

orthonormalitycondition we get

v H

i

= h

i

P

?

i

h i

P

?

i (h

i

) H

The inverse of the i-th diagonalelement of (H[S]H[S]

H

) 1

=(H +

S )

H

H +

S

is given by

b

i

= 1

jv

i j

2

= jh

i

P

?

i (h

i

) H

j 2

h i

P

?

i (h

i

) H

=jh i

P

?

i j

2

(19)

where we used the fact that orthogonal projectors are idempotents [12]. The rows q i

of

Q inthe QRdecomposition H =GQ are obtained by applying Gram-Schmidt orthogo-

nalizationto the ordered rows h 1

;h 2

;:::;h k

. We obtain

h i

= q

h ie

P

?

i (h

i

) H

q i

+ i 1

X

j=1 h

i

(q j

) H

q j

where e

P

?

i

istheorthogonalprojectorontheorthogonalcomplementof e

V

i

=span fh 1

;:::;h i 1

g.

From the denition of d

i

in(8)and the formulaabove we obtain

d

i

=h i

e

P

?

i (h

i

) H

=jh i

e

P

?

i j

2

(20)

SinceV

i

e

V

i

,then b

i d

i

foralli=1;:::;k. Finally,since bothk andthe userordering

were arbitrary, this implies that R zf max

R rki max

.

Weconclude that R coop

R rki max

R zf max

holdsfor anychannelmatrix. The next

result makes this statement stronger in the limitsfor high and lowSNR.

Theorem 3. Forany channel matrix Hwith full row-rank,

lim

A!1 R

coop

R rki max

=0 (21)

For any channel matrix H,

lim

A!0 R

rki max

R zf max

=1 (22)

(11)

Proof. Consider rst (21). Letc

1

;:::;c

r

denote the (non-zero) squared singularvalues

of H and d

1

;:::;d

r

the (non-zero) squared diagonalelements of G in the QR decompo-

sition H=GQ. Dene the followingarithmetic means

M

a

= 1

r r

X

i=1 1

c

i

; f

M

a

= 1

r r

X

i=1 1

d

i

and the geometric mean M

g

= ( Q

r

i=1 1=c

i )

1=r

= ( Q

r

i=1 1=d

i )

1=r

, where the last equality

follows from

r

Y

i=1 c

i

=det(HH H

)=jdet(G)j 2

= r

Y

i=1 d

i

It is immediate tosee that there existA

0

<1 suchthat the equation

r

X

i=1

[ 1=c

i ]

+

=A

0

has solution

0

=A

0

=r+M

a

and the equation

r

X

i=1

[ 1=d

i ]

+

=A

0

has solution e

0

=A

0

=r+ f

M

a

. Then, for all A A

0

, the maximum throughputs can be

written as

R coop

= rlog

A=r+M

a

M

g

R rki

= rlog

A=r+ f

M

a

M

g

(23)

By substituting these expressions in the limit(21) we obtain

lim

A!1 rlog

1+rM

a

=A

1+r f

M

a

=A

=0

In order to show (22), consider rst the case where there is a single row h max

in H

of maximum squared Euclidean norm. We notice both RKI and ZF achieve the MRC

throughput R mrc

=log(1+jh max

j 2

A), by choosing anactiveuser set containing onlythe

user correspondingtothe rowh max

. LetSdenotes anarbitraryuser subsetof cardinality

k for which H[S] has rank k,let

b

i (S)=

1

(H[S]H[S]

H

) 1

i;i

(12)

and let R zf

(S) denote the maximum of P

k

i=1

log(1 +b

i (S)a

i

) subject to P

k

i=1 a

i A,

a

i

0. There exists A

1

(S) >0 such that, forall AA

1 (S),

R zf

(S)=log(1+max

i fb

i

(S)gA)

Hence, by denition of h max

, for all A A

1

(S) we have R zf

(S) R mrc

. By considering

all possible subsets S of cardinality k = 1;:::;m we conclude that R zf max

= R mrc

for

0<Amin

S A

1 (S).

Similarly,consider an ordered set S of cardinality m, let d

1

(S);:::;d

m

(S) denote the

squared diagonal elements of G in the QR decomposition H[S] = GQ and let R rki

(S)

denote the maximum of P

m

i=1

log(1 +d

i (S)a

i

) subject to P

m

i=1 a

i

= A. There exists

A

2

(S) >0such that, forall AA

2 (S),

R rki

(S)=log(1+max

i fd

i

(S)gA)

Hence,bydenitionofh max

,forallA A

2

(S)wehaveR rki

(S)R mrc

. Byconsideringall

possiblesuchsubsets Sweconclude that R rki max

=R mrc

for 0<Amin

S A

2

(S). Then,

there exists anA

3

>0 such that for A2[0;A

3

] wehave R rki max

=R zf max

=R mrc

.

In the case where H has more than one row with maximum squared Euclidean norm

we have to distinguish the case where there exists a subset of mutually orthogonal rows

withmaximalnormfromthecasewhereanysubsetofthemaximalnormrowsismutually

non-orthogonal. In the latter case, the above proof stillholds, and the MRC throughput

can be obviously achieved by transmitting to anyone of the users corresponding to the

maximalnormrows. Intheformercase, itisnotdiÆculttoshowthat thereexists A

4

>0

for which for every A 2 [0;A

4

] both the ZF and the RKI throughputs are maximized

by transmitting with equal power to the users corresponding to the subset of mutually

orthogonal rows with maximalnorm. This concludes the proof.

4 Throughput bounds for the t 1 : r GBC

Inthis sectionweconsider Hdeterministicand xedandwend upperandlowerbounds

to the maximum achievable throughput R of the corresponding t 1 : r GBC. These

allowus toestablishthe asymptoticoptimalityof the RKIscheme for highand lowSNR

provided that the channel matrix has full row-rank. For nite and non-vanishing SNR

these bounds are generally diÆcultto evaluate explicitly,however, they will beuseful to

prove the maximum throughputfor the t1:2 GBC (two-users case) inSection 5.

An upperbound on R is obtained by noticing that the capacity region of a general

broadcast channel depends only on the marginal transition probabilities fp(y

k

jx) : k =

1;:::;rg and not on the joint transition probability p(yjx) [33, 26]. In our case, the

marginaltransitionpdfs are given by

p(y

k jx)=

1

e jy

k h

k

xj 2

; k =1;:::;r

(13)

Any set of marginaltransitionpdfs

p 0

(y

k jx)=

1

k e

jy

k h

k

xj 2

=

k

; k =1;:::;r

with

k

1 yields a GBC capacity region containing that of the original GBC, since

any user k in the new channel can emulate articially the corresponding output of the

originalchannelby addingindependent Gaussiannoisewith variance1

k

. This implies

that the channelsinthe family(1)for given Hand with zN

C (0;

z

), where

z

isany

non-negativedeniteHermitianmatrixwhosediagonalelementsarenotlargerthan1(we

refer to this constraint as the sub-unit diagonal constraint), have all broadcast capacity

region containing the region of the original GBC (and therefore throughput not smaller

thanR ). Inorder totighten thecooperativethroughputbound,wecan choose theworst-

case (cooperative throughput-wise) channelin this family. By usingthe generalcapacity

formulafor the cooperativethroughput [3], we have:

Lemma 1. For any channel matrix H,

R max

x min

z log

det H

x H

H

+

z

det

z

(24)

where minimizationis over allnoise covariances

z

satisfying the sub-unitdiagonalcon-

straintand maximizationis overallinput covariancematrices

x

satisfyingtrace(

x )=

A.

From Theorem 3and Lemma1 we can prove the following

Theorem 4. If Hhas full row-rank, then

lim

A!1

(R R rki max

)=0 (25)

and

lim

A!0 R =R

rki max

=1 (26)

Proof. ItisclearthatR rki max

andR coop

arealowerandanupperboundonR . Therst

statement follows directly from the rst part of Theorem 3, since R rki max

R R

coop

and lim

A!1 (R

coop

R rki max

)=0imply the statement.

Inorder toprovethe second statement,let bethe rr permutationmatrix which

sorts the rows of H such that jh 1

j jh r

j, and consider the QR decomposition

H=GQ. We apply Lemma 1by choosing as noise covariance

z

=GD 2

G H

, where

D =diag(jh 1

j;:::;jh r

j). By construction,

z

is positivedenite (recall that Hhas rank

r)and satisesthe sub-unit diagonalconstraint,in fact,

[

z ]

k;k

= k

X

jg

k;j j

2

jh j

j 2

k

X

jg

k;j j

2

jh k

j 2

=1

(14)

With this choice, the RHS in(24) becomes

log

det GQ

x Q

H

G H

+GD 2

G H

det GD 2

G H

=log

det Q

x Q

H

+D 2

detD 2

From Hadamard inequality [26] we obtain the maximizing signal covariance in the form

x

=Q H

diag(a

1

;:::;a

r

)Q, whichyields the bound

R r

X

k=1

[log(a

k )]

+

(27)

r

k=1

[ 1=jh

k j

2

]

+

=A.

Ifjh 1

j>jh 2

j,thenthereexists avalue0<A

1

<1suchthat,forallAA

1

,theRHS

in (27) is equal tolog(1+jh 1

j 2

A)=R mrc

. This is clearly achievable by RKI and by ZF,

thereforeforA2[0;A

1

]theRKI(andZF)strategyisoptimal(notonlyasymptoticallyfor

A !0). If there exist >1rows with maximal 2-normjh max

j, then there exists a value

0<A

2

<1suchthat, forallA A

2

,the RHSin(27) is equaltolog(1+jh max

j 2

A=).

In this case,

lim

A!0

log(1+jh max

j 2

A=)

log(1+jh max

j 2

A)

=1

and the statementof Theorem 4 stillholds (but only inthe limitfor vanishing A).

Remark: downlinkstrategies. Theorem4showsaninterestingfeatureofthet1:r

GBC and of the RKI strategy, which might have a relevant impact on the design of the

downlinkofwirelesscommunicationsystems. Ifthebase-stationisstronglypowerlimited,

then thethroughput-maximizingstrategyconsistsofMRCbeamformingtothebestuser,

whichis the same optimalstrategy for the standard degraded GBC(t =1). Inthis case,

the transmit antenna array is used to enhance the received SNR of the best user but

does not expand the useful dimensions for transmission. Practical downlink protocols

for high-rate packet communications are currently proposed and implemented according

to this principle: only one user in each time-slotis served according to a channel-driven

schedulingallocatingthechanneltotheuserenjoyingtheinstantaneoushighestindividual

capacity [34, 35].

2

Onthe contrary,ifthe base-stationcan transmitatlargepower, thesame throughput

of a single-user multiple-antenna system can be approached even if the receivers cannot

cooperate. In particular, by lettingthe numberof served users per time-slotequaltothe

numberof transmitantennas, under mildconditionson the channel matrix statisticsthe

slope of the throughput as a function of SNR in dB is equal to t. Hence, in the GBC

setting, the \capacity boost" typical of multiple-antenna systems depends strongly on

the available transmit power. Notice also that the same throughput slopes for low and

2

Inpractice,theschedulermustmaximizethecellthroughputsubjecttosomefairnessconstraint[34,

35], thereforethechannelallocationrulediersfrom thesimple\bestuser" rule.

(15)

high SNR are obtained by the conventional ZF beamforming,although this is generally

asymptoticallysuboptimal for high SNR.

Assume that H has rank m r and, after a suitable row permutation , can be

partitioned as

H=

H

1

H

2

(28)

where H

1 2C

mt

has rank m and H

2 2C

(r m)t

. Any row of H

2

can be expressed as a

linear combinationof rows of H

1

, i.e., we can write

H

2

=BH

1

where B =H

2 H

+

1

. The noise vector is also partitioned as z =[z T

1

;z T

2 ]

T

, where

z1 and

z2

are the mm upper left and (r m)(r m) lower right diagonal blocks of

z

(both

z1

and

z2

must satisfy the sub-unit diagonal constraint). Then, we have the

following:

Lemma 2. If

z2

B

z1 B

H

is non-negative denite, then R R coop

1

, where R coop

1 is

the cooperativethroughput of the tm channely

1

=H

1 x+z

1 .

Proof. Dene the auxiliarychannel with input y

1

and output

y

2

=By

1 +

where N

C (0;

), independent of z

1

and with

=

z2

B

z1 B

H

, which by

assumption is a valid covariance matrix. Among all channels with assigned marginal

transitionpdfs, thereexists the \cascade" channel withjointtransitionpdfp(y

1

;y

2 jx)=

p 0

(y

2 jy

1 )p(y

1

jx), where

p(y

1

jx) = N

C (H

1 x;

z1 )

p 0

(y

2 jy

1

) = N

C (By

1

;

)

For the cascade channel,x!y

1

!y

2

is aMarkov chain and wehave

I(x;y

1

;y

2

)=I(x;y

1

)+I(x;y

2 jy

1

)=I(x;y

1 )

By maximizingI(x;y

1

) with respect to the input distribution (subject tothe inputcon-

straint),we get by denition max

x:trace(x)=A I(x;y

1 )=R

coop

1

.

Lemma 2 is actually \included" in Lemma 1, since R coop

1

is achieved by a particular

choice of

z

in the RHS of (24). However, it puts in evidence the interesting fact that,

under certainconditions onH, the RKI strategyis asymptoticallyoptimalfor high SNR

even if the channel has rankm <r. In particular, we havethe following

(16)

Corollary 2. Let H have rank m r and, after a suitable row permutation, assume

that

H=

H

1

BH

1

(29)

where H

1 2 C

mt

has rank m and kBk 2

1. Then, the RKI scheme is asymptotically

optimal forthe t1:r GBC inthe limitfor large SNR,i.e., lim

A!1

(R R rki max

)=0.

Proof. We apply Lemma 2 with the choice

z1

= I

m

and

z2

= I

r m

. If kBk

2 1,

then I

r m

BB H

is non-negative denite and, by Lemma 2, R R coop

1

, the coopera-

tive throughput of the tm subchannel dened by H

1

. Let R rki max

1

be the maximum

RKI throughput for this subchannel. Since H

1

is full row-rank, by Theorem 3 we have

that lim

A!1 (R

coop

1

R rki max

1

) = 0. Since R rki max

1

R

rki max

R , this implies that

lim

A!1

(R R rki max

)=0, asdesired.

The decomposition (29) (if it exists) is essentially unique, as stated by the following

result in linearalgebra (new up to the authors' knowledge):

Lemma 3. Let H2C rt

have rankm and assume that, aftera suitable rowpermuta-

tion, it can be decomposed into the submatrices H

1 2 C

mt

of rank m and H

2

= BH

1

as in(29), with kBk

2

1. Then, this decomposition isunique , inthe sense that forany

H 0

1 2C

mt

and H 0

2

obtained by exchanging some rows of H

1

with some rows of H

2 such

that rank (H 0

1

)=m, we havekH 0

2 (H

0

1 )

+

k

2 1.

Proof. See Appendix C.

Establishing if a given matrix H admits the decomposition (29) is a problem of

independent interest which can be formulated as follows: given a set of vectors S =

fv

1

;:::;v

r g in C

t

, spanning an m minfr;tg dimensional subspace, nd (if it exists)

a set S 0

of m linearly independent vectors such that all other vectors in S can be writ-

ten as linear combinationsof the vectors in S 0

and such that the matrix of combination

coeÆcients has 2-normnot larger than 1.

The following simple example shows that there exist matrices for which decomposi-

tion (29) is not possible. For these channels, we cannot claim that the RKI strategy is

asymptoticallyoptimal.

Example. LetH2C 3t

ofrank2,andassumethath 3

=

1 h

1

+

2 h

2

. Ifj

1 j

2

+j

2 j

2

>1

and j

2 j

2

<j

1 j

2

<1+j

2 j

2

orj

1 j

2

<j

2 j

2

<1+j

1 j

2

,itisnot possibletoexpressanyof

the rows h 1

, h 2

and h 3

as a linear combination of the other two with coeÆcients 0

and

00

suchthat j 0

j 2

+j 00

j 2

1. The set of coeÆcients

1

;

2

satisfyingthe abovecondition

is clearly non-empty,so, such matrices exist.

Modied RKIstrategy. In ordertotighten theRKI throughputlowerbound wecan

consider the following modied RKI strategy. Let H = GQ and construct the trans-

mitted signal as x = Q H

Ru where R is a mm upper triangular matrix satisfying

(17)

trace(RR H

)Aand whereE[u

i u

H

i ]=I

m

. Thisyieldsthe setofm interference channels

y

k;i

=w

k;k u

k;i +

X

j<k w

k;j u

j;i +

X

j>k w

k;j u

j;i +z

k;i

; k =1;:::;m (30)

where W denotes the upper mm block of GR, and where no information is sent to

users k=m+1;:::;r. Theencoderconsiders theinterference signal P

j<k w

k;j u

j;i

caused

by users j <k as known non-causally and the k-thdecoder treats the interference signal

P

j>k w

k;j u

j;i

caused by users j >k as additionalnoise. Byapplying a coding for known

interference strategy (e.g., the lattice precoding scheme of [17]) and by using minimum

Euclidean distance decoding at each k-th receiver, from the results of Appendix B and

the resultsof [36] it is not hard toshow that the followingthroughput is achievable

R mod rki

= m

X

k=1

log 1+

jw

k;k j

2

1+ P

j>k jw

k;j j

2

!

(31)

This can be further maximized over all matrices R satisfying the trace constraint and

overallordered user subsets Sofsize m. The maximum throughputofthe modied RKI

strategy provides alowerbound toR generallytighterthan the basicRKI strategy,since

the former reduces to the latter by constraining R tobe diagonal.

5 The optimal throughput of the t 1 : 2 GBC

The simplestnon-trivial(i.e., non-degraded)GBCwithmultipletransmitantennasisthe

two-user case. Forthis channel,we havethe following closed-formresult:

Theorem 5. The maximum achievable throughput of the t1:2GBC is given by

R = (

log(1+jh 1

j 2

A) A A

1

log

(Adet (HH H

)+trace(HH H

)) 2

4jh 2

(h 1

) H

j 2

4det (HH H

)

A>A

1

(32)

where withoutloss of generality we assumejh 1

j jh 2

jand where

A

1

= jh

1

j 2

jh 2

j 2

det(HH H

)

Proof. Assumejh 1

jjh 2

j. Thecaseofrank(H)=1istrivial,sinceinthiscasethetwo

rows of Harelinearlydependent, then,the t1:2GBCreduces toastandarddegraded

GBC with input x=h 1

xand outputs

y

1

=x+z

1

; y

2

=(jh 2

j=jh 1

j)x+z

2

The throughput is clearly maximized by transmitting to the best user only [27], i.e., to

1 2

(18)

since in this case A

1

=+1and onlythe rst linein(32) is relevant, therefore,Theorem

5 holds inthe rank1 case.

Next,weconsiderthecaseofHofrank2. WeshalluseLemma1tondanupperbound

andthemodiedRKIstrategytondalowerbound,andshowthattheseboundscoincide

with (32).

Cooperative throughput upperbound. A looser version of Lemma 1 yields the upper-

bound

R min

z 2U

max

x 2A

log

det H

x H

H

+

z

det

z

(33)

where A is the set of all tt non-negative denite covariance matrices with trace A,

and U is the set of all positive denite covariance matrices satisfying the unit-diagonal

constraint (diagonal elements strictly equal to 1). Consider rst the maximization of

mutualinformationwithrespect to

x

foragiven

z

. Byletting

z

=U

z U

H

,with U

unitary and

z

diagonal, weobtain the equivalent problem

max

x2A

logdet H

z

x H

H

z +I

where H

z

= 1=2

z U

H

H. More explicitly,since

z

=

1

we obtain

U= 1

p

2

1 1

;

z

=

1+r 0

0 1 r

where we letr =jj and =exp ( j\).

Let

1 and

2

denotethe eigenvalues ofH

z H

H

z

. Then,the maximizationwithrespect

to

x

2A yieldsthe function of r; and A

f(r;;A)= r

X

i=1

[log(

i )]

+

(34)

where solvesthe equation

2

X

i=1

1

i

+

=A (35)

Explicitly,we have

H

z H

H

z

= 1

2

"

h +

+2

1+r

h +j2!

p

1 r 2

h j2!

p

1 r 2

h +

2

1 r

#

where we dene h +

= jh 1

j 2

+jh 2

j 2

, h = jh 1

j 2

jh 2

j 2

and h 2

(h 1

) H

= +j!. The

eigenvalues

1;2

are given by

1;2

= 1

T p

T 2

4D

(19)

withT =trace(H

z H

H

z

)andD=det (H

z H

H

z

). ItisimmediatetocheckthatDisindepen-

dent of , while T is adecreasing function of . Weconclude that minimizingcapacity

isgiven by

?

=exp ( j\h 2

(h 1

) H

),sothat =jh 2

(h 1

) H

jismaximum. Withthischoice,

we have

T = h

+

2rjh 2

(h 1

) H

j

1 r 2

; D= jh

1

j 2

jh 2

j 2

jh 2

(h 1

) H

j 2

1 r 2

In order toobtain the eigenvalues ina convenient form,it isuseful torepresent the rows

h 1

and h 2

inanorthonormalbasis. Applying Gram-Schmidtorthogonalizationtoh 1

and

h 2

(in the order) we obtain h 1

=g

1;1 q

1

, h 2

=g

2;1 q

1

+g

2;2 q

2

, and where q 1

and q 2

(the

rows of Q) are orthonormal. The explicit expression of the eigenvalues is nowgiven by

1;2

=

1

2(1 r 2

)

jg

1;1 j

2

+jg

2;1 j

2

+jg

2;2 j

2

2r

(jg

1;1 j

2

jg

2;1 j

2

jg

2;2 j

2

) 2

4r(jg

1;1 j

2

+jg

2;1 j

2

+jg

2;2 j

2

)

+ 4r 2

jg

1;1 j

2

(jg

2;1 j

2

+jg

2;2 j

2

)+4 2

1=2 i

(36)

Next,wehavetominimizethemaximum mutualinformationdened by (34)and by(35)

with respect to the noise correlation parameter r 2[0;1) (recall that minimizationwith

respect to is already achieved). We partition the SNR range [0;1) into two intervals,

called infollowing thelow-SNR and the high-SNRregions, anddened by therange ofA

for which 1=

2

or >1=

2

, respectively (notice that

1

2

holds for any channel

matrixand valueofr). Then,weconsider separatelytheminimizationoftheupperbound

on the two regions.

Low SNR region. Letr =r

?

=jg

2;1

=g

1;1

j==jg

1;1 j

2

. Then, weobtain

1

=jg

1;1 j

2

;

2

= jg

1;1 g

2;2 j

2

jg

1;1 j

2

jg

2;1 j

2

For

0AA

1

= jg

1;1 j

2

jg

2;1 j

2

jg

2;2 j

2

jg

1;1 g

2;2 j

2

= jh

1

j 2

jh 2

j 2

det(HH H

)

(37)

the resulting mutual information is log(1+jg

1;1 j

2

A), which coincides with the rst line

of (32). This is achievable under the broadcast channel constraint (non-cooperative re-

ceivers) by MRC beamformingto user 1 only (the best user), and therefore it is clearly

a tight upper bound. For jh 1

j 2

= jh 2

j 2

we have A

1

= 0 and under this condition the

low-SNR case isirrelevant.

High SNR region. Inthis case, (34) and (35) become

f(r;

?

;A) = log(

1

)+log(

2 )

=

1

2

A+ 1

1 +

1

2

(38)

(20)

By substituting weobtain

f(r;

?

;A) = 2log

p

1

2

A+

1 +

2

1

2

2log2

= 2log

p

D(A+T=D)

2log2 (39)

Trace and determinant T and D are writtenin termsof the g

i;j 's as

T = jg

1;1 j

2

+jg

2;1 j

2

+jg

2;2 j

2

2r

1 r 2

; D= jg

1;1 g

2;2 j

2

1 r 2

Bydirectsubstitutionoftheaboveexpressionsinto(39)andbydierentiatingwithrespect

to r we obtaina stationary pointin

r

?

=

2jg

1;1 g

2;1 j

Ajg

1;1 g

2;2 j

2

+jg

1;1 j

2

+jg

2;1 j

2

+jg

2;2 j

2

=

2jh 2

(h 1

) H

j

Adet (HH H

)+trace(HH H

)

Noticethatr

?

isadecreasingfunctionofA,andlim

A!1 r

?

=0. Therefore,theworst-case

noise in the large-SNR case is white. Also, for A = A

1

we obtain r

?

= jg

2;1

=g

1;1 j 1.

Then, the solutionr

?

iscompatiblewith the positive denitenessconstrainton

z

forall

nite A. Finally, we observe that the worst-case noise correlation r

?

is continuous in A

for all A 0, since the limitsof r

?

for A !A

1

from the left and from the right are the

same. Eventually, by substituting r

?

found above into (38) we obtain the second lineof

(32).

ModiedRKIthroughputlowerbound. ThethroughputachievablebythemodiedRKI

strategy is given by

R mod rki

=log

1+ jg

1;1 r

1;1 j

2

a

1

1+jg

1;1 r

1;2 j

2

a

2

+log 1+jg

2;1 r

1;2 +g

2;2 r

2;2 j

2

a

2

(40)

where we let

R=

p

a

1 r

1;1 p

a

2 r

1;2

0

p

a

2 r

2;2

subject to the constraint trace(RR H

)=A, whichis writtenexplicitlyas

jr

1;1 j

2

a

1 +(jr

1;2 j

2

+jr

2;2 j

2

)a

2

=A

This must be maximized with respect to a

1

;a

2

and the coeÆcients r

1;1

;r

1;2

;r

2;2 . To

this purpose, we reparameterize the problem by letting b = jg

2;1

=g

2;2

j, z = jr

1;2

=r

2;2 j,

q =z 2

=(1+z 2

) and p = (bz+1) 2

=(1+z 2

), X

1

= jr

1;1 j

2

a

1

and X

2

=(jr

1;2 j

2

+jr

2;2 j

2

)a

2 .

Then, we obtain

R mod rki

=log

1+ jg

1;1 j

2

X

1

1+jg

1;1 j

2

qX

2

+log(1+jg

2;2 j

2

pX

2

) (41)

(21)

with the constraint X

1 +X

2

= A. For z =0, the modied RKI strategy reduces to the

standardRKIstrategyandforX

1

=AandX

2

=0itachieves(32)inthelow-SNRregion.

Therefore, weshall consider onlythe high-SNRregion AA

1 .

By dividing all elements of G by jg

1;1

j and by replacing the input constraint A by

a

=Ajg

1;1 j

2

weobtain the equivalent problem

R mod rki

=log

1+ x

1

1+qx

2

+log(1+px

2

) (42)

where x

1 +x

2

= a and where, by letting z = p

q=(1 q) and b = p

= with =

jg

2;1

=g

1;1 j

2

and =jg

2;2

=g

1;1 j

2

, we have the relation

p=

p

q+ p

(1 q)

2

The high-SNR condition AA

1

translates intothe condition a (1 )=.

For any xed q 2 [0;1], we let x

1

= a x and x

2

= x in (42) and maximize the

resultingexpression forx2[0;a]. Byletting

@

@x R

mod rki

=0and makingthe substitution

y =1+qx, we obtainthe solution

y= s

(1+aq)(p q)

p(1 q)

(43)

which isvalid if pq and y1. The rst condition yieldsthe inequality

p

q+ p

(1 q)

2

q

which implies

0qq

max

=

(1 p

) 2

+ 1

The second condition yields the inequality

p

q+ p

(1 q)

2

1+aq

1+a

By lettingq =z 2

=(1+z 2

)this isturned intothe secondorder inequality

( 1)z 2

+2 p

z+ 1=(1+a)0

which impliesz 2[z

1

;z

2

], where

z

1;2

= p

q

1

1+a

1

(22)

Fora(1 )=itiseasytocheck that (1 )=(1+a)0,thereforethe above

solution always exists in the high-SNR region. It is easy tocheck that

0q

1

= z

2

1

1+z 2

1 q

2

= z

2

1+z 2

2 q

max

Therefore, the solution(43)isvalidintheintervalq2[q

1

;q

2

]andthe maximumthrough-

put canbeobtained byrst substituting(43)into(42)and thenmaximizingwithrespect

to q2[q

1

;q

2

]. After substitution, the function tobe maximizedis

2log

"

1

q q

( p

q+ p

(1 q)) 2

(1+aq) r

( p

q+ p

(1 q)) 2

q

(1 q)

!#

(44)

Bysubstitutingagain q=z 2

=(1+z 2

)intothe aboveexpression andlettingthe derivative

with respecttoz equal tozerowe obtaina5-th orderequation,whoserootscan be given

(quite fortuitously!) inclosed form asz

1

;z

2

given above, z

3;4

= p

= and

z

5

=

2 p

(a+1)+1

It can bechecked that z

5 2[z

1

;z

2

], and therefore itis the sought maximum.

Finally, by substituting the resulting

q

?

= z

2

5

1+z 2

5

=

4

((a+1)+1 ) 2

+4

into(44), we obtainthe maximumthroughput as

R mod rki

=log

(a+1++) 2

4

(45)

which coincides with the second lineof (32). This concludes the proof.

Example: average throughput inRayleighfading. WeuseTheorem5tocompute

the maximum average throughput

R of the composite channel when H has i.i.d. entries

N

C

(0;1)(independent Rayleighfading). Subjecttothe short-termconstraint,wehave

simply

R (A) =E

H

[R (A)] where R (A) is given by (32) and where we put in evidence its

dependence on the input constraint A. The maximum average throughput subject to a

long-termconstraint isobtained by solving

max E

H [R (a)]

subject to E

H

[a]=A; a 0

(46)

ByusingthestandardLagrange-Kuhn-Tuckertechnique[32],aftersomealgebraweobtain

the optimaltransmitpowerallocationinthe form

a(H)= 8

<

:

[ 1=jh 1

j 2

]

+

for jh

1

j 4

jh 2

(h 1

) H

j 2

jh 1

j 2

det (HH H

)

+ q

2

+ 4jh

2

(h 1

) H

j 2

H 2

trace (HH H

)

H

otherwise

(47)

(23)

where satisesE

H

[a(H)]=A. The condition jh

1

j 4

jh 2

(h 1

) H

j 2

jh 1

j 2

det (HH H

)

isequivalenttoa(H)

A

1

, whereA

1

isgiven inTheorem5. Hence,by substituting(47) inthe placeofAin(32),

we obtainexplicitlythe optimalthroughputsubject tothe long-termpowerconstraintas

R (A) =E

H [f

(H)] where

f

(H)=

8

<

:

[log (jh 1

j 2

)]

+

for jh

1

j 4

jh 2

(h 1

) H

j 2

jh 1

j 2

det (HH H

)

log h

det (HH H

)

+ q

2

+ 4jh

2

(h 1

) H

j 2

det(HH H

) 2

i

log2 otherwise

(48)

Figs. 1 and 2 show

R in the case t = r =2 for the short and the long-termconstraints,

respectively, vs. E

b

=N

0

. Forthe sake of comparison, we show alsothe 22 cooperative

and ZFthroughputs.

In this work, E

b

=N

0

forthe t1:r GBC isdened as[37]

E

b

N

0

= tA

R

(49)

the factor t in the numerator of (49) is introduced totake intoaccount that, undermild

conditions onH, the averagereceived energy perchannel use increases linearly with t for

MRCbeamformingtoanygivenuser (e.g.,ifHhasi.i.d. elementswithunit secondorder

moment the individual user channels have average gain t). For the cooperative system,

since tr and rt channelsyield the same throughput [3], we adopt the denition

E

b

N

0

=

maxfr;tgA

R

(50)

For the short-term constraint, there exists aminimum (E

b

=N

0 )

min

>0 below which

R is

zero.

3

Thiscan be calculatedby lettingA #0 in(49)and in(50). Forthe 21:2 GBC

(basic RKI, ZF and modied RKIschemes) we obtain

E

b

N

0

min

=

2log2

E[jh 1

j 2

]

=

2log2

(11=4)

= 2:97dB

where weused the fact that jh 1

j 2

isdistributed asthe maximum of two independentand

identically distributed central Chi-squared random variables with 4 degrees of freedom.

For the 22cooperative system we have

E

b

N

0

min

=

2log2

E[c

1 ]

=

2log2

(7=2)

= 4:02dB

wherewe usedthe factthatc

1

isthe maximumeigenvalueof the22Wishart matrix[3]

HH H

.

From Figs.1 and 2 it isclearly visible that the basic RKI scheme is optimalboth for

E

b

=N

0

#(E

b

=N

0 )

min

andforE

b

=N

0

!1. ForsmallSNR,itis(asymptotically)equivalent

3

For the long-term constraint (E

b

=N

0 )

min

= 0 since jh 1

j 2

has a distribution with unbounded sup-

(24)

0 2 4 6 8 10

-8 -6 -4 -2 0 2 4 6 8 10

R (bit/s/Hz)

E _b /N ₀ (dB)

Independent Rayleigh fading, short-term constraint Mod-RKI (optimal GBC)

RKI Zero-Forcing Cooperative

Figure1: Throughputof the 21:2 GBCwith independent Rayleighfadingand short-

term input constraint.

toZFand both strategiesreducetosimpleMRCbeamformingtothebest user. Forlarge

SNR, it is (asymptotically) equivalent to the cooperative single-user multiple-antenna

capacity. This is a consequence of the fact that for independent Rayleigh fading the

channel matrix H has full-rank almost surely, therefore Theorem 3 applies to almost all

realizations of the channel.

6 Performance of the basic RKI strategy with Rayleigh

fading

In this section we focus on the basic RKI scheme when the channel matrix has i.i.d.

entries N

C

(0;1),and weconsiderthe throughputofthecompositechannelsubjecttoa

long-termpowerconstraint. Weconsider boththenitedimensionalandthelarge-system

limit,i.e., whenr;t !1 whilethe ratior=tusers/transmitantennaconvergesto agiven

constant (referred to in the following as antenna loading). Moreover, we assume that

no eort is made to optimize the user ordering. In other words, the receiver works with

the natural ordering of the rows of H and considers the coeÆcients fd

i

: i = 1;:::;mg

corresponding to the QRdecomposition H=GQ.

It interesting to notice that since the composite channel is symmetric with respect

to any user by time-sharing with uniform probability over all possible user subsets and

(25)

0 2 4 6 8 10

-8 -6 -4 -2 0 2 4 6 8 10

R (bit/s/Hz)

E _b /N ₀ (dB)

Independent Rayleigh fading, long-term constraint Mod-RKI (optimal GBC)

RKI Zero-Forcing Cooperative

Figure 2: Throughput of the 21: 2GBC with independent Rayleigh fadingand long-

term input constraint.

orderings, every user inthesystem achieves the sameaverage per-userrate

=

R =r with

no lossof optimality inthe total throughput.

We make use of the followingresults [39, 40, 41, 42]:

Lemma 4. LetH2C rt

have i.i.d. entries N

C

(0;1),and letg

i;i

bethei-th diagonal

elementof Ginthe QRdecompositionH=GQ. Then, the randomvariablesd

i

=jg

i;i j

2

arestatisticallyindependentandd

i

2

2(t i+1)

,where 2

2k

denotesthecentralChi-squared

distribution with 2k degrees of freedom, whosepdf isf(z)=z k 1

e z

=(k 1)!.

Lemma 5. Let = i=r 2 [0;1] denote the normalized user index. If H 2 C rt

(for

r=t = ) has i.i.d. circularly-symmetric elements with mean 0, variance 1 and bounded

fourth moment,then

lim

r!1 1

t d

i

=[1 ]

+

(51)

with probability 1.

Inthecaseofniter;t,Corollary1yieldssolutionof P

m

i=1 E

h

[ 1

di ]

+ i

=A,where,

(26)

from Lemma4, we have

m

X

i=1 E

1

d

i

+

= m

X

i=1 Z

1

1=

1

z

z t i

e z

(t i)!

dz

= m

X

i=1 1

(t i)!

[ (t i+1;1=) (t i;1=)] (52)

where (n;x)

= R

1

x z

n 1

e z

dz, and where (0;x) = E

i

(1;x) and, for integer n 1,

(n;x)=(n 1)!e x

P

n 1

j=0 x

j

=j!.

The resultingRKI average throughputis given by

R rki

= m

X

i=1 E

[log(d

i )]

+

= m

X

i=1 1

t i+1

Z

1

log(z) z

t i

e z=

(t i)!

dz

= m

X

i=1 J

t i+1

() (53)

where we let[43]

J

k (a)

= a k

Z

1

log(z) z

k 1

e z=a

(k 1)!

dz

= E

i

(1;1=a)+ k 1

X

`=1 1

` e

1=a

` 1

X

j=0 a

j

j!

(54)

WiththesameassumptionsonthestatisticsofH,theelementsb

i

intheZFthroughput

formula (14) are identically distributed 2

2(t k+1)

, for any subset of active users of

cardinality k m. Here, the cardinality k of the active user set can be chosen in order

to optimizefurther the ZF throughput. By replicating the calculation done above inthe

case ofk activeusers weobtaintheZF throughputsubjecttothe long-termconstraintas

R zf

= max

k=1;:::;m kJ

t k+1 (

k

) (55)

where

k

is the solution of the waterllingequation

k

(t k)!

[ (t k+1;1=) (t k;1=)]=A

analogous to (52).

The throughput of the cooperative system can be obtained from the pdf of a single

eigenvalue of the mm Wishart matrix

W=

HH H

for rt

H

(56)

On the achievable throughput of a multi-antenna Gaussian broadcast channel

0 2 4 6 8 10

-8 -6 -4 -2 0 2 4 6 8 10

R (bit/s/Hz)

E b /N 0 (dB)

Independent Rayleigh fading, short-term constraint Mod-RKI (optimal GBC)

RKI Zero-Forcing Cooperative

0 2 4 6 8 10

-8 -6 -4 -2 0 2 4 6 8 10

R (bit/s/Hz)

E b /N 0 (dB)

Independent Rayleigh fading, long-term constraint Mod-RKI (optimal GBC)

RKI Zero-Forcing Cooperative

E _b /N ₀ (dB)

E _b /N ₀ (dB)