Semiblind Channel Estimation for MIMO Spatial Multiplexing Systems

Abdelkader Medles, Dirk T.M. Slock
Institut Eurécom
2229 route des Crêtes, B.P. 193, 06904 Sophia Antipolis Cedex, FRANCE
Tel: +33 4 9300 2910/2606; Fax: +33 4 9300 2627
e-mail: {medles,slock}@eurecom.fr

Abstract

In this paper, we analyze the semiblind mutual information (MI) between the input and output of a MIMO channel. To that end we consider the popular block fading model. We assume that some training/pilot symbols get inserted at the beginning of each burst. We show that the average MI over a transmission burst can be decomposed into symbol position dependent contributions. The MI component at a certain symbol position optimally combines semiblind information up to that symbol position (with perfect input recovery up to that position) with blind information from the rest of the burst. We also analyze the asymptotic regime, for which we can formulate optimal channel estimates and evaluate the capacity loss with respect to the known channel case. Asymptotically, the decrease in MI involves Fisher Information Matrices for certain channel estimation problems. We also suggest exploiting correlations in the channel model to improve estimation performance and minimize capacity loss.

1: Introduction

We consider single user spatial multiplexing systems over flat fading MIMO channels. We furthermore consider the usual block fading model, in which data gets transmitted over a number of bursts such that the channel is constant over a burst but fades independently between bursts. The transmitter is assumed to have no channel knowledge. The formidable capacity increase realizable with such dual antenna array systems has been shown [5],[2] to be proportional to the minimum of the antenna array dimensions for channels with i.i.d. fading entries. At least, this is the case when the receiver has perfect channel knowledge. This condition is fairly straightforward to approach in SISO systems by inserting pilot/training data in the transmission, with acceptable capacity decrease [1]. For MIMO systems of large dimensions though, the training overhead for good channel estimation quality becomes far from negligible, especially for higher Doppler speeds such as in mobile communications. The effect of channel estimation errors on the MI has been analyzed in [4], whereas the optimal design of training based channel estimation has been addressed in [6]. The true capacity of the system is higher, though, than the MI obtained by training based channel estimates [2],[3].

(Footnote: Eurécom's research is partially supported by its industrial partners: Ascom, Swisscom, Thales Communications, ST Microelectronics, CEGETEL, Motorola, France Télécom, Bouygues Telecom, Hitachi Europe Ltd. and Texas Instruments. The work reported herein was also partially supported by the French RNRT project ERMITAGES.)

In this paper, we attempt to approach the true capacity by optimal semiblind channel estimation that is suggested by a decomposition of the MI. Related background material appeared in [7], where a variety of semiblind MIMO channel estimation techniques were introduced.

2: Capacity Decomposition

We shall consider here the usual block fading model, except that we shall refer to a block as a burst. Consider then transmission over a MIMO flat fading channel $y = Hx + v$ for a particular burst of $T$ symbol periods. The accumulated received signal over the burst is then:
$$Y = (I_T \otimes H)\, X + V \qquad (1)$$
where $Y$ and $V$ are $N_r T \times 1$ and $X$ is $N_t T \times 1$; $H$ is an $N_r \times N_t$ channel matrix, and $N_t$ (resp. $N_r$) is the number of transmit (resp. receive) antennas. We assume the use of a pilot training sequence of length $N_t \times T_p$; the length of the transmitted data is then $N_t \times T_d$, so that $T_p + T_d = T$. We can decompose the burst signal into training and data parts: $X = (X_p^T\ X_d^T)^T$, $Y = (Y_p^T\ Y_d^T)^T$ and $V = (V_p^T\ V_d^T)^T$.
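To make the burst model concrete, here is a minimal Python simulation sketch (all dimensions, the noise level and the i.i.d. Rayleigh fading are illustrative assumptions, not values prescribed by the paper); it also checks the stacked form (1) against the per-symbol form $y = Hx + v$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative burst/model dimensions
Nt, Nr = 2, 4      # transmit / receive antennas
Tp, Td = 4, 60     # pilot and data symbol periods
T = Tp + Td

# Block fading: one i.i.d. Rayleigh channel draw for the whole burst
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)

# Input burst: known pilots X_p followed by data X_d (white, unit power here)
Xp = (rng.standard_normal((Nt, Tp)) + 1j * rng.standard_normal((Nt, Tp))) / np.sqrt(2)
Xd = (rng.standard_normal((Nt, Td)) + 1j * rng.standard_normal((Nt, Td))) / np.sqrt(2)
X = np.hstack([Xp, Xd])                      # Nt x T

sigma_v = 0.3
V = sigma_v * (rng.standard_normal((Nr, T)) + 1j * rng.standard_normal((Nr, T))) / np.sqrt(2)

# Per-symbol form y = H x + v; stacking the T symbol periods gives eq. (1)
Y = H @ X + V
Y_stacked = np.kron(np.eye(T), H) @ X.T.reshape(-1) + V.T.reshape(-1)
assert np.allclose(Y.T.reshape(-1), Y_stacked)

Yp, Yd = Y[:, :Tp], Y[:, Tp:]                # training / data parts of the output
```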

As stated in [6], the mutual information between the input and the output of this system verifies:
$$I(Y_p, Y_d; X_d \,|\, X_p) = I(Y_d; X_d \,|\, X_p, Y_p) + I(Y_p; X_d \,|\, X_p) = I(Y_d; X_d \,|\, X_p, Y_p) \qquad (2)$$
since $I(Y_p; X_d \,|\, X_p) = 0$ due to the independence between $X_d$ and $(Y_p, X_p)$.
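As a sanity check of (2), the following sketch enumerates a tiny discrete toy instance (binary channel and symbols with ternary noise, chosen purely so that the joint distribution can be enumerated exactly; this toy is an illustration of the identity, not the paper's model) and verifies both the chain rule and $I(Y_p; X_d \,|\, X_p) = 0$:

```python
import itertools
from collections import defaultdict
from math import log2

# Tiny discrete model: H, X_p, X_d uniform on {-1,+1}; ternary noise.
# Y_p = H*X_p + V_p, Y_d = H*X_d + V_d (scalar T_p = T_d = 1 case).
H_vals = [(-1, .5), (1, .5)]
X_vals = [(-1, .5), (1, .5)]
V_vals = [(-1, .1), (0, .8), (1, .1)]

# Joint pmf over (Xp, Yp, Xd, Yd), marginalizing H, Vp, Vd.
joint = defaultdict(float)
for (h, ph), (xp, pxp), (xd, pxd), (vp, pvp), (vd, pvd) in itertools.product(
        H_vals, X_vals, X_vals, V_vals, V_vals):
    joint[(xp, h * xp + vp, xd, h * xd + vd)] += ph * pxp * pxd * pvp * pvd

def cond_mi(joint, a_idx, b_idx, c_idx):
    """I(A;B|C) in bits from a joint pmf {outcome_tuple: prob}."""
    pabc, pac, pbc, pc = (defaultdict(float) for _ in range(4))
    for k, p in joint.items():
        a = tuple(k[i] for i in a_idx); b = tuple(k[i] for i in b_idx)
        c = tuple(k[i] for i in c_idx)
        pabc[(a, b, c)] += p; pac[(a, c)] += p; pbc[(b, c)] += p; pc[c] += p
    return sum(p * log2(p * pc[c] / (pac[(a, c)] * pbc[(b, c)]))
               for (a, b, c), p in pabc.items() if p > 0)

# Indices in the outcome tuple: 0=Xp, 1=Yp, 2=Xd, 3=Yd
lhs = cond_mi(joint, (1, 3), (2,), (0,))      # I(Yp,Yd; Xd | Xp)
rhs = cond_mi(joint, (3,), (2,), (0, 1))      # I(Yd; Xd | Xp, Yp)
extra = cond_mi(joint, (1,), (2,), (0,))      # I(Yp; Xd | Xp), should vanish
assert abs(lhs - rhs) < 1e-9 and abs(extra) < 1e-9
print(lhs, rhs, extra)
```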

Consider now a partition of $X_d$ in $Q$ blocks $X_i$, $i = 1 \ldots Q$, $X_d = (X_1^T \ldots X_Q^T)^T$, of different lengths $T_i$, $i = 1 \ldots Q$, $\sum_i T_i = T_d$. $X_i$ and $V_i$ are assumed independent from block to block (block-wise coding across bursts). Let's define, for $j \geq i$, $X_i^j = (X_i^T\ X_{i+1}^T \ldots X_j^T)^T$ and $Y_i^j = (Y_i^T\ Y_{i+1}^T \ldots Y_j^T)^T$. Then,

$$\begin{aligned}
I(Y_d; X_d \,|\, X_p, Y_p) &= I(Y_1^Q; X_1, X_2^Q \,|\, X_p, Y_p)\\
&= I(Y_1^Q; X_1 \,|\, X_p, Y_p) + I(Y_1^Q; X_2^Q \,|\, X_p, Y_p, X_1)\\
&= I(Y_1^Q; X_1 \,|\, X_p, Y_p) + I(Y_2^Q; X_2^Q \,|\, X_p, Y_p, X_1, Y_1) + \underbrace{I(Y_1; X_2^Q \,|\, X_p, Y_p, X_1)}_{=0}
\end{aligned} \qquad (3)$$

where $I(Y_1; X_2^Q \,|\, X_p, Y_p, X_1) = h(Y_1 \,|\, X_p, Y_p, X_1) - h(Y_1 \,|\, X_p, Y_p, X_1, X_2^Q)$; considering the fact that $X_2^Q$ is independent of $Y_1$ conditioned on $(X_p, Y_p, X_1)$, this leads to $h(Y_1 \,|\, X_p, Y_p, X_1, X_2^Q) = h(Y_1 \,|\, X_p, Y_p, X_1)$ and finally $I(Y_1; X_2^Q \,|\, X_p, Y_p, X_1) = 0$. Iterating the equation for $i = 2 \ldots Q$ leads to the following:

$$\begin{aligned}
I(Y_d; X_d \,|\, X_p, Y_p) &= \sum_{i=1}^{Q} I(Y_i^Q; X_i \,|\, X_p, Y_p, X_1^{i-1}, Y_1^{i-1})\\
&= \underbrace{\sum_{i=1}^{Q} I(Y_i; X_i \,|\, X_p, Y_p, X_1^{i-1}, Y_1^{i-1})}_{I_1}
 + \underbrace{\sum_{i=1}^{Q-1} I(Y_{i+1}^Q; X_i \,|\, X_p, Y_p, X_1^{i-1}, Y_1^{i})}_{I_2}
\end{aligned}$$

This clearly shows the way of processing to achieve capacity: for every block we use the already detected blocks as a (data-aided (DA)) training sequence (in addition to the actual training) and use the not yet detected blocks as blind information; a simplified sketch of this successive processing follows below. Thus the mutual information can be seen as the sum of two parts, $I(Y_d; X_d \,|\, X_p, Y_p) = I_1 + I_2$, where:

$I_1$: the rate that we can achieve by using only the already treated blocks as side information (DA training).

$I_2$: the additional rate that can be achieved by exploiting the blind information contained in the not yet detected blocks.
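A minimal sketch of the successive, decision-directed processing in Python (the sizes, QPSK alphabet, least-squares channel estimation and zero-forcing detection are all illustrative simplifications; in particular the blind contribution $I_2$ from not-yet-detected blocks is ignored here):

```python
import numpy as np

rng = np.random.default_rng(1)
Nt, Nr, Tp, Td = 2, 4, 4, 40                 # illustrative sizes
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)

def qpsk(shape):
    return (rng.choice([-1, 1], shape) + 1j * rng.choice([-1, 1], shape)) / np.sqrt(2)

Xp, Xd = qpsk((Nt, Tp)), qpsk((Nt, Td))
Y = H @ np.hstack([Xp, Xd]) + 0.1 * (rng.standard_normal((Nr, Tp + Td))
        + 1j * rng.standard_normal((Nr, Tp + Td))) / np.sqrt(2)

Xd_hat = np.zeros((Nt, Td), complex)
for i in range(Td):                          # blocks of one symbol period each
    # DA training: pilots plus all previously detected data blocks.
    Xknown = np.hstack([Xp, Xd_hat[:, :i]])
    Hhat = Y[:, :Tp + i] @ np.linalg.pinv(Xknown)   # least-squares estimate
    s = np.linalg.pinv(Hhat) @ Y[:, Tp + i]         # zero-forcing detection
    Xd_hat[:, i] = (np.sign(s.real) + 1j * np.sign(s.imag)) / np.sqrt(2)

print("symbol error rate:", np.mean(Xd_hat != Xd))
```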

The average mutual information is defined as $I_{avg} = \frac{1}{T}\, I(Y_d; X_d \,|\, X_p, Y_p)$. We want to show below that this quantity goes to the coherent MI $I(y; x \,|\, H)$ as $T$ and $Q$ grow for finite $T_p$, where $I(y; x \,|\, H) = \lim_{T\to\infty} \frac{1}{T}\, I(Y; X \,|\, H) = \lim_{i\to\infty} \frac{1}{T_i}\, I(Y_i; X_i \,|\, H)$. An upper bound is given by:

$$\begin{aligned}
I_{avg} &= \tfrac{1}{T}\big(h(X_d) - h(X_d \,|\, X_p, Y_p, Y_d)\big)\\
&\leq \tfrac{1}{T}\big(h(X_d) - h(X_d \,|\, X_p, Y_p, Y_d, H)\big)\\
&= \tfrac{1}{T}\big(h(X_d) - h(X_d \,|\, Y_d, H)\big)\\
&= \tfrac{1}{T}\big(I(Y; X \,|\, H) - I(Y_p; X_p \,|\, H)\big)\\
&\Rightarrow\ \lim_{T\to\infty} I_{avg} \leq I(y; x \,|\, H). \qquad (4)
\end{aligned}$$
Also,

$$\begin{aligned}
I(Y_i^Q; X_i \,|\, X_p, Y_p, X_1^{i-1}, Y_1^{i-1})
&= h(X_i) - h(X_i \,|\, X_p, Y_p, X_1^{i-1}, Y_1^{i-1}, Y_{i+1}^Q, Y_i)\\
&\geq h(X_i) - h\big(X_i \,\big|\, Y_i,\ \hat H(X_p, Y_p, X_1^{i-1}, Y_1^{i-1}, Y_{i+1}^Q)\big)\\
&= h(X_i) - h\big(X_i \,\big|\, Y_i,\ \hat H(X_p, X_1^{i-1}, \underbrace{Y_p, Y_1^{i-1}, Y_{i+1}^Q}_{=\,Y_{\bar\imath}})\big)\\
&= I(Y_i; X_i \,|\, \hat H^{(i)}) \qquad (5)
\end{aligned}$$
where $\hat H^{(i)} = \hat H(X_p, X_1^{i-1}, Y_{\bar\imath})$ is the optimal estimate of the channel (a statistic of reduced dimension), which verifies $\lim_{i\to\infty} \hat H^{(i)} = H$ a.s. Given that, $\lim_{i\to\infty} \frac{1}{T_i}\, I(Y_i^Q; X_i \,|\, X_p, Y_p, X_1^{i-1}, Y_1^{i-1}) \geq I(y; x \,|\, H)$. This allows us to conclude that $\lim_{T\to\infty} I_{avg} \geq I(y; x \,|\, H)$. Combining this result with (4), we conclude that:
$$\lim_{T\to\infty} \tfrac{1}{T}\, I(Y_d; X_d \,|\, X_p, Y_p) = I(y; x \,|\, H).$$
This means that when $T$ grows, adopting the per-block detection method and the associated channel estimation allows one to reach the capacity of the system assuming perfect channel knowledge (see [3] for related results at high SNR).

Remark 1: When $T$ grows, using only the detected blocks to estimate the channel allows one to achieve asymptotically the MI $I_{avg}$ of the system. But for finite $T$, it is necessary to also use the blind information to reach it.

Remark 2: For a fixed $T$, and when all the entries of $X$ (training and data) are i.i.d., it is easy to see that the average MI $I_{avg}$ of the system is maximized when the number of training symbols $T_p$ is as small as possible (i.e., just large enough to allow semiblind identifiability of the channel).

Remark 3: In the case where $H$ is no longer constant, but varies from block to block, $H = H^{(i)}$, the (coherent) capacity of the system assuming perfect channel knowledge is no longer achievable for large $T$, and the average MI is bounded by:
$$\tfrac{1}{T} \sum_{i=1}^{Q} I(Y_i; X_i \,|\, \hat H_i) \;\leq\; I_{avg} \;\leq\; \tfrac{1}{T} \sum_{i=1}^{Q} I(Y_i; X_i \,|\, H_i) \qquad (6)$$
for any channel estimate $\hat H_i = \hat H_i(X_p, X_1^{i-1}, Y_{\bar\imath})$.

3: Channel estimation

3.1 Bayesian case (random channel with prior)

The capacity $C$ of the system is the maximum MI $I_{avg}$ over all input distributions, under a given power constraint. We have:
$$\tfrac{1}{T} \sum_{i=1}^{Q} I(Y_i; X_i \,|\, \hat H^{(i)}) \;\leq\; C \;\leq\; \max_{p(X_d):\ \mathrm{tr}(R_{X_d}) \leq T_d N_t \sigma_x^2}\ \tfrac{1}{T} \sum_{i=1}^{Q} I(Y_i; X_i \,|\, H). \qquad (7)$$
This is valid for all choices of the partitioning of $X_d$ into $X_i$, $i = 1 \ldots Q$, and in particular for $Q = T_d$ and $T_i \equiv 1$.

For AWGN $V$ with power $\sigma_v^2$ and in the absence of side information on the channel at the transmitter (see [5]), the max in the upper bound of the capacity (the coherent capacity) is attained for a centered white Gaussian input with covariance $R_{X_d} = \sigma_x^2 I_{N_t T_d}$:
$$\tfrac{1}{T} \sum_{i=1}^{T_d} I(Y_i; X_i \,|\, \hat H^{(i)}) \;\leq\; C \;\leq\; \frac{T - T_p}{T}\, E \ln\det\Big(I + \frac{\sigma_x^2}{\sigma_v^2}\, H H^H\Big) \qquad (8)$$
where $\hat H^{(i)} = \hat H(X_p, X_1^{i-1}, Y_{\bar\imath})$.
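The coherent term in the upper bound of (8) is easy to evaluate by Monte Carlo; a small sketch for i.i.d. Rayleigh fading (the dimensions and SNR are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(2)
Nt, Nr = 2, 4                  # illustrative dimensions
snr = 10.0                     # sigma_x^2 / sigma_v^2

# Monte Carlo evaluation of E ln det(I + snr * H H^H) for i.i.d. Rayleigh H
trials = 20000
acc = 0.0
for _ in range(trials):
    H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
    acc += np.linalg.slogdet(np.eye(Nr) + snr * H @ H.conj().T)[1]
print("E ln det(I + snr * H H^H) ≈", acc / trials, "nats per channel use")
```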

The received signal is $Y_i = H X_i + V_i = \hat H^{(i)} X_i + \tilde H^{(i)} X_i + V_i = \hat H^{(i)} X_i + V_i + Z_i$, where we assume $\hat H^{(i)}$ to satisfy the Pythagorean Theorem (PT) (i.e., $\hat H$ is decorrelated from the estimation error $\tilde H$). Now $I(Y_i; X_i \,|\, \hat H^{(i)}) = h(Y_i \,|\, \hat H^{(i)}) - h(Y_i \,|\, X_i, \hat H^{(i)}) = h(Y_i \,|\, \hat H^{(i)}) - h(V_i + Z_i \,|\, X_i, \hat H^{(i)})$. Under the above conditions and for uncorrelated and centered Gaussian $V_i$ and $X_i = X_i^G$ (a variable with the same 1st and 2nd order moments, but Gaussian), it was shown in [6] that a lower bound for $I(Y_i; X_i \,|\, \hat H^{(i)})$ is obtained by treating $Z_i$ as an independent white Gaussian noise with covariance $\sigma_z^2 I$, where
$$\sigma_z^2 = \tfrac{1}{N_r}\, \mathrm{tr}\, E(Z_i Z_i^H) = \sigma_x^2\, \frac{\mathrm{tr}\, E(\tilde H^{(i)} \tilde H^{(i)H})}{N_r} = N_t\, \sigma_x^2\, \sigma_{\tilde H}^2(i), \qquad \sigma_{\tilde H}^2(i) = \frac{\mathrm{tr}\, E(\tilde H^{(i)} \tilde H^{(i)H})}{N_r N_t}.$$
The new lower bound is now:
$$C \;\geq\; \tfrac{1}{T} \sum_{i=1}^{T_d} I(Y_i; X_i \,|\, \hat H^{(i)}) \;\geq\; \tfrac{1}{T} \sum_{i=1}^{T_d} E \ln\det\Big(I + \frac{\sigma_x^2}{\sigma_v^2 + N_t \sigma_x^2 \sigma_{\tilde H}^2(i)}\, \hat H^{(i)} \hat H^{(i)H}\Big) \qquad (9)$$

Let $\sigma_{\hat H}^2(i) = \frac{\mathrm{tr}\, E(\hat H^{(i)} \hat H^{(i)H})}{N_r N_t}$ and $\bar H^{(i)} = \hat H^{(i)} / \sigma_{\hat H}(i)$; then, due to the fact that the channel estimator satisfies the Pythagorean Theorem, we have $\sigma_{\hat H}^2(i) + \sigma_{\tilde H}^2(i) = \sigma_H^2$. Now:
$$C \;\geq\; \tfrac{1}{T} \sum_{i=1}^{T_d} E \ln\det\Big(I + \frac{\sigma_x^2\big(\sigma_H^2 - \sigma_{\tilde H}^2(i)\big)}{\sigma_v^2 + N_t \sigma_x^2 \sigma_{\tilde H}^2(i)}\, \bar H^{(i)} \bar H^{(i)H}\Big) = C_{LB} \qquad (10)$$
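The dependence of the lower bound (10) on the channel estimation MSE can be visualized with a short Monte Carlo sweep; this is a sketch under the simplifying assumption that the normalized estimate $\bar H^{(i)}$ has i.i.d. unit-variance complex Gaussian entries (parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
Nt, Nr, trials = 2, 4, 5000
sx2, sv2, sH2 = 1.0, 0.1, 1.0

# Per-block term of (10), E ln det(I + rho * Hbar Hbar^H), as a function of
# the channel estimation MSE sigma_He2; Hbar is drawn i.i.d. CN(0,1) as a
# stand-in for the distribution of the normalized channel estimate.
for sHe2 in (0.0, 0.01, 0.05, 0.2):
    rho = sx2 * (sH2 - sHe2) / (sv2 + Nt * sx2 * sHe2)
    acc = 0.0
    for _ in range(trials):
        Hb = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
        acc += np.linalg.slogdet(np.eye(Nr) + rho * Hb @ Hb.conj().T)[1]
    print(f"sigma_He2 = {sHe2:<5} ->  E lndet = {acc / trials:.3f} nats")
```

The term decreases monotonically with the MSE, which is what motivates the MMSE estimator introduced next.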

In (10), the expectation is over the distribution of $\bar H^{(i)}$, which remains close to that of the normalized channel $\bar H = H / \sigma_H$. The given capacity lower bound $C_{LB}$ therefore depends primarily on the Mean Square Error (MSE) of the channel estimator $\hat H^{(i)}$. Since $C_{LB}$ is a decreasing function of the MSE, the optimum estimator is the Minimum Mean Square Error (MMSE) estimator:
$$\hat H_{MMSE}^{(i)} = \hat H_{MMSE}^{(i)}(X_p, X_1^{i-1}, Y_{\bar\imath}) = E\big(H \,\big|\, X_p, X_1^{i-1}, Y_{\bar\imath}\big) \qquad (11)$$

which is an unbiased estimator of $H$. The performance of any unbiased estimator is bounded by the Cramér-Rao lower bound:
$$R_{\tilde h \tilde h}^{(i)} = E\, \tilde h^{(i)} \tilde h^{(i)T} \;\geq\; J^{-(i)} \qquad (12)$$
where $h = [\mathrm{Re}(\mathrm{vec}(H))^T\ \ \mathrm{Im}(\mathrm{vec}(H))^T]^T$ and $J^{(i)}$ is the Bayesian Fisher Information Matrix (FIM) for the a posteriori distribution of $H$, which is in this case:

$$\begin{aligned}
J^{(i)} &= -E\, \frac{\partial}{\partial h} \bigg(\frac{\partial \ln p(H \,|\, X_p, X_1^{i-1}, Y_{\bar\imath})}{\partial h}\bigg)^T\\
&= \underbrace{-E\, \frac{\partial}{\partial h} \bigg(\frac{\partial \ln p(X_p, Y_p, X_1^{i-1}, Y_1^{i-1} \,|\, H)}{\partial h}\bigg)^T}_{J^{(i)}_{DA,training}}
\ \underbrace{-\ E\, \frac{\partial}{\partial h} \bigg(\frac{\partial \ln p(Y_{i+1}^{T_d} \,|\, H)}{\partial h}\bigg)^T}_{J^{(i)}_{blind}}
\ \underbrace{-\ E\, \frac{\partial}{\partial h} \bigg(\frac{\partial \ln p(H)}{\partial h}\bigg)^T}_{J^{(i)}_{prior}}
\end{aligned}$$

We have $\sigma_{\tilde H}^2(i) \geq \frac{\mathrm{tr}\, J^{-(i)}}{N_t N_r}$. This is an absolute lower bound on the channel estimation MSE. The MMSE estimator achieves this bound asymptotically ($T_d \to \infty$). The above result is also valid for the case when the channel varies from block to block, since the channel is re-estimated for every block.
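To make (11)-(12) concrete, here is a hedged sketch for a training-only, real-valued linear Gaussian toy model (assumptions made purely for simplicity: the blind FIM term $J_{blind}$ is dropped, and real variables avoid complex-FIM bookkeeping; in this linear Gaussian special case the MMSE estimator attains the Bayesian CRB exactly, not just asymptotically):

```python
import numpy as np

rng = np.random.default_rng(3)
Nt, Nr, Tp = 2, 4, 8
sv2 = 0.1

# Real-valued toy training model y = A h + v, h ~ N(0, I), v ~ N(0, sv2*I).
# (The paper stacks Re/Im of vec(H); a real model keeps the FIM simple.)
Xp = rng.standard_normal((Nt, Tp))
A = np.kron(Xp.T, np.eye(Nr))          # vec(H @ Xp) = A @ vec(H)
n = A.shape[1]

# Bayesian FIM: training term + prior term (blind term omitted here).
J = A.T @ A / sv2 + np.eye(n)
crb = np.trace(np.linalg.inv(J)) / n   # Bayesian CRB on per-coefficient MSE

# Monte Carlo MSE of the MMSE estimate h_hat = J^{-1} A^T y / sv2:
mse, trials = 0.0, 2000
for _ in range(trials):
    h = rng.standard_normal(n)
    y = A @ h + np.sqrt(sv2) * rng.standard_normal(A.shape[0])
    h_hat = np.linalg.solve(J, A.T @ y / sv2)
    mse += np.mean((h_hat - h) ** 2)
print(f"empirical MSE {mse/trials:.5f} vs Bayesian CRB {crb:.5f}")
```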

3.2 Asymptotic Behavior

We focus here on the asymptotic behavior of the capacity loss for small channel estimation errors. We suppose that the channel is constant over every block; the mutual information for a particular block $(i)$ at the receiver, assuming channel estimation, is then $I(Y_i; X_i \,|\, \hat H^{(i)}) = E\big(I_{H^{(i)}}(Y_i; X_i \,|\, \hat H^{(i)})\big)$, where $I_{H^{(i)}}(Y_i; X_i \,|\, \hat H^{(i)})$ is the MI for a particular realization of the channel. It is assumed here that the channel $H^{(i)}$ may vary from block to block (while allowing an asymptotic regime). The MI assuming perfect channel knowledge and for a particular realization is $I_{H^{(i)}}(Y_i; X_i \,|\, H^{(i)})$. Let's temporarily drop the block index $(i)$.

In the following, we derive a weighting matrix that appears in the asymptotic MI decrease and in the optimal semiblind channel estimate. The first order derivative of $I_H(Y; X \,|\, \hat H)$ with respect to $\tilde H = H - \hat H$, evaluated at $\tilde H = 0$, is zero:
$$\begin{aligned}
\frac{\partial}{\partial \tilde h}\, I_H(Y; X \,|\, \hat H)\Big|_{\tilde H = 0}
&= \frac{\partial}{\partial \tilde h}\, \big[h(X) - h_H(X \,|\, Y, \hat H)\big]\Big|_{\tilde H = 0}\\
&= \int\!\!\int \frac{p_H(Y, X \,|\, H)}{p_H(X \,|\, Y, \hat H)}\, \frac{\partial}{\partial \tilde h}\, p_H(X \,|\, Y, \hat H)\Big|_{\tilde H = 0}\, dX\, dY\\
&= \int\!\!\int p_H(Y \,|\, H)\, \frac{\partial}{\partial \tilde h}\, p_H(X \,|\, Y, \hat H)\Big|_{\tilde H = 0}\, dX\, dY\\
&= \int p_H(Y \,|\, H)\, \frac{\partial}{\partial \tilde h} \Big(\underbrace{\int p_H(X \,|\, Y, \hat H)\, dX}_{=1}\Big)\Big|_{\tilde H = 0}\, dY = 0. \qquad (13)
\end{aligned}$$

The second order derivative w.r.t. $\tilde h$, evaluated at $\tilde H = 0$, is:
$$\frac{\partial}{\partial \tilde h}\, \frac{\partial I_H(Y; X \,|\, \hat H)}{\partial \tilde h^T}\bigg|_{\tilde H = 0}
= -\,E\, \frac{\partial}{\partial \tilde h} \frac{\partial \ln p_H(Y \,|\, \hat H)}{\partial \tilde h^T}\bigg|_{\tilde H = 0}
+\, E\, \frac{\partial}{\partial \tilde h} \frac{\partial \ln p_H(Y \,|\, X, \hat H)}{\partial \tilde h^T}\bigg|_{\tilde H = 0}
= J_Y(H) - J_{Y|X}(H). \qquad (14)$$
$J_Y$ and $J_{Y|X}$ are FIMs describing the MI decrease due to the channel estimation error, evaluated at $\tilde H = 0$. To compute $J_Y$ and $J_{Y|X}$, we consider a receiver point of view in which, given a certain realization $H$ and a certain $\hat H$, the error $\tilde H$ is an unknown constant. Then $p_H(Y \,|\, X, \hat H) = p\big(V + (\hat H + \tilde H) X \,\big|\, X, \hat H\big) = p(V + \tilde H X \,|\, X) = p_{\tilde H}(\tilde V \,|\, X)$, where $\tilde V = V + \tilde H X$ and the variable $\tilde V \,|\, X$ follows the same distribution as $V$ but with an offset (mean) $\tilde H X$ (the notation assumes $N_i = 1$). Now:
$$J_{Y|X}(H) = -\,E\, \frac{\partial}{\partial \tilde h} \frac{\partial \ln p_{\tilde H}(\tilde V \,|\, X)}{\partial \tilde h^T}\bigg|_{\tilde H = 0},
\qquad J_Y(H) = E\big(B B^T\big),$$
$$B = \frac{\partial \ln p_H(Y \,|\, \hat H)}{\partial \tilde h}\bigg|_{\tilde H = 0}
= \frac{1}{p_H(Y \,|\, \hat H)}\, \frac{\partial \int p_H(Y, X \,|\, \hat H)\, dX}{\partial \tilde h}\bigg|_{\tilde H = 0}
= \frac{1}{p_H(Y \,|\, \hat H)}\, \frac{\partial \int p_H(Y \,|\, X, \hat H)\, p(X)\, dX}{\partial \tilde h}\bigg|_{\tilde H = 0}
= \frac{1}{p_H(Y \,|\, \hat H)} \int \frac{\partial p_{\tilde H}(\tilde V \,|\, X)}{\partial \tilde h}\bigg|_{\tilde H = 0}\, p(X)\, dX.$$

For $\hat H$ in a small neighborhood of $H$, the MI decrease is approximated by a quadratic function:
$$I_H(Y; X \,|\, \hat H) - I_H(Y; X \,|\, H) = -\tilde h^T W(H)\, \tilde h = -(\hat h - h)^T \big(J_{Y|X}(H) - J_Y(H)\big)(\hat h - h). \qquad (15)$$
The weighting matrix $W(H) = J_{Y|X}(H) - J_Y(H)$ is non-negative definite since $I_H(Y; X \,|\, \hat H) \leq I_H(Y; X \,|\, H)$.

Example: For white Gaussian and decorrelated $V_i$ and $X_i$, $\tilde V_i \,|\, X_i$ is Gaussian with mean $\tilde H X_i$ and covariance $\sigma_v^2 I$:
$$p_{\tilde H}(\tilde V_i \,|\, X_i) = (\pi \sigma_v^2)^{-N_i N_r} \exp\Big(-\frac{\|\tilde V_i - (I_{N_i} \otimes \tilde H) X_i\|^2}{\sigma_v^2}\Big).$$
Then $J_{Y_i}(H) = 0$ and $J_{Y_i|X_i}(H) = N_i \frac{\sigma_x^2}{\sigma_v^2}\, I$. As a result, $W_i(H) = W_i = N_i \frac{\sigma_x^2}{\sigma_v^2}\, I$ is constant. More generally, for any Gaussian noise $V_i$ and zero mean input $X_i$: $J_{Y_i}(H) = 0$, $J_{Y_i|X_i}(H) = J_{Y_i|X_i}$ is constant, and $W_i(H) = W_i = J_{Y_i|X_i}$.
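A quick Monte Carlo check of the $\sigma_x^2/\sigma_v^2$ scaling of $J_{Y_i|X_i}$ in this example, using a real scalar Gaussian stand-in for the block model ($N_i = 1$; real variables are an assumption made only to keep the score computation elementary):

```python
import numpy as np

rng = np.random.default_rng(4)
sx2, sv2, trials = 1.0, 0.5, 200000

# Real scalar Gaussian model Y = (H + H~)X + V. The score of ln p(Y | X, h~)
# at h~ = 0 is s = V*X/sv2 (since Y - H*X = V there), so the FIM is
# J_{Y|X} = E[s^2] = sx2/sv2.
X = np.sqrt(sx2) * rng.standard_normal(trials)
V = np.sqrt(sv2) * rng.standard_normal(trials)
score = V * X / sv2
print("MC estimate of J_{Y|X}:", np.mean(score**2), " closed form:", sx2 / sv2)
```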

We now want to find the best channel estimator, i.e., the one that minimizes the MI decrease. We then need to minimize the cost function
$$I(Y; X \,|\, H) - I(Y; X \,|\, \hat H(S)) = E\big\{ I_H(Y; X \,|\, H) - I_H(Y; X \,|\, \hat H(S)) \big\}$$
where, for every block $i$, $\hat H$ is based on $S^{(i)} = (X_p, X_1^{i-1}, Y_{\bar\imath})$. Asymptotically,
$$\min_{\hat h(S)}\ \big(I(Y; X \,|\, H) - I(Y; X \,|\, \hat H(S))\big)
\ \Rightarrow\ \min_{\hat h(S)}\ E\Big[\, E\big\{ (\hat h(S) - h)^T\, W(H)\, (\hat h(S) - h) \,\big|\, S \big\} \Big]$$
$$\Rightarrow\ \hat h_{opt}(S) = \big(E\{W(H) \,|\, S\}\big)^{-1}\, E\{W(H)\, h \,|\, S\}. \qquad (16)$$
So $\hat h_{opt}$ is a weighted MMSE estimate. For centered input $X$ and Gaussian noise $V$, $W(H) = W$ and the optimal channel estimator is the MMSE estimator $\hat h_{opt}(S) = W^{-1} E\{W h \,|\, S\} = E\{h \,|\, S\}$. The minimum mean MI decrease w.r.t. the perfectly known channel case is:
$$I(Y; X \,|\, H) - I(Y; X \,|\, \hat H_{opt}(S)) = E\Big[\, h^T W(H)\, h - \big(E\{W(H) h \,|\, S\}\big)^T \big(E\{W(H) \,|\, S\}\big)^{-1} \big(E\{W(H) h \,|\, S\}\big) \Big],$$
all this so far for block $(i)$. The minimum mean MI decrease for the complete burst becomes, using the recursive MI decomposition of Section 2:

$$\begin{aligned}
&\sum_{i=1}^{Q} \Big[ I(Y_i; X_i \,|\, H^{(i)}) - I(Y_i; X_i \,|\, \hat H_{opt}^{(i)}(S^{(i)})) \Big]\\
&\quad= \sum_{i=1}^{Q} E\Big[\, h^T W(H^{(i)})\, h - \big(E\{W_i h \,|\, S^{(i)}\}\big)^T \big(E\{W_i \,|\, S^{(i)}\}\big)^{-1} \big(E\{W_i h \,|\, S^{(i)}\}\big) \Big]
\end{aligned}$$
where $W_i = W_i(H^{(i)}) = J_{Y_i|X_i}(H^{(i)}) - J_{Y_i}(H^{(i)})$.

In the case of a constant channel over the different blocks (i.e., $H^{(i)} = H$, $i = 1 \ldots Q$), $\sum_{i=1}^{Q} I(Y_i; X_i \,|\, H^{(i)})$ gets modified to $I(Y_d; X_d \,|\, H)$ and $W_i(H^{(i)})$ to $W_i(H)$.

3.3 Deterministic channel

In this case we have no prior information on the channel. The channel is constant during the burst with an unknown value $H$. The coherent capacity is then $I(Y_d; X_d \,|\, Y_p, X_p, H)$. For a given realization $Y^o$ of $Y$ and a (variable) channel $\hat H$, let's define
$$G(Y^o, X_p, \hat H) = h(X) - E\big({-\ln q(X \,|\, Y^o, \hat H)}\big) = h(X_d) - E\big({-\ln q(X \,|\, Y_p^o, Y_d^o, \hat H)}\big)$$
where the expectation here is with respect to $X_d$ ($X_p$ is known), and $q(X \,|\, Y_p^o, Y_d^o, \hat H) = p(X \,|\, Y_p^o, Y_d^o, H)\big|_{H = \hat H}$.

Let’s introduce a partition of

Y

d in which the different blocks have the same lengths

N

i

= N

1

i = 1:::Q

and

are i.i.d. Then in 1

T

G(Y

o

X

p

H) =

b T1

lnq(X

pj

Y

po

H) +

b

P

Q

i=1

(h(X

i

) +

E

lnq(X

ij

Y

io

H))]

b , the averaging over the blocks tends asymptotically to an expectation w.r.t.

Y

do. We

conclude that :

$$\lim_{T\to\infty} \tfrac{1}{T}\, G(Y^o, X_p, H) = \lim_{T\to\infty} \tfrac{1}{T}\Big\{ \ln p(X_p \,|\, Y_p^o, H) + \sum_{i=1}^{Q} \big( h(X_i) + E \ln p(X_i \,|\, Y_i^o, H) \big) \Big\} = I(y; x \,|\, H) \qquad (17)$$
so asymptotically $G(Y^o, X_p, H)$ approaches the mutual information. Let's define $G_\infty(\hat H) = \lim_{T\to\infty} \tfrac{1}{T}\, G(Y^o, X_p, \hat H)$. Then:

$$\begin{aligned}
G_\infty(H) - G_\infty(\hat H)
&= \tfrac{1}{N_1}\, E_{X_1, Y_1 | H} \big[\ln p(X_1 \,|\, Y_1, H) - \ln q(X_1 \,|\, Y_1, \hat H)\big]\\
&= \tfrac{1}{N_1} \int p(Y_1 \,|\, H) \Big( \int p(X_1 \,|\, Y_1, H)\, \ln \frac{p(X_1 \,|\, Y_1, H)}{q(X_1 \,|\, Y_1, \hat H)}\, dX_1 \Big)\, dY_1\\
&= \tfrac{1}{N_1}\, E_{Y_1 | H} \big[ D\big(p(X_1 \,|\, Y_1, H) \,\|\, q(X_1 \,|\, Y_1, \hat H)\big) \big] \;\geq\; 0
\end{aligned}$$

(Kullback-Leibler distance). This means that $G_\infty(\hat H)$ is maximized when $q(X_1 \,|\, Y_1, \hat H) = p(X_1 \,|\, Y_1, H)$. This fact, combined with the hypothesis that the training part is sufficiently informative to allow, together with the blind information, complete channel identifiability, ensures that asymptotically (for $T_d$ and $T_p$ big enough) $G(Y^o, X_p, \hat H)$ is maximized for $\hat H = H$. We can hence use the maximization of $G(Y^o, X_p, \hat H)$ as an optimization approach to find a consistent channel estimator.

This method is related to the first iteration of an iterative MAP/ML estimation approach for the input signal/channel, with EM applied to the ML estimation part for the channel. The maximum a posteriori (MAP) estimate of $X_d$ in the MAP/ML approach is:
$$X_d^{MAP} = \arg\max_{X_d}\ \max_{H'}\ \ln q(X \,|\, Y^o, H'). \qquad (18)$$

Various solution techniques exist for this type of problem. Consider an alternating maximization (between $X_d$ and $H$) approach in which an expectation over $X_d$ is introduced when maximizing over $H$ (an EM-like approach). The iterations comprise two steps. The first step of the first iteration gives:
$$\hat H = \arg\max_{H'}\ E\big(\ln q(X \,|\, Y^o, H')\big) = \arg\max_{H'}\ G(Y^o, X_p, H'). \qquad (19)$$
This is a semiblind cost function for the channel estimation. The second step consists of the MAP estimation of the input assuming the channel $H' = \hat H$.
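A hedged sketch of this EM-like alternating scheme for the Gaussian-input linear model (a minimal simulation: the pilot least-squares initialization, the circular Gaussian data prior, and the fixed iteration count are illustration choices, not prescriptions from the paper; the first pass of the loop corresponds to the maximization in (19) for Gaussian $q$):

```python
import numpy as np

rng = np.random.default_rng(5)
Nt, Nr, Tp, Td = 2, 4, 4, 200
sx2, sv2 = 1.0, 0.1

def cgauss(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = cgauss(Nr, Nt)
Xp, Xd = cgauss(Nt, Tp), np.sqrt(sx2) * cgauss(Nt, Td)
Y = H @ np.hstack([Xp, Xd]) + np.sqrt(sv2) * cgauss(Nr, Tp + Td)
Yp, Yd = Y[:, :Tp], Y[:, Tp:]

# Training-only initialization (least squares on the pilots):
Hhat = Yp @ np.linalg.pinv(Xp)

for _ in range(5):  # a few EM-like iterations
    # E-step: Gaussian posterior of each data symbol vector given Yd and Hhat
    R = sx2 * Hhat @ Hhat.conj().T + sv2 * np.eye(Nr)
    C = sx2 * np.eye(Nt) - sx2**2 * Hhat.conj().T @ np.linalg.solve(R, Hhat)
    M = sx2 * Hhat.conj().T @ np.linalg.solve(R, Yd)   # posterior means
    # M-step: maximize E ln p(Y | X, H) over H (pilots + data statistics)
    Rxx = Xp @ Xp.conj().T + M @ M.conj().T + Td * C
    Hhat = (Yp @ Xp.conj().T + Yd @ M.conj().T) @ np.linalg.inv(Rxx)

print("relative channel error:", np.linalg.norm(Hhat - H) / np.linalg.norm(H))
```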

Remark 1: The recursive decomposition of Section 2 remains valid in the deterministic channel case, and we can proceed by successive detection of the symbols (blocks). To keep the algorithmic complexity acceptable, one can choose from a variety of channel updating techniques. Along the lines of Section 2, this approach allows one to maximize the MI for large $T$.

Remark 2: Similarly to the Bayesian case, we can evaluate the asymptotic MI decrease as $\tilde h^T W(H)\, \tilde h$ with $W(H) = J_{Y|X}(H) - J_Y(H)$. In the deterministic case, however, the direct minimization of the MI decrease does not lead to a meaningful channel estimator.

4 Correlated MIMO channel model

In order to improve channel estimation and reduce capacity loss, it is advantageous to exploit correlations in the channel, if present. So consider the frequency-flat MIMO channel $H$ ($N_r \times N_t$), $h = \mathrm{vec}(H)$. The correlated channel model we suggest is:
$$h = S\, g \quad (+\ g_0 h_0 \text{ for a direct path},\ |g_0| = 1)$$
where the elements of $g$ are taken to be i.i.d. Gaussian for a stochastic model. The correlations are captured by $S$.

Special case 1: Bell-Labs/Saturn model:
$$H = R_r^{1/2}\, G\, R_t^{H/2} \;\Rightarrow\; S = R_t^{H/2} \otimes R_r^{1/2}, \quad g = \mathrm{vec}(G).$$

Special case 2: Cioffi-Raleigh model (multipath model):
$$H = \sum_i g_i\, a_i b_i^H \;\Rightarrow\; S = \big[\, b_1^* \otimes a_1 \;\; b_2^* \otimes a_2 \;\cdots \big].$$

The model $h = S g$ is straightforwardly extendible to the non-zero delay-spread case.
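A small numerical check of the $h = S g$ structure for the Kronecker (Bell-Labs/Saturn) special case; real-valued matrices are used as a simplifying assumption to sidestep the conjugation detail in the Kronecker factor (with the column-stacking vec convention, $\mathrm{vec}(A G B) = (B^T \otimes A)\,\mathrm{vec}(G)$):

```python
import numpy as np

rng = np.random.default_rng(6)
Nt, Nr = 3, 4

def sqrtm_psd(R):
    """Symmetric PSD matrix square root via eigendecomposition."""
    w, U = np.linalg.eigh(R)
    return U @ np.diag(np.sqrt(np.clip(w, 0, None))) @ U.T

A = rng.standard_normal((Nr, Nr)); Rr = A @ A.T / Nr   # receive correlation
B = rng.standard_normal((Nt, Nt)); Rt = B @ B.T / Nt   # transmit correlation
Rr_h, Rt_h = sqrtm_psd(Rr), sqrtm_psd(Rt)

G = rng.standard_normal((Nr, Nt))                      # i.i.d. entries
H = Rr_h @ G @ Rt_h                                    # Kronecker model

# vec(A G B) = (B^T kron A) vec(G); with symmetric real square roots this
# matches the S of special case 1.
S = np.kron(Rt_h.T, Rr_h)
g = G.flatten(order="F")       # column-stacking vec
h = H.flatten(order="F")
assert np.allclose(h, S @ g)
print("h = S g verified; dim(g) =", g.size)
```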

5 Observations

Capacity approaching channel estimation should exploit prior info + data/decision aided info + (Gaussian) blind info.

The symbol-wise decomposition of the block fading channel capacity involves, for each symbol position in a burst, a channel estimate that is based on: prior channel distribution info, training and detected inputs up to that symbol position, and (Gaussian) blind info in the remaining channel outputs. Hence, symbol-wise Gaussian semiblind (Bayesian) channel estimation is required (blind parts: detected data + Gaussian undetected data). To have asymptotically (in SNR and burst length) negligible capacity loss, enough training symbols are required to have (deterministic) identifiability of the parameters that cannot be identified blindly (from the Gaussian undetected symbols); hence blind (Gaussian) info reduces the training data requirements. By exploiting channel correlations, the (effective) number of degrees of freedom in the channel, and hence the training requirements, get reduced. The channel with i.i.d. entries, while optimal from a capacity point of view, is the worst case from the channel estimation point of view.

The recursive mutual info decomposition may suggest a practical approach for channel estimation. However, simpler practical approaches would pass through the bursts iteratively, with semiblind (blind info = Gaussian undetected symbols) channel estimation in the first pass, and semiblind (blind info = detected data) channel estimation in the next iterations. Prior channel info (and $S$ in the channel model) gets estimated (sufficiently well) by considering the data in multiple bursts jointly (assuming these parameters are invariant across a (large) set of bursts).

Whereas we have considered block fading so far in this paper, we conjecture that these results extend to the continuous transmission (CT) case: in steady-state, channel estimation should be based on the semi-infinite detected past symbols and the semi-infinite future blind channel information. A Gauss-Markov model for the channel variations with a certain Doppler bandwidth will prevent perfect channel extraction from this infinite data though. Finally, the proposed channel model is useful for the introduction of partial channel knowledge at the transmitter. Indeed, if the transmitter knows the channel correlations summarized in $S$ in $h = S g$ and only lacks knowledge of the fast fading parameters $g$, the channel capacity may be close to that of the known channel case.

References

[1] E. Biglieri, J. Proakis and S. Shamai, "Fading channels: information-theoretic and communications aspects". IEEE Trans. Info. Theory, vol. 44, no. 6, pp. 2619-2692, Oct. 1998.

[2] T.L. Marzetta and B.M. Hochwald, "Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading". IEEE Trans. Info. Theory, vol. 45, no. 1, pp. 139-157, Jan. 1999.

[3] L. Zheng and D. Tse, "Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple Antenna Channel". To appear in IEEE Trans. Info. Theory.

[4] M. Medard, "The effect upon channel capacity in wireless communications of perfect and imperfect knowledge of the channel". IEEE Trans. Info. Theory, vol. 46, no. 3, pp. 933-946, May 2000.

[5] I.E. Telatar, "Capacity of Multi-antenna Gaussian Channels". Eur. Trans. Telecom., vol. 10, no. 6, pp. 585-595, Nov/Dec 1999.

[6] B. Hassibi and B.M. Hochwald, "How Much Training is Needed in Multiple-Antenna Wireless Links?". Submitted to IEEE Trans. Info. Theory.

[7] A. Medles, D.T.M. Slock and E. De Carvalho, "Linear prediction based semi-blind estimation of MIMO FIR channels". In Proc. Third Workshop on Signal Processing Advances in Wireless Communications (SPAWC'01), pp. 58-61, Taipei, Taiwan, March 2001.
