Block-Synchronous DS/CDMA
Giuseppe Caire
Mobile Communications Group
Institut EURECOM
2229 Route des Cretes
06560 Valbonne, France
Tel: +33 (0)4 93002904, Fax: +33 (0)4 93002627
E-mail: [email protected]
Urbashi Mitra
Department of Electrical Engineering
The Ohio State University
756 Dreese Labs, 2015 Neil Avenue, Columbus, OH 43210
Tel: (614) 292-3902, Fax: (614) 292-7596
Email: [email protected]
August 7, 2000
Submitted to IEEE Trans. on Commun.: July 1999.
Revised: August 2000.
Abstract
Uplinkchannelestimationforablock-synchronouschip-asynchronousDS/CDMA
systemasproposedforthetime-divisionduplexoptionof3rdgenerationcellularsys-
temsisconsidered. Trainingmidamblesareemployedforjointchannelestimationof
all users. Thestandardunstructured approachbasedon modelingtheeectiveuser
channelsasunknownFIRltersis comparedwithtwo structured methodsthatex-
ploitaprioriknowledgeabouttheuserchannelssuchasthemaximumdelay-spread,
the transmit chip-shaping pulse and the path delays. Since these are usually un-
known,a low-complexity estimator for thepath delays of all users is derived from
a maximum-likelihood approach. For all channel estimators, optimal training se-
quences sets basedon perfect root-of-unity sequencesarefound. Forthese optimal
sets,itisshownthatthereductioninchannelestimationmean-squarederrorofthe
structured estimator versus the unstructured estimator is exactly the ratio of the
number of structured parameters to unstructured parameters. Simulation results
showthatstructuredchannelestimationprovideadvantages upto 4dBintermsof
output signal-to-interference plusnoise ratio with respect to unstructured estima-
tion, forlinearMMSE detection. In contrast, forconventional single-usermatched
ltering, unstructuredestimationprovesto besuÆcientlygood.
Keywords: DS/CDMA, channel estimation, multiuser detection.
1 Introduction
One of the proposals for UMTS, the European 3rd generation mobile communication
system, is based on hybrid TDMA/CDMA (briey denoted as T-CDMA) [1, 2, 3]. In
this scheme, multiple access is regulated by a TDMA slot structure similar to GSM [4],
and time-divisionduplexing (TDD) is adopted. User signals are direct-sequence spread-
spectrum, with a limited processing gain L (typically L = 16). In contrast to standard
TDMA, in T-CDMA multiple users per cell are allowed to transmit over the same time
slot.
In the uplink (mobile-to-base), users have a coarse commontiming reference and are
abletoaligntheirsignal\blocks"(or\bursts") withtheslotreferenceofthe basestation.
Due to imperfect synchronization and to the dierent multipath channels for each user,
the signal blocks experience some misalignment. This misalignment is compensated for
by inserting guard intervals of appropriate duration. Therefore, the system is block-
synchronous, but cannot beconsidered chip-synchronous.
Provided that the blocks are suÆciently short with respect to the channel coherence
time [5], the user channelscan be considered time-invariantover the block duration. On
theotherhand,duetotheTDMAdynamicuserallocationovertheslotsandtothebursty
nature of transmission, tracking the channels from block to block might be infeasible.
Hence, we shall consider blockwise channel estimation, where the receiver estimates the
user channelsblock-by-block without tracking across dierent blocks.
Weconsider training-basedjointchannel estimationof alluplink usersas in[3]. Each
signal block contains a training sequence of known chips, in a xed nominal position
(typically,inthe middleof the block [1,4]). The base-stationreceivermakesuse ofthese
trainingsequences inorder toestimatethe channel impulseresponses ofall users. Least-
Squares (LS) training-basedchannel estimation iswell-known inthe single-user case (see
[6, 7] and references therein) and has been extended to the block-synchronous multiuser
case in [3]. Most literature considers a chip-matched lter front-end, and chip-ratesam-
pling without an explicit timing reference. Under the assumption that the chip-shaping
transmit pulse is bandlimited with non-zero excess bandwidth [5], chip-rate sampling
does not providesuÆcient statisticsfor data detection unless the correcttiming epoch is
chosen. Noticethat,duetothefrequency-selectivechannelsandchip-asynchronous trans-
mission,chip-ratesamplingisalwayssuboptimal. SuÆcientstatisticsfordetectioncan be
obtained by low-pass ltering and samplingat a frequency larger than the chip-rate. In
this case, the chip-matched lter front-end is to be avoided, since in general it produces
a colored discrete-time noise sequence. We shall consider an ideal low-pass lter front-
end with bandwidth equal to half of the receiver sampling rate. In this way, the noise
aftersamplingiswhite,andLSisequivalenttomaximum-likelihood(ML)estimation(for
Gaussiannoise)withoutrequiringanynoisewhitening. Moreover,the discrete-timechan-
nel responses are naturally represented as polyphase lters and each \sampling phase"
can beestimated independently of the others withoutloss of optimality.
In standard LS (or ML) training-based channel estimation, the user channels are
modeled as FIR lters with unknown coeÆcients [3], and no a prioriinformation about
the channels is exploited. We refer to this approach as \unstructured" channel estima-
tion. More recently, a priori knowledge of the structure of the users channel impulse
responses has been exploited in order to improve estimation. In some works (see for ex-
ample [8, 9, 10, 11]), user channels are modeled as discrete multipath characterized by
a set of path delays and gains and rectangular chip-shaping pulses are assumed. After
chip-matched ltering, the overall impulse responses are linear combinations of delayed
triangular pulses. Since triangular pulses are piecewise linear, the discrete-time overall
channelsare linearfunctionsbothofthe multipathgainsand ofthedelays,andthis isex-
ploitedfor estimation. Withthisapproach the channelsare represented directly interms
of their \physical" parameters (multipathgains anddelays). Then,a minimalnumberof
unknowns needs to be estimated. On the other hand, these methods do not generalize
easily toarbitrary chip-shaping pulses (typically, root-raised-cosine (RRC) pulses with a
givenroll-ofactor[5]). Anothermethod(see[12,13]and referencestherein) exploitsthe
factthatthe discrete-timeoverall userchannelsarefound tolieinthecolumn-spaceofan
a prioriknown matrixdetermined bythe chip-shaping pulseand by themaximum delay-
spread. This has the advantage of being applicable to general bandlimited chip-shaping
pulses but yields generally more unknowns than the methods based on physical channel
parameterization.
In this paper, we propose two types of \structured" channel estimators which can
be applied to any arbitrary (approximately bandlimited) chip-shaping pulse. Our rst
methodisessentiallythemultiuserversionof[12],andexploitsonlythecoarseinformation
represented by the maximum delay-spread and by the chip-shaping pulse. Our second
methodis based onthe physical channel parameterizationand exploits the knowledgeof
thepathdelaysofallusers. Inpractice, thisinformationisnotusuallyavailable. Thus, we
proposeatwo-stepapproachwhererstthepathdelaysareexplicitlyestimated,andthen
used inthe structuredchannel estimator. StartingfromaML approach,wederive alow-
complexitypathdelayestimatorsuitedforthe rststepofstructured channelestimation.
For both structured and unstructured channel estimation, we discuss the problemof
training sequence optimizationand we show that the rich family of perfect root-of-unity
sequences (PRUS) [14] provides optimal training sequence sets for all desired lengths.
Remarkably, the same set of sequences is optimal for both structured and unstructured
estimation. Ifoptimalsequences are used,weareable toshowthat theestimation mean-
squared error (MSE) reduction provided by structured over unstructured estimation is
given by the ratio between the number of structured and unstructured parameters. This
result is not generally true for anarbitrary choice of the trainingsequences.
At the base-station, the channel estimates are used to compute the coeÆcients of a
bank of receiving lters, whoseoutput is sampledat the symbolrate and sent toa bank
of single-user decoders [15, 16]. In this case, as discussed extensively in [15, 16, 17, 18],
assuming single-user capacity-achieving Gaussian codes 1
with interleaving over a large
numberofslots,thesystemspectraleÆciencyisgivenby=E[log
2
(1 + SINR)]bit/s/Hz,
where is a proportionality constant that depends on the chip-shaping pulse excess
bandwidth[5]andonthe\channelload"(i.e.,thenumberofusersdividedbythespreading
gain). Ontheotherhand,ifeachslotisindependentlyencodedanddecoded,thecodeword
error rateiscloselyrelatedtothe informationoutageprobability[19] and,inthelimitfor
a large number of symbols per slot, the system spectral eÆciency subject to a required
outage probability is given by = log
2
(1+) where is the largest value for which
Pr(SINR ) . In all cases, the system spectral eÆciency is determined by the
cumulative distribution function (cdf) of the SINR at the output of the linear receivers.
Classicalchoicesforthelinearltersareeitherthesingle-usermatchedlter(SUMF)and
the linear minimum MSE (LMMSE) lter [22]. The SUMF is normally implemented in
current DS/CDMA systems as a rake receiver [5], and the LMMSE one of the solutions
proposed for T-CDMA [2].
Motivatedbytheabovearguments,wecompareourestimatorsbyevaluatingtheSINR
cdf atthe outputof the SUMF and LMMSEreceivers when the lter coeÆcientsare cal-
1
With modern concatenatedcodinganditerativedecodingtechniques[20,21]theGaussian capacity
can becloselyapproachedbypracticalnite-complexitycodes.
culatedfromthechannelestimates. Simulationsshowthatstructuredchannel estimation
provide advantages up to 4 dB in terms of output SINR with respect to unstructured
estimation, for LMMSE detection. In contrast, for conventional SUMF detection, un-
structured estimation provesto be suÆciently good.
Thispaperisorganizedasfollows. InSection2,thesystemmodelisdened. Section3
presents the maximum-likelihoodunstructured channel estimator and the corresponding
optimal training sequences. In Section 4, we derive the structured channel estimators.
In addition,it is shown that the trainingsequences previously found are optimalalso in
this case. Furthermore, a multipath delay estimator is also derived. Section 5 presents
simulation results and inSection 6we providesome concludingremarks.
Notation. i) v N
C
(v;R) means that the random vector v is complex Gaussian
with circularsymmetry, with mean E[v]=v and covariance E[vv H
] vv H
=R. ii) I
n
denotes the nn identity matrix. When the dimension is clear from the context, the
subscript n may be omitted. iii) Æ
i;j
denotes the Kronecker delta symbol, equal to 1 if
i =j and 0 otherwise. iv) / means \proportionalto". v) Superscripts T
and H
indicate
transpose and Hermitiantranspose, respectively. vi) sinc(x)
=sin(x)=(x). vii) and
? denote Kronecker product [23] and convolution, respectively.
2 Signal model
Fig. 1 shows the block diagram of the complex baseband equivalent system at hand,
with K users. Since blockwise channelestimation is considered, we focus onaparticular
\reference" time slot. The k-th user multipath channel impulse response is assumed to
betime-invariantoneach slotand is expressed by
c
k (t)=
P 1
X
p=0 c
k;p
Æ(t
k;p
) (1)
We donot make any explicit distinction between the delay introducedby non-idealuser
synchronization and the delay introduced by multipath propagation. These eects are
incorporated in the channel impulse response so that the delays
k;p
account also for
synchronization errors. Without loss of generality, we assume that all users have the
same number P of paths: if c
k
(t) has less than P paths, the gains corresponding to the
missing paths are zero. Also, we assume that all users have the same transmit power:
dierentreceived signal-to-noiseratios(SNR) are taken into accountby multiplyingeach
user channelcoeÆcients by the corresponding amplitude factor.
We assume that the system isdesigned tocope with a given maximum overall delay-
spread (a design parameter known a priori [1]). An initial synchronization procedure
(not taken intoaccount here) allows user k to transmitonly if max
p f
k;p
g , i.e., if it
issynchronized tothebase-stationslottimereferencewith overalldelay-spreadnot larger
than [24]. In general, is much shorter than the slot duration but longer than the
chip interval. Then, the system is block-synchronous but not chip-synchronous.
The DS/CDMA signal transmittedby user k is given by
u
k (t)=
X
m b
k [m]
"
L 1
X
`=0 a
k
[mL+`] (t (mL+`)T
c )
#
(2)
where b
k
[m] is the m-th data symbol, belonging to some unit-energy complex signal al-
phabet (e.g., BPSK, QPSK, 16QAM, etc ...), T
c
is the chip interval, a
k
[n] is the n-th
(complex)chip, Listhe spreadinggain and (t)isthe chip-shapingpulse, commontoall
users. The pulse (t) is obtained by truncatinga bandlimited waveform over aninterval
ofdurationT
c
,wherethe integerissuÆcientlylargeinordertomakethebandwidthof
(t)approximatelyequalto(1 + )=(2T
c
)(with0< 1)andsuchthat R
() (+ t)
d
satises approximately the Nyquist criterion [5]. For example, in the UMTS T-CDMA
system (t) is obtained by truncating aRRCpulse [5]with roll-ofactor =0:22 [1].
Each user transmitsa trainingsequence of known chips inaxed positioninthe slot.
During training no data is transmitted (i.e., the data symbols are all equal to 1) and
the number of training chips is not necessarily equalto a multiple of L. Without lossof
generality, wex the time axis originso that the trainingsequence istransmitted onthe
interval [ (Q 1)T
c
;(M 1)T
c
], forintegersQ and M, where
Q=d=T
c
e+ (3)
is chosen to be not smaller than the duration, expressed in chip intervals, of the overall
impulse responses
g
k
(t)= (t)?c
k
(t) (4)
for allusers.
Thereceived signal,given by the superposition of allusers plus Gaussianbackground
noise (accounting also forouter-cell interference), can be writtenas
r(t)= K
X
k=1 u
k (t)?c
k
(t)+(t) (5)
where(t)isacomplexcircularly-symmetricGaussianprocesswithautocorrelationfunc-
tion E[(t)(t )
]=N
0 Æ().
The baseband receiver front-end is an ideal low-pass lter with bandwidth W=2 and
amplitude 1=
p
W, whose output is sampled at rate W = N
c
=T
c (N
c
2 is an integer),
withoutanyexplicittimingreference. ThechannelestimatorcollectsN
c
M samplesofthe
received signalover the interval [0;MT
c
]. This are given by
r[nN
c
+`]= K
X
k=1 Q 1
X
m=0 g
k;`
[m]a
k
[n m]+[nN
c
+`] (6)
for n = 0;:::;M 1 and ` = 0;:::;N
c
1, where [j] is the discrete-time low-pass
lterednoisesequence withi.i.d. samplesN
C (0;N
0
),andwherewedenethepolyphase
representation of the discrete-time low-pass lteredoverall channelimpulse response
g
k;`
[m]
= 1
p
W g
k ((mN
c
+`)=W) m=0;:::;Q 1 (7)
for ` =0;:::;N
c
1. Forlater use, we collectthe samples of the `-th samplingphase of
the k-thuser channel inthe vector g
k;`
=(g
k;`
[0];:::;g
k;`
[Q 1]) T
. When Q satises(3),
for all realizations of the users synchronization errors and channel multipath delays the
samples (6)contains only the contributionof known training chips (see Fig. 2).
3 Unstructured ML channel estimation
InthissectionwereviewstandardMLchannelestimationasproposedin[3]andwederive
optimal sets of trainingsequences.
The channel estimatorformsthe vectors r
`
=(r[`];r[N
c
+`];:::;r[(M 1)N
c +`])
T
,
for ` =0;:::;N
c
1. After some standard algebra [3, 6, 7], these can be written in the
compact matrix form
r
`
=Ag
` +
`
(8)
where g
`
= (g T
1;`
;:::;g T
K ;`
) T
is the total user channel vector corresponding to the `-th
sampling phase, = ([`];[N +`];:::;[(M 1)N +`]) T
is a vector of i.i.d. noise
samples,andA=[A
1
;:::;A
K
]isaMKQblockmatrixcontainingonlytrainingchips,
whose k-thM Q block is given by the convolution(Toeplitz)matrix
A
k
= 2
6
6
6
6
6
4 a
k
[0] a
k
[ 1] a
k
[ Q+1]
a
k
[1] a
k
[0] a
k
[ Q+2]
.
.
.
.
.
.
.
.
.
a
k
[M 1] a
k
[M 2] a
k
[M Q]
3
7
7
7
7
7
5
(9)
Since the noise in (6) is Gaussian and white, the ML estimation of fg
0
;:::;g
N
c 1
g from
the observationfr
0
;:::;r
N
c 1
gsplitsintotheindividualMLestimationsofg
`
withobser-
vationr
`
, for `=0;:::;N
c
1. Moreover, since
`
has i.i.d. components,ML estimation
is equivalentto the simple LSestimation [25]:
b g
`
=arg min
g jr
`
Agj 2
(10)
whose solution isreadily given by
b g
`
=(A H
A) 1
A H
r
`
(11)
where we assumethat A has full column-rank, i.e., rank(A)=KQ.
Theabovemethodisreferredtoas\unstructured" sinceittreatsg
`
asavectorofKQ
unknowns, withouttaking intoaccountthatthe elementsofg
`
depend onthesame setof
\physical" channel parameters (the delays and channel gains of the channel model(1)).
3.1 Optimal training sequences
The estimation error vector e
`
= b
g
` g
`
= (A H
A) 1
A H
`
is N
C
(0;), where
=
E[e
` e
H
`
] = N
0 (A
H
A) 1
. The resulting normalized estimation MSE of the unstructured
estimatoris given by
2
unstr
= 1
KQN
c Nc 1
X
`=0 E[je
` j
2
]= N
0
KQ
Tr (A H
A) 1
(12)
It is well-known [3, 6, 7], that optimal training sequences minimizing 2
unstr
must satisfy
A H
A / I. This fact has been shown either by using matrix inequalities [6] or by using
a matched lter argument [7]. Here, we provide an alternative simple proof by solving a
TheestimationMSEisinverselyproportionaltothechip-energy. Therefore,weimpose
thetraceconstraintTr(A H
A)=E,whereEcanbeinterpretedasthetotalenergydevoted
to training. Let f 2
i
: i = 1;:::;KQg denote the eigenvalues of A H
A (all positive, by
construction). Optimalsequence sets are solutions of
(
minimize Tr (A H
A) 1
= P
i 1=
2
i
subject to Tr (A H
A)= P
i
2
i
=E
(13)
ByapplyingLagrangemultipliersweobtainthattheminimumisachievedbytheconstant
eigenvalues 2
i
=E=(KQ). Since A H
A is positive denite, there exists a unitary matrix
Q such that A H
A=E=(KQ)QIQ H
, which yieldsA H
A=E=(KQ)I. Hence, in order to
minimizethe MSE, A H
A must be proportionaltothe identity matrix.
The construction of optimal training sequences satisfying A H
A / I has been inves-
tigated in several papers (see [3, 6, 7, 26, 14, 27] and references therein). Most related
research considers sequences constructed from simple symbol alphabets, like BPSK and
QPSK. Unfortunately, this requirement is too restrictive and optimal sequences cannot
befound for most traininglengths M. In some works (see [3,26]), exhaustive search for
BPSK sequences minimizingthe o-diagonalelementsof A H
A ismade forsmallM,and
heuristicconstruction of goodsequences with smallo-diagonalelements of A H
A iscar-
riedout forlargeM. Otherworksproposethe useofnon-constantenvelopesequences [7].
We observe that requiring simple symbol alphabets like BPSK or QPSK does not
bring a real complexity advantage, especially if modern DSP architectures are used for
implementation. Rather,aconstantenvelopeallowsmoreeÆcientutilizationofthetrans-
mitter power amplier[4], and has a major impact onthe battery life of the mobile ter-
minals. In this paper, we require the chips to belong to a N-th root-of-unity alphabet
A
N
= fe j2i=N
: i = 0;:::;N 1g, for some integer N. This choice is enough to obtain
optimal training sequences for any desired training length M, while preserving the con-
stant envelope.
2
In particular, an optimal set of training sequences can be derived from
a single PRUS [14]:
Denition: PerfectRoot-of-UnitySequences. Thesequencex=(x
0
;:::;x
M 1 )2
2
Forthesakeofprecision,wehavetosaythatevenifthechipshaveconstantmagnitude,thetransmit
signal after the chip-shaping lter is not constant-envelope any longer. Then, for a given non-linear
amplier,theremightbesequencesthatperformbetterthanothers(e.g.,avoidingphasetransitionsof
betweenconsecutivesymbols). Thistypeof\ne-tuning"optimizationisbeyondthescopeofthispaper,
C M
is a PRUS if x
m 2 A
N
, for all m = 0;:::;M 1 and some integer N, and if its
periodic autocorrelation satises
x (n)=
M 1
X
m=0 x
m x
[m n mod M]
=MÆ
n;0
Explicit and simple constructions of PRUSs of any length are provided in [14, 27].
Let x be a PRUS of length M KQ. Then, the k-th user training sequence (a
k
[ Q+
1];:::;a
k
[M 1])is obtained from xas
a
k [m]=
p
E
c x
[m (k 1)Q mod M]
(14)
forallk =1;:::;K andm= Q+1;:::;M 1,whereE
c
isthetransmitchip-energy. By
using (14) into (9), itis easy tocheck that the columns of A are distinct cyclic shifts of
the same PRUS x. Byconstruction, we obtain A H
A =ME
c
I, as desired. The resulting
minimum estimation error isgiven by
2
unstr
= L
M
E
s
N
0
1
(15)
where E
s
=LE
c
is the nominalaverage transmit energy persymbol.
From the implementation point of view, the proposed channel estimation scheme is
extremely simple. Since the columns of A are cyclic shifts of x, the vector A H
r
` can
be seen as the result of the cyclic convolution of x with r
`
. Then, it can be computed
eÆciently in the frequency domain, by using FFT. In particular, the discrete Fourier
transform of x can be precomputed and stored in the receiver. Moreover, no matrix
storage or matrix-vector product are needed since (A H
A) 1
/ I. Finally, the gb
` 's for
` =0;:::;N
c
1can becomputed in parallel,by N
c
identical processors.
4 Structured channel estimation
In this section we propose estimation methods that exploit some a priori information
of the \structure" of the channel. Our rst method (referred to as Type I structured
estimation) exploits coarse a priori knowledge about the channel responses represented
by themaximum delay-spreadandby thechip-shapingpulse (t). Our secondmethod
(referred to asType II structured estimation)exploits ner a prioriknowledge about the
channel responses, represented by the sets of delays f
k;p
: p = 0;:::;P 1g for all
4.1 Channel structure
By using the factthat (t) isbandlimited 3
and by using (1)and (4)in (7)we can write
explicitly
g
k;`
[m]= Nc 1
X
i=0
(i=W)ec
k [mN
c
+` i] (16)
where we dene the low-pass lteredand sampled channel response
e c
k [j]
= P 1
X
p=0 c
k;p
p
W
sinc (j
k;p
W) (17)
Let e
c
k
=(ec
k
[0];:::;ec
k
[D 1]) T
bethevectorofsignicantchannelresponsesamples,where
D is a suitableinteger known a priori(it depends on), let c
k
=(c
k;0
;:::;c
k;P 1 )
T
and
dene the QD convolution matrix
`
with (i;j)-th element
[
` ]
i;j
=
iN
c
+` j
W
(18)
for i= 0;:::;Q 1 and j = 0;:::;D 1, and the DP interpolation matrix
k with
(j;p)-th element
[
k ]
j;p
=sinc(j
k;p W)=
p
W (19)
Then, (16) and (17) can bewritten inmatrix form as
g
k;`
=
` ec
k
e
c
k
=
k c
k
(20)
respectively. We dene ec
= (ec T
1
;:::;ec T
K )
T
, c
=(c T
1
;:::;c T
K )
T
, the block-diagonalmatrix
=diag(
1
;:::;
K
) of dimension KDKP and the Kroneckerproduct matrix
`
=
I
K
`
of dimension KQKD. We can writeec=c and, by recalling the denition
of g
`
, we get g
`
=
`
ec. Finally, we stack the dierent sampling phases g
`
into a single
vector g
=(g T
0
;:::;g T
Nc 1 )
T
toobtain
g= ec=c (21)
where we dene the KQN
c
KD block matrix
= 2
6
6
4 0
.
.
.
Nc 1 3
7
7
5
3
Weneglectthedistortionduetotruncation,sincepracticalsystemsaredesignedtokeepthisdistortion
and where we dene the KQN
c
KP matrix
= . Equations (21) dene the a
priori structure of the channel impulse responses. The matrix is determined by the
chip-shaping pulse (t) and by the maximum delay-spread , therefore it is known by
the receiver. Thematrixisdeterminedalsobythe pathdelays f
k;p
:p=0;:::;P 1g
for all k = 1;:::;K, and by the number of paths P, which are generally not known a
priori.
4.2 Type I structured estimation
Let r
=(r T
0
;:::;r T
N
c 1
) T
and
=( T
0
;:::; T
N
c 1
) T
. Then,
r=[I
Nc
A]g+ (22)
whereAisdened in(9). From therstequalityof(21) wehavethatthe desiredchannel
vector g lies inthe column-spaceof the a prioriknown matrix . If the null-space of
is non-trivial,this informationcan be exploitedto improvechannelestimation.
We observe that might be ill-conditioned for large D and/or large N
c
, as it is
obtained by oversampling the bandlimited waveform (t). In order to cope with this
possibility,we use singular-valuedecomposition (SVD) [28] and re-parameterizeour esti-
mation problem. We can write
=U 0
S 0
(V 0
) H
(23)
where S 0
is 0
0
diagonal with positivedecreasing diagonalelements (singularvalues),
0
is the rank of , and U 0
, V 0
are rectangular matrices with orthonormal columns
and dimension KQN
c
0
and KD 0
, respectively. Then, g = ec = U 0
d 0
, where
d 0
= S 0
(V 0
) H
e
c. Notice that any component of e
c in the null-space of has no impact
on the observed channel response g. Thus, we can rst project ec onto the orthogonal
complement of the null-space of and then estimate its projection d 0
. The proposed
channel estimation algorithmis asfollows:
1. Obtain the ML estimateof d 0
from the observation r as
b
d = arg min
d
jr [I
N
c
A]U 0
dj 2
= (U
0
) H
[I
Nc (A
H
A)]U 0
1
(U 0
) H
[I
Nc A
H
]r (24)
2. Obtain the Type I structured estimate of g as b
g=U 0
b
d.
4.3 Type II structured estimation
If thedelays
k;p
and the numberof paths P are known, the receivercan compute and
impose that the channel vector g must lieinitscolumn space(see the second equalityof
(21)). By substitutingg =c into(22) weget the linear model
r=[I
N
c
A]c+ (25)
Also in this case it is convenient to re-parameterize the problem by using the SVD
= U
00
S 00
(V 00
) H
, where S 00
is 00
00
diagonal with positive decreasing diagonal el-
ements (singular values), 00
is the rank of , and U 00
, V 00
are rectangular matrices
with orthonormal columnsand dimension KQN
c
00
and KP 00
, respectively. Then,
g =c=U 00
d 00
,where d 00
=S 00
V 00H
c. Since ingeneralneither the
k;p
's nor the number
ofpathsP are aprioriknown,theseparametersmust alsobeestimatedfromthereceived
signal. We propose a two-step approach where rst an estimate b
k;p
of delays
k;p is ob-
tainedand thentheaboveTypeIIstructuredestimatoriscomputedassuming
k;p
=b
k;p .
More precisely, we have:
1. Obtain an estimate the maximum number of paths per user b
P and of the delays
fb
k;p
:p=0;:::; b
P 1g, for allk =1;:::;K.
2. Based on the estimated delays, compute an estimate b
of and the KQN
c b
00
factor b
U 00
in itsSVD.
3. UndertheassumptionU 00
= b
U 00
,obtaintheMLestimateofd 00
fromtheobservation
r as
b
d = arg min
d
r [I
Nc A]
b
U 00
d
2
=
( b
U 00
) H
[I
Nc (A
H
A)]
b
U 00
1
( b
U 00
) H
[I
Nc A
H
]r (26)
4. Obtain the Type II structured estimate of g asgb= b
U 00
b
d.
Since b
is not known a priori, a real-time SVD is needed every time a new estimateof
the users path delays is available.
In general, since U 00
6=
b
U 00
the Type II estimator is biased. We expect that the
performanceoftheproposedestimatordependscriticallyonthequalityofthepreliminary
delayestimation. Ontheotherhand,forperfectknowledgeofthe 's,TypeIIandType
I estimators have the same form. Therefore, for perfect delay information an optimal
training sequence set for the Type I is also optimalfor the Type II. In the next section,
we nd optimal training sequence sets for Type I and Type II structured estimators
(assuming perfect knowledge of the delays for the latter) and we postpone the delay
estimation problem toSection4.5.
Remark: near-far resistance of channel estimators. In multiuser detection, a
receiver is said to be near-far resistant for user k if its error probability goes to zero as
N
0
! 0 for any choice of the other users received powers [22]. Analogously, we say that
a channel estimatorfor user k is near-far resistant if its estimation MSE goesto zero as
N
0
! 0 for any choice of the other users received powers. In particular, if a multiuser
channelestimatorcanbeputintheformbg=g+e,wheretheerrorvectorehasmeanzero
and covariance matrix such that Tr() ! 0 as N
0
! 0, then it is near-far resistant.
According to the above denition, the unstructured and structured Type I estimators
are near-far resistant while the structured Type II estimator is near-far resistant if it is
unbiased, i.e., if the delays are perfectlyknown.
4.4 Optimal training sequences.
We let U = U 0
and d = d 0
(resp., U = b
U 00
and d = d 00
) for Type I (resp., Type II)
estimation, and for Type II we assume perfect knowledge of the delays, i.e., b
U 00
= U 00
.
The error vector inthe estimation of d isgiven by e= b
d d, N
C
(0;), where
=E[ee H
]=N
0 U
H
I
N
c (A
H
A)
U
1
The error vector in the estimation of g is given by Ue, N
C
(0;UU H
). The resulting
normalized estimation MSE isgiven by
2
struc
= 1
KQN
c
Tr UU
H
= 1
KQN
c
Tr () (27)
where we used the fact that U H
U = I. Then, the optimal choice of training sequences
should minimize the trace of subject to a constraint on the total training energy. We
already know that this is obtained by / I. Fortunately, since U has orthonormal
columns, we have immediately that if A H
A / I then also / I. We conclude that
optimal sequences for unstructured estimation are also optimal for Type I and Type II
construction of trainingsequences from a single PRUS of length M KQ, as indicated
in (14), can be successfully appliedhere.
The resultingminimum estimationerror is given by
2
struc
= L
M
E
s
N
0
1
KQN
c
(28)
where = 0
(resp., = 00
) is the rank of (resp., ) for Type I (resp., Type II)
estimation.
Fromanimplementationpointofview,withoptimalsequences,thestructuredchannel
estimator is obtained by b
g / UU H
I
N
c A
H
r, where the proportionality constant is
known. This can be interpreted as the linear post-processing dened by the squared
matrixUU H
oftheunstructured channelestimation
I
Nc A
H
r ofSection3. ForType
Iestimation,UU H
canbeprecomputedandstored,andtheextracomplexityisonlygiven
by the matrix transformation. For Type II estimation, extra complexity is alsoincurred
by explicit delay estimation and by the SVD required to calculateU.
Remark: reduction of the estimation MSE. By comparing (15) with (28), we
see that for optimal sequences, 2
struc
= 2
unstr
= =(KQN
c
), i.e., that the estimation MSE
reduction provided by the knowledge of the a prioristructure of the channel is given by
the ratio between thenumberof structured and unstructured unknown parametersto be
estimated.
The MSE reduction provided by Type I structured estimation can be coarsely eval-
uated by the following intuitive argument. If the channel delay-spread is considerably
larger than the chip-shaping pulse duration, we get that Q < D N
c
Q. The Q D
matrices
k;`
dened in (18) for ` =0;:::;N
c
1 are all obtained by samplingthe ban-
dlimitedpulse (t). Each of thesematriceshas full row-rankQ. However, since the rows
of
k;`
for` >0 areobtained by sampling (t `=W)atrate W (largerthan the Nyquist
rate),fortheNyquistsamplingtheorem[5]theyalllieinthelinearspacegeneratedbythe
rows of
k;0
. This is not exactly true because of truncation, however we expect that the
numberofdominantsingularvaluesoftheN
c
QDcombinedmatrix[ T
k;0
;:::; T
k;N
c 1
] T
is Q, so that the rank 0
of is KQ. This yields a MSE reduction factor of about
1=N
c
. Since N
c
2,the improvement inthe estimation MSEis about 3 dB orlarger.
Fig.3shows the squaredsingularvaluesof[ T
k;0
;:::; T
k;Nc 1 ]
T
forN
c
=2and4,Q=
32and RRCchip-shapingpulse with roll-o =0:22 truncated over asupportof=12
values. In the results of Section 5 we use a threshold criterion for rank determination.
Namely, the rank is estimated as the largest index for which the corresponding squared
singular value is not smaller than 10 3
times the sum of all squared singular values. In
the example ofFig. 3this criterionyieldsanestimated rankequalto 33,inagreementto
what anticipatedby the above intuitive argument.
The MSE reduction provided by Type II structured estimation (with perfect delay
knowledge) is easilyevaluated by noticing that the rank of is not larger than the sum
ofalluserpaths,i.e., 00
KP. Then,thereductionfactorprovidedbyTypeIIestimation
is at least P=(N
c
Q). Forchannels with a smallnumber of dominant paths much smaller
thanthedelay-spread(expressedinchips),suchasinmostETSImodels[29],thepotential
gain ofTypeIIestimation islarge. However, thisdependscriticallyonthechannelmodel
considered and onthe ability of estimating the delays of the dominant pathsof allusers.
4.5 Delay estimation
In this sectionwepropose apathdelayestimatorderived fromthe MLestimation of
=
f
k;p
: k = 1;:::;K; p =0;:::;P 1g fromthe observation bg, given by the estimated
discrete-timelow-pass lteredchannelimpulseresponsesprovided eitherby unstructured
or by Type I structured estimation. This method can be used in the preliminary delay
estimation of the Type II structured channel estimatordescribed inSection4.3.
Consider the preliminarychannel estimategb obtained either by the unstructured es-
timator of Section3 or by the Type I structured estimator. For both estimators we can
write
b
g =g+e=c+e (29)
where e N
C
(0;),independent of the vector of channel gains c. We assume Rayleigh
fading, so that c is also N
C
(0;), where the covariance matrix denes the average
energy for each path (delay-intensity prole [5])) and the correlation between the path
gains. Forgiven delay vector, isxed and bg isconditionallyN
C (0;R
g
()),where
the conditionalcovariancematrix is given by
R
g
()=() () H
+
Withthe abovestatisticalmodel, the MLestimateof based onthe observation bg is
given by
b
=arg max
b
g H
R 1
g (
)b
g logDet(R
g
()) (30)
This requiresthe maximizationof thelog-likelihoodfunctionoveraKP-dimensionalreal
space. Moreover, the log-likelihoodfunction depends on via a matrix inversion.
Inordertodecreasecomplexity,wemakesomesimplicationsand assumptions. First,
we assume a white error vector (i.e., = 2
e
I). This assumption holds exactly for the
unstructured estimator of Section 3 and approximately for the structured Type I esti-
mator if an optimal training sequence set is used. Next, we assume that the channel
gains for dierent users are mutually independent and that each user channel obeys the
uncorrelated-scattering model[5], so that E[c
k;p c
k 0
;p 0]=
2
k;p Æ
k;k 0Æ
p;p 0
. Then, aftersuitable
reordering ofthe componentsof b
g andby exploitingthe block-matrixstructure of,the
ML joint delayestimator(30) reduces toK independent MLestimators for the
k 's(the
delays of user k),given by
b
k
=arg max
k n
b g
(k)
H
R 1
k (
k )bg
(k)
logDet(R
k (
k ))
o
(31)
where
b g
(k)
=(bg
k;0
[0];:::;bg
k;N
c 1
[0];bg
k;0
[1];:::;bg
k;N
c 1
[1];:::;bg
k;0
[Q 1];:::;bg
k;N
c 1
[Q 1]) T
where
R
k (
k )=
P 1
X
p=0
2
k;p k;p
H
k;p +
2
e
I (32)
and where thevector
k;p
contains the samplesofthe chip-shapingpulse delayed by
k;p .
Namely,the i-th elementof
k;p
is given by
[
k;p ]
i
= (i=W
k;p )=
p
W (33)
for i=0;:::;QN
c 1.
The solution of (31) still requires maximizationover a P-dimensional real space. In
order toobtainanadditionalsimplication,weassumethat the delays
k
are suÆciently
separated, i.e., that min
p6=p 0
j
k;p
k;p 0j
>T
c
. Since the support of (t)has limitedsize
T
c
, the vectors
k;p and
k;p 0
have disjointsupport, i.e., their non-zero components are
indierentpositions. Therefore, H
0
=Æ
p;p 0
(recallthat, because ofthe bandlimited
assumption, j
k;p j
2
= 1
W P
QN
c 1
i=0
j (i=W
k;p )j
2
' R
j (t)j 2
dt = 1). Under this \sep-
arated delay" assumption, by applying the matrix inversion lemma [25] to (32) we can
write
R 1
k (
k )=
1
2
e
"
I P 1
X
p=0
2
k;p
2
e +
2
k;p k;p
H
k;p
#
(34)
Also, because of the disjoint support, R
k (
k
) is block diagonal with P blocks of size
N
c
N
c
with the form
2
k;p vv
H
+ 2
e
I (35)
where v contains the N
c
non-zero samples of (t). Since in general PN
c
QN
c ,
R
k (
k
) contains alsosome other diagonalblocks with diagonal elements allequal to 2
e .
Hence, Det(R
k (
k
)) isgiven by
Det(R
k (
k
)) =
2QNc
e
P 1
Y
p=0 Det
2
k;p
2
e vv
H
+I
=
2QNc
e
P 1
Y
p=0
2
k;p
2
e +1
(36)
and it is independent of
k
. By neglecting the terms independent of
k
, the following
approximated MLestimator isobtained:
b
k
=arg max
k P 1
X
p=0
2
k;p
2
e +
2
k;p
H
k;p b g
(k)
2
(37)
Wehavenotescaped aP dimensionalmaximizationandtypicallyindependentmaximiza-
tion of each term in (37) yieldsb
k;p
=b,for all p=0;:::;P 1,where bislocatedwith
high probability in the vicinity of the maximum peak of the sampled observed channel
response b
g (k)
. Also, we observe that in practice, both the delay-intensity prole and P
are unknown. However, we propose an approximated algorithm which requires only a
sequence of P one-dimensionalmaximizations.
AssumeP known. First,deneadelay discretizationstep, suchthat T
c
= =N
is an integer multiple of N
c
. Then, for all j = 0;:::;QN
1, dene the vectors v
j of
length QN
c
with i-th component
[v
j ]
i
= (i=W j)=
p
W
for i = 0;:::;QN
c
1. Clearly, v
j
=
k;p
if j =
k;p
, for some j, while if
k;p
is not
vectorw
0
=gb (k)
andtheset ofdelayindexesS
0
=f0;:::;QN
1g. Forp=0;:::;P 1,
repeat the following steps:
1. Estimate the p-th delay asb
k;p
= b
j
p
, where
b
j
p
=arg max
j2Sp
v H
j w
p
2
(38)
2. Eliminate the eect of the p-th delay from the observed channel impulse response
as
w
p+1
=w
p v
H
b
j
p w
p
jv
b
j
p j
2
!
v
b
jp
(39)
3. Update the delay index set as
S
p+1
=S
p f
b
j
p N
+1;:::; b
j
p +N
1g (40)
Remark: algorithminterpretation. Ifthe\separateddelay"assumptionholds,we
couldmaximize(37)bysearchingoverallpossiblecombinationsofsupportsfS
0
;:::;S
P 1 g
of size kT
c
, such that S
p
[0;QT
c
] and S
p
\S
p 0
= ;, and by maximizing each term
j H
k;p b g
(k)
j 2
with respect to
k;p 2 S
p
. While the \separated delay" assumption is instru-
mental for reducing the likelihood function to a manageable form, it is not generally
satised. We argue that as the delays are more and more separated, the true likelihood
function is closer and closer to the approximation (37). Obviously, if some paths are
removed from the channel response, the separation between the delays of the remaining
paths is increased and the likelihoodfunction for these delays is better approximated by
(37). Then, it is meaningfulto look for the delays in order, starting fromthe path with
the largest gain, and remove it at each p-th iteration the path corresponding to the p-th
estimated delay by projecting the observation vector ontothe orthogonalcomplementof
v
b
jp
. Allinnerproductsv H
j w
p
in(38)canbecomputedbyconvolvingw
p
withapolyphase
version of the chip-matched lter (t), sampledat frequency N
=T
c
. Notice that N
can
bemuchhigherthanN
c
,sothatthetimingresolutionofthedelayestimatorisnotlimited
by the receiversamplingfrequency. Foreachp, weestimate
k;p
asthe multipleof for
whichthe outputof the polyphasechip-matchedlter has maximumsquared magnitude.
Becauseofnoiseandnitetimingresolution,thecancellationofthep-thpatheect isnot
perfect. If thep-th pathgain ismuchlarger thanthe remainingpath gains,the energyof
the p-th pathafterimperfectcancellationmightbestilllargerthanthe energy ofremain-
ingpaths. Thispreventsthedetection ofotherpaths. Weobservethat,forincorrectpath
cancellation, the delay estimator for
k;p+1
typically nds a non-existent delay (outlier)
veryclose toone of the previously estimateddelays b
k;0
;:::;b
k;p
. In orderto prevent this
eect, in(40) weeliminatefromthe set of alloweddelay indicesthe indicesdieringfrom
b
j
p
by less than N
. In this way, we forcethe delay estimatorto nd delays separated by
at least one chip interval. Notice that alsoa conventional rake receiver cannot individu-
ally track delays separated by less than one chip. In fact, since the channel is convolved
with (t),whichhasbandwidthonlyslightlylargerthan1=(2T
c
),individualestimationof
delays separated by less than one chip inessence violatesthe Nyquist samplingtheorem.
In practice, P is not known a priori. We can either x an arbitrary value for the
maximumnumber b
P ofpathsperuser(inthiscase,wemaymisssignicantcomponentsfor
someusers whilewewastecomplexityforotherusers)orwecanintroduceastoppingrule
inthedelaysearch. Namely,afterremovingthep-thestimatedpath,ifjw
p+1 j
2
=jw
0 j
2
,
delay (p+1)is searched, otherwise the algorithmis terminatedand the number of paths
for user k is set to p. The threshold should be designed according to the estimation
MSE of the preliminary estimate bg, which is known a priori. Basically, we should stop
the delaysearch whenw
p+1
containsonlythe estimationnoise,plus someextra noisedue
to imperfect cancellationof the alreadyestimated paths.
5 Performance with linear detectors
In order to investigate the impact of channel estimation on the receiver performance,
we consider simple FIR (or \one-shot") linear receivers [2, 22]. After low-pass ltering
and sampling at rate W, the samples of the received signal are input to a shift-register
of length M
1 +M
2
+1, for suitable integers M
1
and M
2
. For each m-th symbol the
register contains the window of samples [mLN
c M
1
;mLN
c +M
2
] (referred to as the
receiver \processing window") centered around the m-th symbol interval. The content
of the receiver processingwindowis used toproduce soft estimates of the users symbols,
which are sent to the bank of single-user decoders. Since the channel delay-spread can
be larger than the symbol interval, the processing window should span more than one
symbolinterval,i.e., M +M +1LN [30,31]. Aftersome straightforwardbut tedious
algebra (see [32, 30] for the details), we can write the content of the receiver processing
windowcorresponding tothe m-thsymbolas the vector
r[m] = K
X
k=1 B
2
X
n= B1 s
k
[m;n]b
k
[m n]+[m] (41)
where [m]N
C (0;N
0
I)and wherethe signalvectorss
k
[m;n] are given by the convolu-
tionofthesegmentofthek-thchipsequence correspondingtothe(m n)-thsymbolwith
the k-thdiscrete-timeoverall channel response. Forperiodicspreading, s
k
[m;n] doesnot
dependonm [33]. Thevectorchannelmodel(41)takesintoaccountboth multiple-access
and inter-symbol interference (MAI and ISI) [32]. In fact, each user contributes to r[m]
with B
1 +B
2
+1 symbols. The integersB
1
and B
2
are relatedtotheprocessing windows
boundaries M
1
and M
2
by [30]
B
1
=
M
1 +N
c
(L+Q 1) 1
N
c L
B
2
=
M
2
N
c L
(42)
The receivermakes use ofthe channelestimates obtained by one of the proposed estima-
tors and computes the signal vectors bs
k
[m;n] by convolving the estimated k-th channel
response with the appropriatesegment of the k-th chip sequence. In the case of periodic
spreading, bs
k
[m;n] can be computed once and used for the whole block. For aperiodic
spreading s
k
[m;n] must be computed for every symbol intervalin the block.
Without loss of generality, we focus on the linear detection of symbol b
1
[m]. A soft
estimateof b
1
[m] isobtained by the linear lteringoperationz
1
[m]=h
1 [m]
H
r[m], where
h
1
[m] is the vector of lter coeÆcients for the m-th symbol of user 1. The SINR at the
lter output isgiven by [15]
SINR
1 [m]=
jh
1 [m]
H
s
1 [m;0]j
2
N
0 jh
1 [m]j
2
+ P
K
k=1 P
B
2
n= B
1 jh
1 [m]
H
s
k [m;n]j
2
jh
1 [m]
H
s
1 [m;0]j
2
(43)
Here, we consider SUMF and LMMSE ltering [22]. Approximations of the SUMF and
of the LMMSElters are obtained fromthe vectorsbs
k
[m;n] as
b
h
1;sumf
[m] = bs
1 [m;0]
b
h
1;mmse
[m] = b
R 1
r [m]bs
1
[m;0] (44)
where
b
R
r [m]=
K
X
k=1 B2
X
n= B
1 bs
k
[m;n]bs
k [m;n]
H
+N
0 I
is the covariance matrix of r[m] assumingbs
k
[m;n] =s
k [m;n].
We consider a system with K = 8 users, spreading gain L = 16, RRC chip-shaping
pulse with roll-o = 0:22, truncated over =12 chips, and maximum channel delay-
spreadof=20T
c
. Withthisdelay-spread,user blocksmaybemisalignedbymorethan
onesymbolandISIisnotatallnegligible,incontrasttowhatisnormallyassumedinmost
DS/CDMAliterature. ThemaximumchannellengthisQ=20+12=32chips. Wechoose
traininglengthM =256, i.e.,the minimumlength necessary forestimating 8channelsof
length 32. For each block and for each user k, a random channel impulse response c
k (t)
is generated according to the model (1), where the number of paths P is random and
uniformly distributed over the integers f1;:::;6g, the delays
k;p
are independently and
uniformlydistributed on [0;] and the gains c
k;p
are independent N
C (0;
2
k;p
), where
2
k;p
=A
k exp
k;p
min;k
max;k
min;k
(
min;k and
max;k
denotetheminimumandthemaximumdelaysofuserk). Theamplitude
factor A
k
is chosen such that P
P 1
p=0
2
k;p
=E
k
,where E
k
=N
0
isthe received average SNR
of user k.
AsillustratedintheIntroduction, thesystem spectraleÆciency (assumingsingle-user
decoding) is determined by the SINR cdf. We use Monte Carlo simulation to evaluate
the cdfF
sinr
()=Pr(SINR
1
[m]) over5000independent slots. Figs. 4and 5showthe
SINR cdf for the SUMF receiver in the case of slowand fast power control, respectively.
In the case of slowpowercontrol, E
k
=N
0
is set to10 dBfor allusers withoutany further
normalization of the path gains. Then, the instantaneous user received power uctuates
considerably from block to block, because of the uncompensated Rayleighfading. In the
case of fast power control, E
k
=N
0
is set as before but the channel gains are normalized
such that P
P 1
p=0 jc
k;p j
2
=E
k
. This could be obtainedin practice by a TDDsystem where
each user measures the signal powerreceived on the downlink and exploits reciprocity in
order to compensate instantaneously for the Rayleigh fading block by block. The SINR
cdf yieldsthe block outageprobability,dened asthe probabilitythat the SINRis below
a xed threshold . The horizontal dashed line corresponds to F
sinr
() = 10 1
. For
1
unstructured estimator, about 0.6 dB for the Type I structured estimator and 0.2 dB
for the Type II structured estimator(the latter employs the delay estimatordescribed in
Section4.5, with axednumberofpaths peruser b
P =4). Forcomparison,the SINR cdf
for Type II estimation with perfect knowledge of the delays is shown. The loss between
perfect and imperfect delay knowledge isabout 0.1dB.
Figs. 6 and 7 show analogous results for the LMMSE receiver. In this case, for a
10 1
outage probability, non-ideal channel knowledge incurs about a 4.5 dB loss for the
unstructured estimator,about 2.6dB for the TypeI structured estimatorand 1.0dB for
the Type II structured estimator. The losscaused by imperfect delay knowledge for the
Type II estimatorisabout 0.4 dB.
Finally,Figs. 8and9showresults forthe LMMSEreceiverassumingimbalanceduser
powers. Namely, we assume that E
k
=N
0
=10dB for users k = 1;:::;4 and E
k
=N
0
=20
dB for users k =5;:::;8. Slow and fast power controlare considered, respectively. The
imbalanced-power case ismotivated here by the possibilityof accommodatingusers with
dierent service requirements. In this case, the performance of all estimators is almost
unchanged with respect to the case of imbalanced powers, with the dierence that the
degradation suered from imperfect delay knowledge in Type II estimation is slightly
increased (about 0.6 dB). Performance for the SUMF in the imbalanced-power scenario
isnot shown since theresultingSINR isverylow, and thereislimitedutility inusingthe
SUMF forlow-powerusers in the presence of high-power users [22].
Some comments are in order: 1) the estimators presented in this paper are almost
insensitivetouser powerimbalance(although strictly-speakingTypeII estimation isnot
near-far resistant, because of the bias introduced by non-ideal delay estimation). This
is due to the fact that, since the training matrix A has full column-rank, the channels
are estimatedinmutuallyorthogonalsubspaces, sothatusers withhigher received power
do not create larger interference for the estimation of lowerpower users. In comparison,
conventional rake-based channelestimation isnot robust to largeuser power dierences.
2) The slight degradation of Type II estimation with respect to the case of balanced
powers can be explained by the fact that only a xed number of paths b
P per user are
takenintoaccount. Ifauser hasmorethan b
P paths, theeect ofthesepathsisnot taken
into account in the approximated LMMSE lter (44). For high-power interfering users,
the eect of neglected paths is more evident. However, the degradation shown in our
simulations is very small, and itcould be reduced further by allowing avariable number
of estimated paths peruser as explainedat the end of Section 4.5.
3)Conventional unstructured estimationprovesto be suÆciently goodfor SUMF de-
tection. On the contrary, for LMMSE detection is makes sense to consider improved
channel estimation such as the structured Type I and Type II schemes proposed here.
In fact, it is well-known that in a multipath environment channel estimation errors have
a much larger impact on LMMSE than on SUMF, especially for large SNR (see [34]).
Intuitively, we can explain this fact by considering that the LMMSE for large SNR is
similar to the decorrelator [22], which projects the received signal onto the orthogonal
complement of the subspace spanned by interference. Non-perfect knowledge of the in-
terference subspace caused by estimation errors yields a large SINR degradation. On
the other hand, the SUMF is insensitive to the knowledge of the interference subspace,
since it treats interference as white noise, and SINR degradation is caused only by the
non-perfectknowledgeofthe usefulsignal. Notice thatthiseect cannotbeobserved just
by consideringthe estimation MSEas performance measure.
6 Conclusions
In this paper we have considered training sequence-based joint channel estimation for a
block-synchronous chip-asynchronous system motivated by the T-CDMA/TDD proposal
of3rdgenerationsystems. Wereviewedtheunstructuredchannelestimatorcurrentlypro-
posed for T-CDMA/TDD and we derived new structured channel estimators, exploiting
variouslevelsofa prioriinformation. Theproposedstructured channelestimatorscan be
applied to any arbitrary (approximately bandlimited) chip-shaping pulse, as opposed to
most methodsthat assume rectangular pulses.
The structured channel estimator that exploits the ne structure of the multipath
channels, i.e., the number of paths and their delays, requires accurate delay estimation.
Then, startingfrom the ML criterion, we derived a low-complexity delay estimatorthat
provesto be suited forstructured channelestimation.
Forallmethods, we found optimalsets of training sequences, existing for any desired
training length. The same sequence set is optimal for both unstructured and structured
estimation. If these sequences are adopted in standardization, it will be up to the de-
signer of the base-station equipment to implement the lower complexity unstructured
estimation orthe higher complexity structured estimation, according to the desired per-
formance/complexity trade-o.
Wecompared the proposed channel estimatorsby lookingatthe outputSINR ofsim-
ple linear detectors like the SUMF and the LMMSE receivers. Our simulations show
that whileunstructured estimationmightbesuÆciently goodforthe SUMF,itslosswith
respect to ideal channel knowledge might be too large in the case of LMMSE. On the
contrary, the performance lossofstructured estimation issmall alsowithLMMSE detec-
tion. Even though not considered here, we expect that alsoparallelor serial interference
cancellationschemes [22] are sensitive tochannelestimation errors,since imperfectchan-
nel knowledge prevents complete interference cancellation. Then, the higher complexity
of structured channelestimators proposed inthis papermightbe motivated andjustied
for receivers employingmultiuser detection.
Acknowledgements
Theauthorswouldliketoacknowledgethefruitfulandfriendlydiscussionswithprofessors
DirkSlock and Pierre Humblet.
References
[1] 3GPP {TS25.105V3.1.0,\3GPP-TSG-RAN-WG4;UTRA(BS)TDD;RadioTrans-
mission and Reception," ETSI (Availableat www.etsi.org/umts), Dec. 1999.
[2] A. Klein, G. Kawas Kaleh, P. Baier,\Zero forcing and minimum mean-square error
equalizationformultiuserdetectionincode-divisionmultiple-accesschannels,"IEEE
Trans. on Vehic. Tech., Vol. 45,No. 2,pp. 276 {287, May 1996.
[3] B. Steiner,P.Jung,\Optimumandsuboptimum channelestimationfortheuplinkof
CDMA mobile radio systems with joint detection," European Trans. on Commun.,
Vol.5, No. 1,pp. 39 { 49,Jan.{Feb., 1994.
[4] T. Rappaport, Wireless Communications.EnglewoodClis,NJ:Prentice-Hall, 1996.
[5] J. G.Proakis, Digital Communications.New York,NY: McGrawHill, 3rd ed., 1995.
[6] S. Crozier, D. Falconer,S. Mahmoud, \Least sum of squared errors(LSSE) channel
[7] A. Clark, Z. Zhu, J. Joshi, \Fast start-up channel estimation," IEE Proc.- F, vol.
131, pp. 375 { 382, July1984.
[8] S.Bensley,B.Aazhang,\Subspace-based channelestimationforCode DivisionMul-
tiple Accesscommunicationsystems," IEEETrans.onCommun., Vol.44,No.8,pp.
1009 { 1020, Aug.1998.
[9] S. Bensley, B. Aazhang, \Maximum-Likelihood synchronization of a single user for
Code DivisionMultipleAccess communicationsystems," IEEETrans.onCommun.,
Vol.46, No. 3,pp. 392 { 399, March 1996.
[10] E. Strom, S. Parkvall, S. Miller, B. Ottersten, \Propagation delay estimation in
asynchronousDirect-SequenceCode-DivisionMultiple-Accesssystems,"IEEETrans.
on Commun., Vol. 44,No. 1,pp. 84 {93, Jan. 1996.
[11] E.Ertin,U.Mitra,S.Siwamogsatham,\Maximum-Likelihoodbasedmultipathchan-
nel estimation for code-division multiple-access systems," submitted to the IEEE
Trans. on Commun.,June 1998.
[12] B. C. Ng,M. Cedervall,A. Paulraj, \Astructured channel estimation for maximum
likelihoodsequence detection," IEEE Commun. Letters,pp. 52{ 55,March 1997.
[13] B. C. Ng, D. Gesbert, A. Paulraj, \A semi-blind approach to structured channel
equalization," Proc. of IEEE Intern. Conf. on Commun. (ICC) 1997, pp. 3385 {
3388, Atlanta,7 {11 June, 1998.
[14] W.Mow,Sequencedesignforspreadspectrum,TheChineseUniversityPress,Chinese
University of Hong Kong, Shatin,Hong Kong,1995.
[15] D. Tse and S. V.Hanly, \Linearmultiuserreceivers: Eective interference, eective
bandwidth andcapacity,"IEEE Trans.Inform. Theory, Vol.45,No.2, pp.641{657,
March 1999.
[16] S. Verduand S.Shamai(Shitz), \Spectral eÆciency of CDMA with randomspread-
ing," IEEE Trans.Inform. Theory,Vol.45, No.2, pp. 622{640,March 1999.
[17] S. Verdu and S. Shamai (Shitz), \The eect of frequency-at fading on the spec-
Also: \Capacity of CDMA fadingchannels," 1999 IEEE Inform. Theory Workshop,
Metsovo, Greece,June 27-July 1,1999.
[18] E. Biglieri, G. Caire, G. Taricco and E. Viterbo, \How fading aects CDMA: an
asymptotic analysis with linear receivers," to appear on IEEE Journ. Sel. Areas on
Commun. (wireless series),2000.
[19] E.Biglieri,J.ProakisandS.Shamai(Shitz),\Fadingchannels: Information-theoretic
and communications aspects," IEEE Trans. on Inform. Theory, Vol. 44, No. 6, pp.
2619{2692, Oct. 1998.
[20] C. Berrou and A. Glavieux, \Near optimum error-correcting coding and decoding:
Turbo codes," IEEE Trans.on Commun., Vol. 44,No. 10,Oct. 1996.
[21] T.RichardsonandR.Urbanke,\Thecapacityoflow-densityparitycheckcodesunder
message passing decoding," submitted toIEEE Trans. on Inform. Theory,1998.
[22] S. Verdu, Multiuser Detection. New York, NY: Cambridge University Press, 1998.
[23] P. Lancaster and M. Tismenetsky, The Theory of Matrices, 2nd Edition, New York:
Academic Press, 1985.
[24] 3GGP { TS 25.224 V3.1.0,\3GPP-TSG-RAN-WG1; Physical Layer Procedures
(TDD)," ETSI (Available atwww.etsi.org/umts), Dec. 1999.
[25] S. Kay, Fundamentals of statistical signal processing: Estimation theory. New York:
Prentice-Hall, 1993.
[26] S. Grant, J. Cavers, \Multiuser channel estimation for detection of cochannel sig-
nals," Proceedings of 1999 IEEE Intern. Conf. on Commun. ICC '99, Vancouver,
BC, Canada, 6 { 10June, 1999.
[27] J. Ng,K.BenLetaief, R.Murch,\Complexoptimalsequences with constantmagni-
tude for fast channel estimation initialization," IEEE Trans. on Commun., Vol. 46,
No. 3, pp. 305 { 308, March1998.
[28] G. Golub, C. Van Loan, Matrix computations. The John Hopkins University Press,
2nd ed., 1989.
[29] COST207 ManagementCommittee,\COST 207: Digitallandmobileradio commu-
nications (nal report)," Commission of the European Communities, 1989.
[30] G. Caire, \Two-stage non-data aided adaptive linear receivers for DS/CDMA," to
appear on IEEE Trans. on Commun., Oct. 2000.
[31] R. De Gaudenzi, F. Giannetti, M. Luise, \Design of a low-complexity adaptive
interference-mitigatingdetectorforDS/SSreceiversinCDMAradionetworks,"IEEE
Trans. Commun., Vol. 46,pp. 125 {134, Jan. 1998.
[32] X. Wang and V. Poor, \Blind equalization and multiuser detection in dispersive
CDMA channels," IEEE Trans. onCommun., Vol.46,No.1,pp. 91{103,Jan. 1998.
[33] A. J. Viterbi, CDMA { Principles of Spread Spectrum Communication. Reading,
MA: Addison-Wesley, 1995.
[34] J. Evans and D. Tse, \Large system performance of linear multiuser receivers in
multipath fading channels," to appear on IEEE Trans. on Inform. Theory, 2000.
Also presented at Intern. Sympos. on Inform. Theory ISIT 2000, Sorrento, Italy,
June 2000.
+
Ψ (t) c (t)
User 1
1
Ψ (t) c (t)
User 2
2
Ψ (t) c (t)
User K
K
Σ
ν (t)
LPF
Sampling at rate W
Figure1: Baseband equivalent representation of a DS/CDMA system.
Transmitted user signal block
Data Training Data
Q-1 chips
M chips
Total delay-spread: Q chips User delay
Convolution with the channel
Received user signal block
Data Samples for channel estimation Data
Q-1 chips
M chips
Guard
Figure 2: Signal block structure beforeand after convolution with the channel of length
at most Q chips.
-6 -5 -4 -3 -2 -1 0 1
0 20 40 60 80 100 120 140
Singular values (log-scale)
Singular value index Q=32, RRC pulses, α=0.22
N c =2 N c =4
Figure 3: Normalizedsquared singularvaluesfor the chip-shapingpulse convolutionma-
trix, for Q=32, N
c
=2;4, and RRC pulse with roll-o=0:22.
0 0.2 0.4 0.6 0.8 1
-15 -10 -5 0 5 10
SINR cdf
SINR (dB)
K=8, L=16, M=256, slow power control, SUMF Ideal
Unstr Str.Type I Str.Type II,perf.
Str.Type II,est.
Figure 4: SINR cdf with slow power control and SUMF receiver, with dierent channel
information: Idealknowledge(Ideal),Unstructuredestimation(Unstr.),TypeIstructured
estimation (Str.Type I), Type II structured estimation with perfect delays (Str.Type II,
perf.) and with estimated delays ,(Str. Type II, est.).
0 0.2 0.4 0.6 0.8 1
-2 -1 0 1 2 3 4 5 6 7
SINR cdf
SINR (dB)
K=8, L=16, M=256, fast power control, SUMF Ideal
Unstr Str.Type I Str.Type II,perf.
Str.Type II,est.
Figure 5: SINR cdf with fast power control and SUMF receiver, with dierent channel
information: Idealknowledge(Ideal),Unstructuredestimation(Unstr.),TypeIstructured
estimation (Str.Type I), Type II structured estimation with perfect delays (Str.Type II,
perf.) and with estimated delays ,(Str. Type II, est.).