Comparison between unitary and causal approaches to backward adaptative transform coding of vectorial signals

(1)

COMPARISON BETWEEN UNITARY AND CAUSAL APPROACHES TO BACKWARD ADAPTIVE TRANSFORM CODING OF VECTORIAL SIGNALS

David Mary, Dirk T.M. Slock

Institut Eur´ecom, 2229 route des Crˆetes, B.P. 193, 06904 Sophia Antipolis Cedex, FRANCE Tel: +33 4 9300 2916/2606; Fax: +33 4 9300 2627

e-mail:

^f

mary,slock

^g

@eurecom.fr

ABSTRACT

In a transform coding framework, we compare the optimal causal approach (LDU, Lower-Diagonal-Upper) to the optimal unitary approach (Karhunen-Loeve Transform, KLT). The criterion of merit used for this comparison is the coding gain, defined for a transformation^T as the ratio of the average distortion obtained with the identity transformation over the average distortion obtained with

T. Both transforms are known to yield the same gain when they are computed on the signal covariance matrix^R. The purpose of this paper is to compare the behavior of these two transformations when the ideal transform coding scheme gets perturbed, that is, when only an estimate^R⁺^Rof^Ris known. In this case, not only the transformation itself will be perturbated, but also the bit allocation mechanism. We compare the two approaches in two cases. Firstly,^Ris caused by a quantization noise : the coding scheme is based on the statistics of the quantized data. We find that the coding gain in the unitary case is higher than in the causal case. In a second case,^Rcorresponds to an estimation noise : the coding scheme is based on an estimate of^Rbased on a finite amount of available data. In this case, both causal and unitary approaches are strictly equivalent, because of the unimodularity and decorrelating properties of the transformations. Simulations results confirming the predicted behavior of the coding gains with perturbations are reported.

1. INTRODUCTION

Consider a stationary Gaussian vectorial source^fX^g. This source may be composed of any scalar sources^fxi

g, for example audio signals. In the classical transform coding framework, a lin- ear transformation^T is applied to each N-vector^Xto produce an N-vector^Y ⁼^T^Xwhose components^yⁱare independently quantized using a scalar quantizer^Qi. A number of bits^riis attributed to each^Qi under the constraint

P

i r

i

= Nr. For an entropy constrained scalar quantizer of a Gaussian source, the high resolution distortion is E^(yi^q

(k );yi(k )) 2

= 2

q

i

= c2

;2r

2

y

i

, where

c= e

6

.

An important property of commonly used transformations is that, if a noise (for example quantization noise) is added to the signal^Y, then its power will be the same in the transform and in the signal domains. This property is sometimes referred to as ”unity noise Eur´ecom’s research is partially supported by its industrial partners:

Ascom, Swisscom, Thales Communications, STMicroelectron ics, CEGE- TEL, Motorola, France T´el´ecom, Bouygues Telecom, Hitachi Europe Ltd.

and Texas Instruments.

gain” property [3]. The coding gain for^Tis then defined as

G

T

= Ek

~

Xk 2

(I)

Ek

~

Xk 2

(T)

= Ek

~

Xk 2

(I)

Ek

~

Yk 2

(T)

(1)

where^Iis the identity matrix, and the notation^k^Xk^~ ²(T)denotes the variance of the quantization error on the vector^X, obtained for a transformation^T. The optimal bit allocation yields the well known distortion for the vectorial signal^fY^g: E^jj^Y^~^jj²T

= 1

N P

N

i=1

2

qi

=

Nc2

;2r

Q

N

i=1

2

y

i 1

N

=N 2

q.²q i

is independent ofⁱ, and the number of bits assigned to the i^thcomponent is^r⁺¹

2 l og2

2

y

i

Q

N

i=1

2

y

i

1

N

. In the next section, we recall the main characteristics of the optimal causal approach (LDU) when optimized on^R, and summarize the reasons why its performance is the same as the best unitary approach (KLT).

However, a backward adaptive coding scheme requires that nei- ther the transformation nor the parameters of the bit allocation are transmitted to the decoder. So suppose now that the coding scheme is based on^RXX^b ⁼^R⁺^Rinstead of^R, where^RXX^b is available at both encoder and decoder. Then the computed transformation will be^T^{^}⁼^T⁺^T, and the distortion will be proportional to the variances of the signals transformed by means of^T^{^}instead of^T, say²y⁰

i

. Moreover, the bits^ri should be attributed on the basis of estimates of the variances available at both encoder and decoder also, that is,⁽^T^{^}^RXX^b ^T^{^}⁾ⁱⁱ, where^(:)iidenotes the i^thdi- agonal element of^(:). Hence, we get the following measure of distortion for a transformation^T^{^}based on^R^bXX:

Ek

~

Yk 2

(

^

T)

=E N

X

i=1 c2

;2r + 1

2 log

2 (

^

T b

R

XX

^

T T

)

ii

( Q

N

i=1 (

^

T b

R

XX

^

T T

)

ii )

1

N ]

2

0

y

i : (2) where the expectation is w.r.t^Rin case it is non-deterministic. In the third section,we compare this distortion when^Ris caused by a quantization noise : the coding scheme is based on the statistics of the quantized data, under high resolution assumption. In the fourth part,^Rcorresponds to an estimation noise : the coding scheme is based on an estimate of^Rdue to a finite amount of^K vectors :^R^bXX

= 1

K P

K

i=1 X

i X

T

i . The fifth part is dedicated to simulation results.

2. OPTIMAL CAUSAL AND UNITARY APPROACHES WITHOUT PERTURBATION

In the causal case,^Y ⁼^LX ⁼^X^;^{L X}, where^LXis the ref- erence vector. The output^X^qis^Y^q⁺^LX. Note that the recon-

(2)

struction error^X^~equals the quantization error^Y^~ :

~

X=X;X q

=X;LX;Y q

=Y ;Y q

=

~

Y (3) as in the unitary case. As detailed in [2], the optimal^Lin terms of coding gain is such that^LRXX

L T

=diag f 2

y

1 :::

2

y

N gwhere

diag f:::grepresents a diagonal matrix whose elements are²y i

. In other words, the components^yiare the prediction errors of^xi

with respect to the past values of^X, the^X1:i;1, and the optimal coefficients^;L^i1:i;1are the optimal prediction coefficients.

Since each prediction error^yiis orthogonal to the subspaces gen- erated by the^X1:i;1, the^yiare orthogonal. It follows that^RXX⁼

L

;1

R

YY L

;T, which represents the LDU factorization of^RXX. Referring to (1), the coding gain without perturbation for the optimal causal transform can be written as

G (0)

L

= Ejj

~

Yjj 2

I

Ejj

~

Xjj 2

L

= Ejj

~

Xjj 2

I

Ejj

~

Yjj 2

L

=

det diag (R

XX )]

det diag (LR

XX L

T

)]

1

N

(4) where^{diag (R)}denotes here the diagonal matrix that corresponds to the diagonal of the matrix^R. Now, using the unimodulariry of

L,^{det(diag (}^RY^Y⁾⁾⁼^{det (}^RXX)⁼^det, whereis the eigenvalue matrix of^R^XX. The coding gain is

G (0)

L

=

det diag (RXX)]

det diag (LR

XX L

T

)]

1

N

=

det diag (RXX)]

det

1

N

=G (0)

V

(5) where^V denotes a KLT of^R^XX. Thus, for an optimal bit allocation, the coding gains of the KLT and the LDU are the same without perturbation for three reasons : both transformations en- sure that the power of the quantization error is the same in the transform and in the signal domains, they are totally decorrelating transforms, and finally they are unimodular.

3. QUANTIZATION EFFECTS ON THE CODING GAINS Suppose we compute the transformation on the basis of quantized data. The statistics of the quantized data are assumed to be per- fectly known in this section. In other words, we assume that the decoder disposes of an infinite number of quantized vectors^Xi^q, and hence of^RX

q

X

q. Under the assumptions of high resolution (uncorrelated white noise), optimal bit assignment and unity noise gain property of the transformation,^R⁼^E^X^~^X^~^T ⁼²q

I, where²q

= c2

;2r

Q

N

i=1

2

y

i 1

N. Thus, for^T^{^} ⁼^I^V^{^} and^L^{^}, we shall compute (the subscript^qrefers to quantization)

Ek

~

Yk 2

(

^

Tq )

= N

X

i=1 c2

;2r + 1

2 log2

(

^

TR

X q

^

T T

)

ii

( Q

N

i=1 (

^

TR

X q

^

T T

)

ii )

1

N ]

2

0

y

i (6)

3.1. Identity Transformation

In this case, the number of bits attributed to the quantizer^Qiis

r+ 1

2 l og2

(R

X q

X q)ii

( Q

N

i=1 (R

X q

)

ii )

1

N

, and the variance²yi⁰ are indeed

(RXX)ii. Thus

Ek

~

Yk 2

(Iq )

= N

X

i=1 c2

;2r + 1

2 log

2 (R

X q

)

ii

( Q

N

i=1 (R

X q

)

ii )

1

N ]

(R

XX )

ii (7)

= N

X

c2

;2r

(detdiag fRX q

X q

g) 1

N (R

XX )

ii

(RX q

X q)ii

: (8)

One shows that

N

X

i=1 (R

XX )

ii

(R

X q

)

ii

=tr f

;

I+ 2

q (diag R

XX )

;1

g

where^trdenotes the trace operator, and

det(diag RX q

X q)

=det(diag RXX)det (I+

2

q

(diag RXX)

;1

)

and we find

Ek

~

Yk 2

(Iq )

= Ek

~

Yk 2

(I) 1

N

(det(I+ 2

q (diag R

XX )

;1

)) 1

N

tr f

;

I+ 2

q (diag R

XX )

;1

g:

(9) The distortion is slightly increased because the bits allocated on the basis of variances of quantized signals are not the optimal ones (the variance of the quantization noise is not equal in each branchⁱ).

An approximation of (9) up to the second order of the perturbation gives

Ek

~

Yk 2

(Iq )

=c2

;2r

(detdiag fRXXg) 1=N

( N

i=1 (

2

q

(R

XX )

ii ))

1=N P

N

i=1 (1+

1

(R

XX )

ii )

;1

Ek

~

Yk 2

(I) (1+

4

q

N 2

( N;1

2 P

N

i=1 1

(R

XX )

2

ii

; P

N

i=1 P

j >i

1

(R

XX )

ii (R

XX )

j j ))

(10)

3.2. KLT

As observed in [4] also, if^Vdenotes a KLT of^RXX, then^V^(RXX+

2

q I)V

T

=+ 2

q I=

q, and^V is also a KLT of^RXX⁺q² I. Thus, the perturbation term q²

I on^RXX does not change the backward adapted transformation, and the variances of the transformed signals remain unchanged : ²

0

y

i

=i. However, the decoder estimates the variances^(V^RX

q

X q

V T

)

ii

=

i +

2

q, on the basis of which the coder assigns the bits^ri. Thus, the actual distortion is

Ek

~

Yk 2

(Vq )

= N

X

i=1 c2

;2r + 1

2 log

2 (VR

X q

X qV

T

)

ii

( Q

N

i=1 (VR

X q

X qV

T

)

ii )

1

N ]

(VR

XX V

T

)

ii

(11) wich can be computed in a similar way as in 3.1. We find

Ek

~

Yk 2

(Kq )

=Ek

~

Yk 2

(K) 1

N

(det(I+ 2

q (

;1

))) 1

N

tr f

;

I+ 2

q (

;1

)

;1

g:

(12) Again, the increase in distortion comes from the perturbation occuring on the bit allocation mechanism. Up to the second order of perturbation, an expression similar to (10) is

Ek

~

Yk 2

(Kq )

=c2

;2r

(detdiag fRXXg) 1=N

( N

i=1 (

2

q

i ))

1=N P

N

i=1 (1+

1

i )

;1

Ek

~

Yk 2

(K)

1+

4

q

N 2

( N;1

2 P

N

i=1 1

(

i )

2

; P

N

i=1 P

j >i 1

i

j )

(13) The corresponding expression for the coding gain is

G

Kq

=G 0

(det(I+ 2

q (diag R

XX )

;1

)) 1

N

tr f

;

I+ 2

q (diag R

XX )

;1

g

(det(I+ 2

q (

;1

))) 1

N

tr f

;

I+ 2

q (

;1

)

;1

g

(14) whose second order approximation is

GKqG 0

1+

4

q

N 2

( N;1

2 P

N

i=1 (

1

(R

XX )

2

ii

; 1

(

i )

2 )

; P

N

i=1 P

j >i (

1

(R

XX )

ii (R

XX )

j j

; 1

i

j ))]:

(15)

(3)

3.3. LDU

In the causal case, the coder uses a transformation^L⁰ such that

L 0

RX q

X qL

0

T

= R 0

YY. ^R⁰YY is the diagonal matrix of the es- timated variances involved in the bit allocation (^L⁰and^R⁰YY are both available to the decoder). In this case, the difference vector

Y = X;L 0

X

q, the quantization noise is filtered by the rows of^L⁰. Note that^Ek^Xk^~ ²L

0

q still equals^Ek^Y^~^k²L 0

q, since^X^~ ⁼

X q

;X=Y q

+L 0

X q

;X=Y q

;(X;L 0

X q

)=Y q

;Y =

~

Y. One shows that the actual variances of the signals^yiobtained with

L

0are^(L⁰^RX q

X qL

0

T

; 2

q I)

ii[2]. In this case, one finds

Ek

~

Yk 2

(L 0

q )

= P

N

i=1 c2

;2r + 1

2 log

2 (L

0

R

X q

X qL

0

T

)

ii

( Q

N

i=1 (L

0

R

X q

X qL

0

T

)

ii )

1

N ]

(L 0

RX q

X q

L 0

T

; 2

q I)ii

=Ek

~

Yk 2

(L) 1

N

(det(I+ 2

q (

;1

))) 1

N

tr f

I+ 2

q (R

0

;1

YY )

g:

(16) The increase in distortion comes not only from the perturbation occuring on the bit allocation mechanism but also from the filtering of the quantization noise. Up to the first order of perturbation,we obtain

Ek

~

Yk 2

(L 0

q ) Ek

~

Yk 2

(K)

"

1+

2

q

N (

N

X

i=1 1

i

; 1

2

y

i )

#

(17) where^N

]

(RXX)ii

Ek

~

Yk 2

(I) 1+ E

N;1

2N 2

P

N

i=1 (

(R)

ii

(R

XX )

ii )

2

;E 1

N 2

P

i P

j >i (R)

ii

(R

XX )

ii (R)

j j

(R

XX )

j j ]

(21) The expectation of the first term in (21) is

E N;1

2N 2

N

X

i=1

(R)ii

(R

XX )

ii

2

= N;1

2N 2

N

X

i=1

2(RXX) 2

ii

K(R

XX )

2

ii

= N;1

NK :

(22) The expectation of the second term is

E 1

N 2

P

i P

j >i (R)

ii

(R

XX )

ii (R)

j j

(R

XX )

j j

2

k P

i P

j >i (R

XX )

2

ij

(R

XX )

ii (R

XX )

j j

2

k k.

(diag fRXXg) 1=2

RXX(diag fRXXg) 1=2

k 2

F

(23) where^.(A)denotes the stritcly lower triangular matrix made with the strictly lower trianguler part of^A, and^kAk²Fthe squared Frobe- nius norm of^A. If^D⁼^{diag fR}XX

g, we obtain

E 1

N 2

P

i P

j >i (R)

ii

(R

XX )

ii (R)

j j

(R

XX )

j j

1

K (kD

; 1

2

RXXD

; 1

2

k 2

F

;kdiag fD

; 1

2

RXXD

; 1

2

k 2

F )

= 1

K

(tr fRXXD

;1

RXXD

;1

g):

(24) Finally, the expected distortion for Identity with estimation noise is, for sufficiently high^K,

Ek

~

Yk 2

(IK) Ek

~

Yk 2

(I)

1+ 1

K (1;

tr fRXXD

;1

RXXD

;1

N 2

g)

:

(25) 4.2. KLT

In the unitary case, the expected distortion is

Ek

~

Yk 2

(

^

Vk )

=E N

X

i=1 c2

;2r + 1

2 log

2 (

^

V b

R

XX

^

V K

)

ii

( Q

N

i=1 (

^

V b

R

XX

^

V T

)

ii )

1

N ]

(

^

VRXX

^

V T

)ii:

(26) Using the fact that^V^{^}^RXX^b ^V^{^}^T is diagonal, we can write (26) as

Ek

~

Yk 2

(

^

VK)

=Ec2

;2r

det

^

V b

RXX

^

V

1

N N

X

i=1 (

^

VRXX

^

V T

)ii

(

^

V b

R

XX

^

V T

)

ii

(27)

(4)

Because of the unimodularity of^V^{^}, the determinant in (27) may be written as

det

^

V b

RXX

^

V T

= det (RXX+R)

= (detRXX)det (I+R

;1

XX R)

(28) The sum in (27) may be written as

P

N

i=1 (

^

VR

XX

^

V T

)

ii

(

^

V b

R

XX

^

V T

)ii

=tr f(

^

V b

RXX

^

V T

)

; 1

2

^

VRXX

^

V T

(

^

V b

RXX

^

V T

)

; 1

2

g

=tr f(I+R

;1

XX R)

;1

g

(29) Thus, (27) is equivalent to

Ek

~

Yk 2

(

^

VK)

=Ek

~

Yk 2

(K) 1

N

E(det(I+R

;1

XX R))

1

N

tr f

;

I+R

;1

XX R

;1

g:

(30) Using similar developments as in the previous section, the expected distortion for the KLT when the transformation is based on^kvectors is finally, under high resolution assumption

Ek

~

Yk 2

(

^

Vk ) Ek

~

Yk 2

(K)

1+ N;1

2K +

N;1

NK

: (31) The associated coding gain is,where^D⁼^{diag fR}^XXg,

G

^

VK

= Ek

~

Yk 2

(IK)

Ek

~

Yk 2

(

^

VK) G

0

1; 1

KN 2

tr fRD

;1

RD

;1

g;

N;1

2K +

1

NK

:

(32) 4.3. LDU

As stated in the introduction of this section, the expected distortion with^L^{^}computed on^RXX^b is

Ek

~

Yk 2

(

^

LK)

=E P

N

i=1 c2

;2r + 1

2 log

2 (

^

L b

R

XX

^

L T

)

ii

( Q

N

i=1 (

^

L b

R

XX

^

LT)

ii )

1

N ]

(

^

LR

XX

^

L T

)

ii

=Ec2

;2r

det

^

V b

RXX

^

V

1

N

| {z }

=det b

R

XX

=det(R

XX )det(I+R

;1

XX R)

N

X

i=1 (

^

LRXX

^

L T

)ii

(

^

L b

R

XX

^

L T

)

ii

| {z }

=tr f(I+R

;1

XX R)

;1

g

=Ek

~

Yk 2

(

^

VK)

(33) where the equality concerning the determinants comes from the unimodularity of the transformations^L^{^}and^V^{^}. The equality concerning the trace comes from their decorrelating property. Thus, the distortion and coding gain with estimation noise are the same in the causal and the unitary cases, and are given up to the first order in^Kby (30) and (32).

5. SIMULATIONS

For the simulations, we used entropy constrained scalar quantiz- ers^Qi and real Gaussian i.i.d. vectors with covariance matrix

R

XX

= HR

AR1 H

T. ^R^AR1 is the covariance matrix of a first order autoregressive process with normalized correlation coeffi- cient.^His a diagonal matrix whose i^thentry is^(N^;ⁱ⁺¹⁾¹⁼³ (decreasing variances).

In Figure 1, the coding gain with quantization noise is plotted for KLT (upper curves) and LDU (lower curves) with ⁼ ^0:9,

N =4. The theoretical exact expressions are given by (14) and (18), and the approximated expressions by (15) and (19).

The coding gains in presence of estimation noise are compared for LDU and KLT in Figure 2, for^N ⁼⁴and⁼^0:9(mean over

100realizations). The observed behaviors of the transformation corresponds quite well to the theoretically predicted ones for^K a few tens.

2 2.5 3 3.5 4 4.5 5 5.5 6

3.2 3.25 3.3 3.35 3.4 3.45 3.5

Coding Gains with Quantization Noise −N=4 − Variances − AR1 − rho=.9

Rate (bit per sample)

Coding Gain over Identity

G0 Observed G LDU Observed G KLT Theoretical G LDU : exact Theoretical G1KLT : exact Theoretical G1 LDU : approx.

Theoretical G KLT : approx.

Fig. 1. Coding Gains vs rate in bit/sample.

10⁰ 10¹ 10² 10³

0.5 1 1.5 2 2.5 3 3.5

Coding Gain for KLT and LDU with Estimation Noise − rho=0.9 − N=4 − 100 realizations

Number of Vectors

Coding Gain over Identity

G0 Theoretical Gain Observed Gain LDU Observed Gain KLT

Fig. 2. Gains for KLT and LDU with estimation noise

6. REFERENCES

[1] D. Mary, Dirk T.M. Slock, “Causal Transform Coding, Gener- alized MIMO Prediction and Application to Vectorial DPCM Coding of Multichannel Audio,” WASPAA01, New York, USA, October 2001.

[2] D. Mary, Dirk T.M. Slock, “Vectorial DPCM Coding and Ap- plication to Wideband Coding of Speech,” ICASSP 2001, Salt Lake City, USA, May 2001.

[3] S.-M. Phoong, Y.-P. Lin, “Prediction-Based Lower Triangular Transform,” IEEE trans. on Sig. Proc., vol. 48, no. 7, July 2000.

[4] V. Goyal, J. Zhuang, M. Vetterli, “ Transform Coding with Backward Adaptive Updates,” IEEE trans. on Inf. Theory, vol. 46, no. 4, July 2000.