Mathematical Methods and Algorithms for
Signal Processing
Todd K. Moon
Utah State University

Wynn C. Stirling
Brigham Young University

PRENTICE HALL
Upper Saddle River, NJ 07458
Library of Congress Cataloging-in-Publication Data

Moon, Todd K.
Mathematical methods and algorithms for signal processing / Todd K. Moon, Wynn C. Stirling.
p. cm.
Includes bibliographical references and index.
ISBN 0-201-36186-8
1. Signal processing--Mathematics. 2. Algorithms. I. Stirling, Wynn C. II. Title.
TK5102.9.M63 1999 621.382'2'0151--dc21
99-31038 CIP

Publisher: Marcia Horton
Editorial director: Tom Robbins
Production editor: Brittney Corrigan-McElroy
Managing editor: Vince O'Brien
Assistant managing editor: Eileen Clark
Art director: Kevin Berry
Cover design: Karl Miyajima
Manufacturing manager: Trudy Pisciotti
Assistant vice president of production and manufacturing: David W. Riccardi

© 2000 by Prentice-Hall, Inc.
Upper Saddle River, New Jersey 07458

All rights reserved. No part of this book may be reproduced in any form or by any means, without permission in writing from the publisher.

The author and publisher of this book have used their best efforts in preparing this book. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. The author and publisher make no warranty of any kind, expressed or implied, with regard to these programs or the documentation contained in this book. The author and publisher shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing, performance, or use of these programs.

Printed in the United States of America
10 9 8 7 6 5 4

ISBN 0-201-36186-8

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Prentice-Hall (Singapore) Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro
Contents
I Introduction and Foundations
1 Introduction and Foundations
1.1 What is signal processing?
1.2 Mathematical topics embraced by signal processing
1.3 Mathematical models
1.4 Models for linear systems and signals
1.4.1 Linear discrete-time models
1.4.2 Stochastic MA and AR models
1.4.3 Continuous-time notation
1.4.4 Issues and applications
1.4.5 Identification of the modes
1.4.6 Control of the modes
1.5 Adaptive filtering
1.5.1 System identification
1.5.2 Inverse system identification
1.5.3 Adaptive predictors
1.5.4 Interference cancellation
1.6 Gaussian random variables and random processes
1.6.1 Conditional Gaussian densities
1.7 Markov and hidden Markov models
1.7.1 Markov models
1.7.2 Hidden Markov models
1.8 Some aspects of proofs
1.8.1 Proof by computation: direct proof
1.8.2 Proof by contradiction
1.8.3 Proof by induction
1.9 An application: LFSRs and Massey's algorithm
1.9.1 Issues and applications of LFSRs
1.9.2 Massey's algorithm
1.9.3 Characterization of LFSR length in Massey's algorithm
1.10 Exercises
1.11 References
II Vector Spaces and Linear Algebra
2 Signal Spaces
2.1 Metric spaces
2.1.1 Some topological terms
2.1.2 Sequences, Cauchy sequences, and completeness
2.1.3 Technicalities associated with the L_p and L_infinity spaces
2.2 Vector spaces
2.2.1 Linear combinations of vectors
2.2.2 Linear independence
2.2.3 Basis and dimension
2.2.4 Finite-dimensional vector spaces and matrix notation
2.3 Norms and normed vector spaces
2.3.1 Finite-dimensional normed linear spaces
2.4 Inner products and inner-product spaces
2.4.1 Weak convergence
2.5 Induced norms
2.6 The Cauchy-Schwarz inequality
2.7 Direction of vectors: Orthogonality
2.8 Weighted inner products
2.8.1 Expectation as an inner product
2.9 Hilbert and Banach spaces
2.10 Orthogonal subspaces
2.11 Linear transformations: Range and nullspace
2.12 Inner-sum and direct-sum spaces
2.13 Projections and orthogonal projections
2.13.1 Projection matrices
2.14 The projection theorem
2.15 Orthogonalization of vectors
2.16 Some final technicalities for infinite-dimensional spaces
2.17 Exercises
2.18 References

3 Representation and Approximation in Vector Spaces
3.1 The approximation problem in Hilbert space
3.1.1 The Grammian matrix
3.2 The orthogonality principle
3.2.1 Representations in infinite-dimensional space
3.3 Error minimization via gradients
3.4 Matrix representations of least-squares problems
3.4.1 Weighted least-squares
3.4.2 Statistical properties of the least-squares estimate
3.5 Minimum error in Hilbert-space approximations
Applications of the orthogonality theorem
3.6 Approximation by continuous polynomials
3.7 Approximation by discrete polynomials
3.8 Linear regression
3.9 Least-squares filtering
3.9.1 Least-squares prediction and AR spectrum estimation
3.10 Minimum mean-square estimation
3.11 Minimum mean-squared error (MMSE) filtering
3.12 Comparison of least squares and minimum mean squares
3.13 Frequency-domain optimal filtering
3.13.1 Brief review of stochastic processes and Laplace transforms
3.13.2 Two-sided Laplace transforms and their decompositions
3.13.3 The Wiener-Hopf equation
3.13.4 Solution to the Wiener-Hopf equation
3.13.5 Examples of Wiener filtering
3.13.6 Mean-square error
3.13.7 Discrete-time Wiener filters
3.14 A dual approximation problem
3.15 Minimum-norm solution of underdetermined equations
3.16 Iterative Reweighted LS (IRLS) for L_p optimization
3.17 Signal transformation and generalized Fourier series
3.18 Sets of complete orthogonal functions
3.18.1 Trigonometric functions
3.18.2 Orthogonal polynomials
3.18.3 Sinc functions
3.18.4 Orthogonal wavelets
3.19 Signals as points: Digital communications
3.19.1 The detection problem
3.19.2 Examples of basis functions used in digital communications
3.19.3 Detection in nonwhite noise
3.20 Exercises
3.21 References

4 Linear Operators and Matrix Inverses
4.1 Linear operators
4.1.1 Linear functionals
4.2 Operator norms
4.2.1 Bounded operators
4.2.2 The Neumann expansion
4.2.3 Matrix norms
4.3 Adjoint operators and transposes
4.3.1 A dual optimization problem
4.4 Geometry of linear equations
4.5 Four fundamental subspaces of a linear operator
4.5.1 The four fundamental subspaces with non-closed range
4.6 Some properties of matrix inverses
4.6.1 Tests for invertibility of matrices
4.7 Some results on matrix rank
4.7.1 Numeric rank
4.8 Another look at least squares
4.9 Pseudoinverses
4.10 Matrix condition number
4.11 Inverse of a small-rank adjustment
4.11.1 An application: the RLS filter
4.11.2 Two RLS applications
4.12 Inverse of a block (partitioned) matrix
4.12.1 Application: Linear models
4.13 Exercises
4.14 References
5 Some Important Matrix Factorizations
5.1 The LU factorization
5.1.1 Computing the determinant using the LU factorization
5.1.2 Computing the LU factorization
5.2 The Cholesky factorization
5.2.1 Algorithms for computing the Cholesky factorization
5.3 Unitary matrices and the QR factorization
5.3.1 Unitary matrices
5.3.2 The QR factorization
5.3.3 QR factorization and least-squares filters
5.3.4 Computing the QR factorization
5.3.5 Householder transformations
5.3.6 Algorithms for Householder transformations
5.3.7 QR factorization using Givens rotations
5.3.8 Algorithms for QR factorization using Givens rotations
5.3.9 Solving least-squares problems using Givens rotations
5.3.10 Givens rotations via CORDIC rotations
5.3.11 Recursive updates to the QR factorization
5.4 Exercises
5.5 References

6 Eigenvalues and Eigenvectors
6.1 Eigenvalues and linear systems
6.2 Linear dependence of eigenvectors
6.3 Diagonalization of a matrix
6.3.1 The Jordan form
6.3.2 Diagonalization of self-adjoint matrices
6.4 Geometry of invariant subspaces
6.5 Geometry of quadratic forms and the minimax principle
6.6 Extremal quadratic forms subject to linear constraints
6.7 The Gershgorin circle theorem
Applications of eigendecomposition methods
6.8 Karhunen-Loeve low-rank approximations and principal methods
6.8.1 Principal component methods
6.9 Eigenfilters
6.9.1 Eigenfilters for random signals
6.9.2 Eigenfilter for designed spectral response
6.9.3 Constrained eigenfilters
6.10 Signal subspace techniques
6.10.1 The signal model
6.10.2 The noise model
6.10.3 Pisarenko harmonic decomposition
6.10.4 MUSIC
6.11 Generalized eigenvalues
6.11.1 An application: ESPRIT
6.12 Characteristic and minimal polynomials
6.12.1 Matrix polynomials
6.12.2 Minimal polynomials
6.13 Moving the eigenvalues around: Introduction to linear control
6.14 Noiseless constrained channel capacity
6.15 Computation of eigenvalues and eigenvectors
6.15.1 Computing the largest and smallest eigenvalues
6.15.2 Computing the eigenvalues of a symmetric matrix
6.15.3 The QR iteration
6.16 Exercises
6.17 References
7 The Singular Value Decomposition
7.1 Theory of the SVD
7.2 Matrix structure from the SVD
7.3 Pseudoinverses and the SVD
7.4 Numerically sensitive problems
7.5 Rank-reducing approximations: Effective rank
Applications of the SVD
7.6 System identification using the SVD
7.7 Total least-squares problems
7.7.1 Geometric interpretation of the TLS solution
7.8 Partial total least squares
7.9 Rotation of subspaces
7.10 Computation of the SVD
7.11 Exercises
7.12 References

8 Some Special Matrices and Their Applications
8.1 Modal matrices and parameter estimation
8.2 Permutation matrices
8.3 Toeplitz matrices and some applications
8.3.1 Durbin's algorithm
8.3.2 Predictors and lattice filters
8.3.3 Optimal predictors and Toeplitz inverses
8.3.4 Toeplitz equations with a general right-hand side
8.4 Vandermonde matrices
8.5 Circulant matrices
8.5.1 Relations among Vandermonde, circulant, and companion matrices
8.5.2 Asymptotic equivalence of the eigenvalues of Toeplitz and circulant matrices
8.6 Triangular matrices
8.7 Properties preserved in matrix products
8.8 Exercises
8.9 References

9 Kronecker Products and the Vec Operator
9.1 The Kronecker product and Kronecker sum
9.2 Some applications of Kronecker products
9.2.1 Fast Hadamard transforms
9.2.2 DFT computation using Kronecker products
9.3 The vec operator
9.4 Exercises
9.5 References
III Detection, Estimation, and Optimal Filtering

10 Introduction to Detection and Estimation, and Mathematical Notation
10.1 Detection and estimation theory
10.1.1 Game theory and decision theory
10.1.2 Randomization
10.1.3 Special cases
10.2 Some notational conventions
10.2.1 Populations and statistics
10.3 Conditional expectation
10.4 Transformations of random variables
10.5 Sufficient statistics
10.5.1 Examples of sufficient statistics
10.5.2 Complete sufficient statistics
10.6 Exponential families
10.7 Exercises
10.8 References
11 Detection Theory
11.1 Introduction to hypothesis testing
11.2 Neyman-Pearson theory
11.2.1 Simple binary hypothesis testing
11.2.2 The Neyman-Pearson lemma
11.2.3 Application of the Neyman-Pearson lemma
11.2.4 The likelihood ratio and the receiver operating characteristic (ROC)
11.2.5 A Poisson example
11.2.6 Some Gaussian examples
11.2.7 Properties of the ROC
11.3 Neyman-Pearson testing with composite binary hypotheses
11.4 Bayes decision theory
11.4.1 The Bayes principle
11.4.2 The risk function
11.4.3 Bayes risk
11.4.4 Bayes tests of simple binary hypotheses
11.4.5 Posterior distributions
11.4.6 Detection and sufficiency
11.4.7 Summary of binary decision problems
11.5 Some M-ary problems
11.6 Maximum-likelihood detection
11.7 Approximations to detection performance: The union bound
11.8 Invariant tests
11.8.1 Detection with random (nuisance) parameters
11.9 Detection in continuous time
11.9.1 Some extensions and precautions
11.10 Minimax Bayes decisions
11.10.1 Bayes envelope function
11.10.2 Minimax rules
11.10.3 Minimax Bayes in multiple-decision problems
11.10.4 Determining the least favorable prior
11.10.5 A minimax example and the minimax theorem
11.11 Exercises
11.12 References

12 Estimation Theory
12.1 The maximum-likelihood principle
12.2 ML estimates and sufficiency
12.3 Estimation quality
12.3.1 The score function
12.3.2 The Cramer-Rao lower bound
12.3.3 Efficiency
12.3.4 Asymptotic properties of maximum-likelihood estimators
12.3.5 The multivariate normal case
12.3.6 Minimum-variance unbiased estimators
12.3.7 The linear statistical model
12.4 Applications of ML estimation
12.4.1 ARMA parameter estimation
12.4.2 Signal subspace identification
12.4.3 Phase estimation
12.5 Bayes estimation theory
12.6 Bayes risk
12.6.1 MAP estimates
12.6.2 Summary
12.6.3 Conjugate prior distributions
12.6.4 Connections with minimum mean-squared estimation
12.6.5 Bayes estimation with the Gaussian distribution
12.7 Recursive estimation
12.7.1 An example of non-Gaussian recursive Bayes
12.8 Exercises
12.9 References

13 The Kalman Filter
13.1 The state-space signal model
13.2 Kalman filter I: The Bayes approach
13.3 Kalman filter II: The innovations approach
13.3.1 Innovations for processes with linear observation models
13.3.2 Estimation using the innovations process
13.3.3 Innovations for processes with state-space models
13.3.4 A recursion for P_{t|t-1}
13.3.5 The discrete-time Kalman filter
13.3.6 Perspective
13.3.7 Comparison with the RLS adaptive filter algorithm
13.4 Numerical considerations: Square-root filters
13.5 Application in continuous-time systems
13.5.1 Conversion from continuous time to discrete time
13.5.2 A simple kinematic example
13.6 Extensions of Kalman filtering to nonlinear systems
13.7 Smoothing
13.7.1 The Rauch-Tung-Striebel fixed-interval smoother
13.8 Another approach: H-infinity smoothing
13.9 Exercises
13.10 References
IV Iterative and Recursive Methods in Signal Processing
14 Basic Concepts and Methods of Iterative Algorithms
14.1 Definitions and qualitative properties of iterated functions
14.1.1 Basic theorems of iterated functions
14.1.2 Illustration of the basic theorems
14.2 Contraction mappings
14.3 Rates of convergence for iterative algorithms
14.4 Newton's method
14.5 Steepest descent
14.5.1 Comparison and discussion: Other techniques
Some applications of basic iterative methods
14.6 LMS adaptive filtering
14.6.1 An example LMS application
14.6.2 Convergence of the LMS algorithm
14.7 Neural networks
14.7.1 The backpropagation training algorithm
14.7.2 The nonlinearity function
14.7.3 The forward-backward training algorithm
14.7.4 Adding a momentum term
14.7.5 Neural network code
14.7.6 How many neurons?
14.7.7 Pattern recognition: ML or NN?
14.8 Blind source separation
14.8.1 A bit of information theory
14.8.2 Applications to source separation
14.8.3 Implementation aspects
14.9 Exercises
14.10 References

15 Iteration by Composition of Mappings
15.1 Introduction
15.2 Alternating projections
15.2.1 An application: bandlimited reconstruction
15.3 Composite mappings
15.4 Closed mappings and the global convergence theorem
15.5 The composite mapping algorithm
15.5.1 Bandlimited reconstruction, revisited
15.5.2 An example: Positive sequence determination
15.5.3 Matrix property mappings
15.6 Projection on convex sets
15.7 Exercises
15.8 References

16 Other Iterative Algorithms
16.1 Clustering
16.1.1 An example application: Vector quantization
16.1.2 An example application: Pattern recognition
16.1.3 k-means clustering
16.1.4 Clustering using fuzzy k-means
16.2 Iterative methods for computing inverses of matrices
16.2.1 The Jacobi method
16.2.2 Gauss-Seidel iteration
16.2.3 Successive over-relaxation (SOR)
16.3 Algebraic reconstruction techniques (ART)
16.4 Conjugate-direction methods
16.5 Conjugate-gradient method
16.6 Nonquadratic problems
16.7 Exercises
16.8 References

17 The EM Algorithm in Signal Processing
17.1 An introductory example
17.2 General statement of the EM algorithm
17.3 Convergence of the EM algorithm
17.3.1 Convergence rate: Some generalizations
Example applications of the EM algorithm
17.4 Introductory example, revisited
17.5 Emission computed tomography (ECT) image reconstruction
17.6 Active noise cancellation (ANC)
17.7 Hidden Markov models
17.7.1 The E- and M-steps
17.7.2 The forward and backward probabilities
17.7.3 Discrete output densities
17.7.4 Gaussian output densities
17.7.5 Normalization
17.7.6 Algorithms for HMMs
17.8 Spread-spectrum, multiuser communication
17.9 Summary
17.10 Exercises
17.11 References

V Methods of Optimization
18 Theory of Constrained Optimization
18.1 Basic definitions
18.2 Generalization of the chain rule to composite functions
18.3 Definitions for constrained optimization
18.4 Equality constraints: Lagrange multipliers
18.4.1 Examples of equality-constrained optimization
18.5 Second-order conditions
18.6 Interpretation of the Lagrange multipliers
18.7 Complex constraints
18.8 Duality in optimization
18.9 Inequality constraints: Kuhn-Tucker conditions
18.9.1 Second-order conditions for inequality constraints
18.9.2 An extension: Fritz John conditions
18.10 Exercises
18.11 References

19 Shortest-Path Algorithms and Dynamic Programming
19.1 Definitions for graphs
19.2 Dynamic programming
19.3 The Viterbi algorithm
19.4 Code for the Viterbi algorithm
19.4.1 Related algorithms: Dijkstra's and Warshall's
19.4.2 Complexity comparisons of Viterbi and Dijkstra
Applications of path search algorithms
19.5 Maximum-likelihood sequence estimation
19.5.1 The intersymbol interference (ISI) channel
19.5.2 Code-division multiple access
19.5.3 Convolutional decoding
19.6 HMM likelihood analysis and HMM training
19.6.1 Dynamic warping
19.7 Alternatives to shortest-path algorithms
19.8 Exercises
19.9 References

20 Linear Programming
20.1 Introduction to linear programming
20.2 Putting a problem into standard form
20.2.1 Inequality constraints and slack variables
20.2.2 Free variables
20.2.3 Variable-bound constraints
20.2.4 Absolute value in the objective
20.3 Simple examples of linear programming
20.4 Computation of the linear programming solution
20.4.1 Basic variables
20.4.2 Pivoting
20.4.3 Selecting variables on which to pivot
20.4.4 The effect of pivoting on the value of the problem
20.4.5 Summary of the simplex algorithm
20.4.6 Finding the initial basic feasible solution
20.4.7 MATLAB code for linear programming
20.4.8 Matrix notation for the simplex algorithm
20.5 Dual problems
20.6 Karmarkar's algorithm for LP
20.6.1 Conversion to Karmarkar standard form
20.6.2 Convergence of the algorithm
20.6.3 Summary and extensions
Examples and applications of linear programming
20.7 Linear-phase FIR filter design
20.7.1 Least-absolute-error approximation
20.8 Linear optimal control
20.9 Exercises
20.10 References
A Basic Concepts and Definitions
A.1 Set theory and notation
A.2 Mappings and functions
A.3 Convex functions
A.4 O and o notation
A.5 Continuity
A.6 Differentiation
A.6.1 Differentiation with a single real variable
A.6.2 Partial derivatives and gradients on R^m
A.6.3 Linear approximation using the gradient
A.6.4 Taylor series
A.7 Basic constrained optimization
A.8 The Holder and Minkowski inequalities
A.9 Exercises
A.10 References

B Completing the Square
B.1 The scalar case
B.2 The matrix case
B.3 Exercises

C Basic Matrix Concepts
C.1 Notational conventions
C.2 Matrix identity and inverse
C.3 Transpose and trace
C.4 Block (partitioned) matrices
C.5 Determinants
C.5.1 Basic properties of determinants
C.5.2 Formulas for the determinant
C.5.3 Determinants and matrix inverses
C.6 Exercises
C.7 References

D Random Processes
D.1 Definitions of means and correlations
D.2 Stationarity
D.3 Power spectral-density functions
D.4 Linear systems with stochastic inputs
D.4.1 Continuous-time signals and systems
D.4.2 Discrete-time signals and systems
D.5 References

E Derivatives and Gradients
E.1 Derivatives of vectors and scalars with respect to a real vector
E.1.1 Some important gradients
E.2 Derivatives of real-valued functions of real matrices
E.3 Derivatives of matrices with respect to scalars, and vice versa
E.4 The transformation principle
E.5 Derivatives of products of matrices
E.6 Derivatives of powers of a matrix
E.7 Derivatives involving the trace
E.8 Modifications for derivatives of complex vectors and matrices
E.9 Exercises
E.10 References

F Conditional Expectations of Multinomial and Poisson r.v.s
F.1 Multinomial distributions
F.2 Poisson random variables
F.3 Exercises

Bibliography
List of Figures
. . . Input loutput relation for a transfer function
. . . Realization of the AR part of a transfer function
. . . Realization of a transfer function
Realization of a transfer function with state-variable labels
. .. . . Prediction error . . . Linear predictor as an inverse system . . . . . . PSD input and output
. . . Representation of an adaptive filter
Identification of an unknown plant . . . . . . Adapting to the inverse of an unknown plant
An adaptive predictor . . . Configuration for interference cancellation . . . The Gaussian density . . . Demonstration of the central limit theorem . . . . . . Plot of two-dimensional Gaussian distribution
A simple Markov model . . . A hidden Markov model . . . An HMM with four states . . . . . . Binary symmetric channel model
LFSR realization . . . Alternative LFSR realization . . .
. . . A binary LFSR and its output
. . . Simple feedback configuration
Illustration of the triangle inequality . . . Quantization of the vector x . . . . . . Comparison of
d.and
dzmetrics
xo
is interior. xz is exterior. and xl is neither interior nor exterior . . . . Illustration of open and closed sets . . .
. . . The function f .
( t )Illustration of Gibbs phenomenon . . . . . . A subspace of
IR3A triangle inequality interpretation . . . Unit spheres in IR2 under various l p norm . . . Chebyshev polynomials To(t) through T s ( t ) for
t E [-1.
1 ]. . .
. . . A space and its orthogonal complement
. . . Disjoint lines in R*
. . . Decomposition of x into disjoint components
Orthogonal projection finds the closest point in V to x . . .
Orthogonal projection onto the space spanned by several vectors . . .
xviii List of Figures
. . . The projection theorem
. . . The first steps of the Gram-Schmidt process
. . . Third step of the Gram-Schmidt process
. . .
The parallelogram law
. . . Functions to orthogonalize
. . . The approximation problem
. . . Approximation with one and two vectors
. . . An error surface for two variables
. . . Projection solution
. . . Statistician's Pythagorean theorem
Comparison of LS. WLS. and Taylor series approximations to e' . . . . . .
Adiscrete function and the error in its approximation
. . . Data for regression
Illustration of least-squares and weighted least-squares lines . . . . . . Least-squares equalizer example
An equalizer problem . . . . . . Contour plot of an error surface
Pole-zero plot of rational S ,
(s) ( x =poles.
o =zeros) . . .
y,as the output of a linear system driven by white noise . . .
v,
as the output of a linear system driven by
y,. . . The optimal filter as the cascade of a whitening filter and a Wiener filter with white-noise inputs . . .
. . . Minimum norm to a linear variety
. . . Magnitude response for filters designed using IRLS
Legendre polynomials
p o ( t )through
p s ( t )for
t E [-1. I ] . . . A function f
( t )and its projection onto
VOand
V-l . . .The simplest scaling and wavelet functions . . . Illustration of scaling and wavelet functions
. . .. . . Illustration of a wavelet transform
Multirate interpretation of wavelet transform . . . Illustration of the inverse wavelet transform
. . .Filtering interpretation of an inverse wavelet transform . . .
. . .Perfect reconstruction filter bank
Two basis functions. and some functions represented by using them Implementations of digital receiver processing . . . Digital receiver
processing. . . In~plementation of a matched filter rece~ver . . . PSK signal constellation and detection example . . . Illustration of concepts of various signal constellations . . . Block diagram for detection processing . . . Geometry of the operator norm
. . .Intersections of lines form solutions of systems of linear equations . Intersecting planes: (a) no solution (b) infinite number of solutions . The four fundamental subspaces of a matrix operator . . .
. . .
Operation of the pseudoinverse
Demonstration
ofan ill-conditioned linear system
. . .Condition
ofthe Hilbert matrix
. . .Condition number for a bad idea
. . .. . .
RLS adaptive eq ualiztr
L i s t of Figures xix
. . . Illustration of RLS equalizer performance
System identification using the RLS adaptive filter . . . Illustration of system identification using the RLS filter . . .
. . . The Householder transformation of a vector
Zeroing elements of a vector by a Householder transformation . . . . Two-dimensional rotation . . . The direction of eigenvectors is not modified by A . . .
. . . The geometry of quadratic forms
Level curves for a Gaussian distribution . . . The maximum principle . . . Illustration of Gershgorin disks . . .
. . . Scatter data for principal component analysis
Noisy signal to be filtered using an eigenfilter h . . . Magnitude response specifications for a lowpass filter . . . Eigenfilter response . . . Response of a constrained eigenfilter . . . . . . The MUSIC spectrum for example 6.10.2.
Plant with reference input and feedback control . . . State diagram for a constrained channel . . . Direct and indirect transmission through a noisy channel . . . Expansion and interpolation using multirate processing . . . Transformation from a general matrix to first companion form . . . . Illustration of the sensitive direction . . . Comparison of least-squares and total least-squares fit . . .
. . . PTLS linear parameter identification
. . . A data set rotated relative to another data set
. . . The first two stages of a lattice prediction filter
. . . The kth stage of a lattice filter
Comparison of S(o) and the eigenvalues of
R, forn
=30
and
n =100 . . . 4-point fast Hadamard transform . . . . . . 6-point DFT using Kronecker decomposition
. . . Loss function (or matrix) for "odd or even" game
. . . Elements of the statistical decision game
. . . A simple binary communications channel
A
typical payoff matrix for the Prisoner's Dilemma game . . . . . . Illustration of threshold for Neyman-Pearson test
. . . Scalar Gaussian detection of the mean
Error probabilities for Gaussian variables with different
. . . means and equal variances
. . . ROC for Gaussian detection
Test for vector Gaussian random variables with different means . . .
. . .
Probability of error for BPSK signaling
XX List of Figures
. . . An orthogonal and antipodal binary signal constellation
ROC: normal variables with equal means and unequal variances . . . . . Demonstration of the concave property of the ROC
. . . Illustration of even-odd observations
. . . Risk function for statistical odd or even game
. . . A binary channel
. . . Risk function for binary channel
. . . Loss function
. . . Bayes risk for a decision
Geometry of the decision space for multivariate Gaussian detection Decision boundaries for a quaternary decision problem . . . . . . Venn diagram for the union of two sets
. . . Bound on the probability of error for PSK signaling
A
test biased by y c . . . . . . Channel gain and rotation
. . . Incoherent binary detector
. . . Probability of error for BPSK
A projection approach to signal detection . . . Bayes envelope function . . . Bayes envelope function: normal variables with unequal
means and equal variances . . . Bayes envelope function for example 1 1.4.5 . . .
. . . Bayes envelope for binary channel
. . . Geometrical interpretation of the risk set
Geometrical interpretation of the minimax rule . . . The risk set and its relation to the Neyman-Pearson test
. . .Risk function for statistical odd or even game . . . Risk set for odd or even game . . . Risk set for the binary channel . . . . . . Regions for bounding the Q function
Channel with Laplacian noise and decision region . . . Some signal constellations . . . . . . Signal constellation with three points
Empiric distribution function . . . Explicitly computing the estimate of the phase
. . .A phase-locked loop . . . Illustration of the update and propagate steps in
. .sequential estrmation . . . Acoustic level framework . . . Equivalent representations for the Gaussian estimation problem
. . .Illustration of Kalman filter
. . .Illustration of an orbit of a function with an attractive fixed point . . Illustration of an orbit of a function with a repelling fixed point . . . Examples of dynamical behavior on the quadratic logistic map
. . . . . . .Illustration of g(x)
= f ( f(x)) when
ii =3.2
Iterations of an affine transformation . acting on a square
. . . . . .Illustration of Newton'., method
List of Figures xxi
Contour plots of Rosenbrock's function and Newton's method
A function with local and global minima
Convergence of steepest descent on a quadratic function
Error components in principal coordinates for steepest descent
Error in the LMS algorithm for µ = 0.075 and µ = 0.0075, compared with the RLS algorithm, for an adaptive equalizer problem
Optimal equalizer coefficients and adaptive equalizer coefficients
Representation of the layers of an artificial neural network
An artificial neuron
Notation for a multilayer neural network
The sigmoidal nonlinearity
Pattern-recognition problem for a neural network
Desired output (solid line) and neural network output (dashed line)
Effect of convergence rate on µ and α
The blind source-separation problem
The binary entropy function H(p)
Illustration of a projection on a set
Projection on convex sets in two dimensions
Results of the bandlimited reconstruction algorithm
Property sets in X and their intersection P
Illustration of the composition of point-to-set mappings
Projection onto a non-convex set
Producing a positive sequence from the Hamming window
Results from the application of a composite mapping algorithm to sinusoidal data
Geometric properties of convex sets
Projection onto two convex sets
Demonstration of clustering
Clusters for a pattern recognition problem
Illustration of iterative inverse computation
Residual error in the ART algorithm as a function of iteration
Convergence of conjugate gradient on a quadratic function
An overview of the EM algorithm
Illustration of a many-to-one mapping from X to Y
Representation of emission tomography
Detector arrangement for tomographic reconstruction example
Example emission tomography reconstruction
Single-microphone ANC system
Processor block diagram of the ANC system
log P(y | θ[m]) for an HMM
Representation of signals in an SSMA system
Multiple-access receiver matched-filter bank
Examples of minimizing points
Contours of f(x1, x2), showing minimum and constrained minimum
Relationships between variables in composite functions
Illustration of functional dependencies
Surface and contour plots of f(x1, x2)
Tangent plane to a surface
Curves on a surface
Minimizing the distance to an ellipse
The projection of Ly into P to form Lp
Duality: the nearest point to K is the maximum distance to a separating hyperplane
The dual function g(λ)
Saddle surface for minimax optimization
Illustration of the Kuhn-Tucker condition in a single dimension
Illustration of "waterfilling" solution
Graph examples
A multistage graph
A trellis diagram
State machine corresponding to a trellis
State-machine output observed after passing through a noisy channel
Steps in the Viterbi algorithm
A trellis with irregular branches
A trellis with multiple outputs
MLSE detection in ISI
Trellis diagram and detector structure for ISI detection
CDMA signal model
CDMA detection
Convolutional coding
Comparing HMM training algorithms
Illustration of the warping alignment process
Probability of failure of network links
20.1 A linear programming problem
20.2 Illustration of Karmarkar's algorithm
20.3 Filter design constraints
20.4 Frequency and impulse response of a filter designed using linear programming (n = 45 coefficients)
A.1 Illustration of convex and nonconvex sets
A.2 Indicator functions for some simple sets
A.3 Illustration of a convex function
A.4 Illustration of the definition of continuity
A.5 A constrained optimization problem
A.6 The indicator function for a fuzzy number "near 10"
A.7 The set sum
List of Algorithms
Massey's algorithm (pseudocode)
Massey's algorithm
Gram-Schmidt algorithm (QR factorization)
Least-squares filter computation
Forward-backward linear predictor estimate
Two-tap channel equalizer
Iterative reweighted least-squares
Filter design using IRLS
Some wavelet coefficients
Demonstration of wavelet decomposition
Demonstration of wavelet decomposition (alternative indexing)
Nonperiodic wavelet transform
Nonperiodic inverse wavelet transform
Periodic wavelet transform
Inverse periodic wavelet transform
The RLS algorithm
The RLS algorithm (MATLAB implementation)
LU factorization
Cholesky factorization
Householder transformation functions
QR factorization via Householder transformations
Computation of Qb
Computation of Q from V
Finding cos θ and sin θ for a Givens rotation
QR factorization using Givens rotations
Computation of Q^H b for the Givens rotation factorization
Computation of Q from θ
Eigenfilter design
Constrained eigenfilter design
Pisarenko harmonic decomposition
Computation of the MUSIC spectrum
Computation of the frequency spectrum of a signal using ESPRIT
Computation of the largest eigenvalue using the power method
Computation of the smallest eigenvalue using the power method
Tridiagonalization of a real symmetric matrix
Implicit QR shift
Complete eigenvalue/eigenvector function
System identification using SVD
Total least squares
Partial total least squares, part 1
Partial total least squares, part 2
Computing the SVD
Durbin's algorithm
Conversion of lattice FIR to direct form
Conversion of direct-form FIR to lattice
Levinson's algorithm
Example Bayes minimax calculations
Maximum-likelihood ARMA estimation
Kalman filter
Kalman filter example
Logistic function orbit
LMS adaptive filter
Neural network forward-propagation algorithm
Neural network backpropagation training algorithm
Neural network test example
Blind source separation test
Bandlimited reconstruction using alternating projections
Mapping to a positive sequence
Mapping to the nearest stochastic matrix
Mapping to a Hankel matrix of given rank
Mapping to a Toeplitz/Hankel matrix stack of given rank
k-means clustering (LBG algorithm)
Jacobi iteration
Gauss-Seidel iteration
Successive over-relaxation
Algebraic reconstruction technique
Conjugate-gradient solution of a symmetric linear equation
Conjugate-gradient solution for unconstrained minimization
EM algorithm example computations
Simulation and reconstruction of emission tomography
Overview of HMM data structures and functions
HMM likelihood computation functions
HMM model update functions
HMM generation functions
A constrained optimization of a racing problem
Forward dynamic programming
The Viterbi algorithm
Initializing the Viterbi algorithm
Flushing the shortest path in the VA
Dijkstra's shortest-path algorithm
Warshall's transitive closure algorithm
Norm and initialization for Viterbi HMM computations
Best-path likelihood for the HMM
HMM training using Viterbi methods
Use of the Viterbi methods with HMMs
Warping code
20.1 The simplex algorithm for linear programming 834
20.2 Tableau pivoting for the simplex algorithm 834
20.3 Elimination and backsubstitution of free variables for linear programming 834
20.4 Karmarkar's algorithm for linear programming 842
20.5 Conversion of standard form to Karmarkar standard form 844
20.6 Optimal filter design using linear programming 847
List of Boxes
Box 1.1 Notation for complex quantities 7
Box 1.2 Notation for vectors 9
Box 1.3 Notation for random variables and vectors 31
Box 1.4 Groups, rings, and fields 49
Box 1.5 GF(2) 50
Box 2.1 Sup and inf 74
Box 2.2 The measure of a set 82
Box 2.3 David Hilbert (1862-1943) 107
Box 2.4 Isomorphism 112
Box 3.1 Positive-definite matrices 134
Box 4.1 James H. Wilkinson (1919-1986) 254
Box 5.1 Carl Friedrich Gauss (1777-1855) 278
Box 6.1 Arg max and arg min 326
Box 7.1 Commutative diagrams 375
Box 11.1 The Q function 472
Box 11.2 The Φ function 478
Box 11.3 The t distribution 507
Box 11.4 The function I₀(x) 510
Box 12.1 The β distribution 575
Box 12.2 The Γ distribution 576
Box 14.1 Isaac Newton (1642-1727) 633
Preface
Rationale
The purpose of this book is to bridge the gap between introductory signal processing classes and the mathematics prevalent in contemporary signal processing research and practice, by providing a unified applied treatment of fundamental mathematics, seasoned with demonstrations using MATLAB®. This book is intended not only for current students of signal processing, but also for practicing engineers who must be able to access the signal processing research literature, and for researchers looking for a particular result to apply. It is thus intended both as a textbook and as a reference.
Both the theory and the practice of signal processing contribute to and draw from a variety of disciplines: controls, communications, system identification, information theory, artificial intelligence, spectroscopy, pattern recognition, tomography, image analysis, and data acquisition, among others. To fulfill its role in these diverse areas, signal processing employs a variety of mathematical tools, including transform theory, probability, optimization, detection theory, estimation theory, numerical analysis, linear algebra, functional analysis, and many others. The practitioner of signal processing, the "signal processor," may use several of these tools in the solution of a problem; for example, setting up a signal reconstruction algorithm, and then optimizing the parameters of the algorithm for optimum performance. Practicing signal processors must have knowledge of both the theory and the implementation of the mathematics: how and why it works, and how to make the computer do it. The breadth of mathematics employed in signal processing, coupled with the opportunity to apply that math to problems of engineering interest, makes the field both interesting and rewarding.
The mathematical aspects of signal processing also introduce some of its major challenges: how is a student or engineering practitioner to become versed in such a variety of mathematical techniques while still keeping an eye toward applications? Introductory texts on signal processing tend to focus heavily on transform techniques and filter-based applications. While this is an essential part of the training of a signal processor, it is only the tip of the iceberg of material required by a practicing engineer. On the other hand, more advanced texts typically develop mathematical tools that are specific to a narrow aspect of signal processing, while perhaps missing connections between these ideas and related areas of research.
Neither of these approaches provides sufficient background to read and understand broadly in the signal processing research literature, nor do they equip the student with many signal processing tools.
The signal processing literature has moved steadily toward increasing sophistication: applications of the singular value decomposition (SVD) and wavelet transforms abound; everyone knows something about these by now, or should! Part of this move toward sophistication is fueled by computer capabilities, since computations
that formerly required considerable effort and understanding are now embodied in convenient mathematical packages. A naive view might hold that this automation threatens the expertise of the engineer: Why hire a specialist to do what anyone can do in ten minutes with a MATLAB toolbox? Viewed more positively, the power of the computer provides a variety of new opportunities, as engineers are freed from computational drudgery to pursue new applications. Computer software provides platforms upon which innovative ideas may be developed with ever greater ease.
Taking advantage of this new freedom to develop useful concepts will require a solid understanding of mathematics, both to appreciate what is in the toolboxes and to extend beyond their limits. This book is intended to provide a foundation in the requisite mathematics.
We assume that students using this text have had a course in traditional transform-based digital signal processing at the senior or first-year graduate level, and a traditional course in stochastic processes. Though basic concepts in these areas are reviewed, this book does not supplant the more focused coverage that these courses provide.
Features
*
Vector-space geometry, which puts least-squares and minimum mean-squares in the same framework, and the concept of signals as vectors in an appropriate vector space, are both emphasized. This vector-space approach provides a natural framework for topics such as wavelet transforms and digital communications, as well as the traditional topics of optimum prediction, filtering, and estimation. In this context, the more general notion of metric spaces is introduced, with a discussion of signal norms.
*
The linear algebra used in signal processing is thoroughly described, both in concept and in numerical implementation. While software libraries are commonly available to perform linear algebra computations, we feel that the numerical techniques presented in this book exercise student intuition regarding the geometry of vector spaces, and build understanding of the issues that must be addressed in practical problems.
The presentation includes a thorough discussion of eigen-based methods of computation, including eigenfilters, MUSIC, and ESPRIT; there is also a chapter devoted to the properties and applications of the SVD. Toeplitz matrices, which appear throughout the signal processing literature, are treated both from a numerical point of view, as an example of recursive algorithms, and in conjunction with the lattice-filtering interpretation.
The matrices in linear algebra are viewed as operators; thus, the important concept of an operator is introduced. Associated notions, such as the range, nullspace, and norm of an operator, are also presented. While a full coverage of operator theory is not provided, there is a strong foundation that can serve to build insight into other operators.
*
In addition to linear algebraic concepts, there is a discussion of computation. Algorithms are presented for computing the common factorizations, eigenvalues, eigenvectors, SVDs, and many other problems, with some numerical consideration for implementation. Not all of this material is necessarily intended for classroom use in a conventional signal processing course; there will not be sufficient time in most cases. Nonetheless, it provides an important perspective for prospective practitioners, and a starting point for implementations on other platforms. Instructors may choose to emphasize certain numeric concepts because they highlight particular topics, such as the geometry of vector spaces.
The Cauchy-Schwarz inequality is used in a variety of places as an optimizing principle.
*
Recursive least-squares and least-mean-squares adaptive filters are presented as natural outgrowths of more fundamental concepts: matrix inverse updates and steepest descent. Neural networks and blind source separation are also presented as applications of steepest descent.
Several chapters are devoted to iterative and recursive methods. Though iterative methods are of great theoretical and practical significance, no other signal processing textbook provides a similar breadth of coverage. Methods presented include projection on convex sets, composite mapping, the EM algorithm, conjugate gradient, and iterative computation of matrix inverses.
*
Detection and estimation are presented with several applications, including spectrum estimation, phase estimation, and multidimensional digital communications.
Optimization is a key concept in signal processing, and examples of optimization, both unconstrained and constrained, appear throughout the text. Both a theoretical justification for Lagrange multiplier methods and a physical interpretation are explicitly spelled out in a chapter on optimization. A separate chapter discusses linear programming and its applications. Optimizations on graphs (shortest-path problems) are also examined, with a variety of applications in communications and signal processing.
The EM algorithm as presented here is the only treatment in a signal processing textbook that we are aware of. This powerful algorithm is used for many otherwise intractable estimation and learning problems.
In general, the presentation is at a more formal level than in many recent digital signal processing texts, following a "theorem/proof" format throughout. At the same time, it is less formal than many math texts covering the same material. In this, we have attempted to help the student become comfortable with rigorous thinking, without overwhelming them with technicalities. (A brief review of methods of proof is also provided to help students develop a sense of how to approach the proofs.) Ultimately, the aim of this book is to teach its reader how to think about problems. To this end, some material is covered more than once, from different perspectives (e.g., with more than one proof for certain results), to demonstrate that there is usually more than one way to approach a problem.
Throughout the text, the intent has been to explain the "what" and the "why"
of the mathematics, but not become overwrought with some of the more technical
mathematical preoccupations. In this regard, the book does not always thoroughly
treat questions of "how well." (For example, in our coverage of linear numerical
analysis, the perturbation analysis that characterizes much of the research literature
has been largely ignored. Nor do issues of computational complexity form a major
consideration.) To visualize this approach, consider an automotive analogy: Our
intent is to "get under the hood" to a sufficient degree that
it is clear why the engine
runs and what it can do, but not to provide a molecular-level description of the metallurgical structure of the piston rings. Such fine-grained investigations might be a necessary part of research into fine-tuning the performance of the engine (or the algorithm), but are not appropriate for a student learning the basic mechanics.
Throughout the chapters and in the appendices, there is a great deal of material that will be of reference value to practicing engineers. For example, there are facts regarding matrix rank, the invertibility of matrices, properties of Hermitian matrices, properties of structured matrices preserved under multiplication, and an extensive table of gradients. Not all of this material is necessarily intended for classroom use, but is provided to enhance the value of the book as a reference. Nevertheless, where such reference material is provided, it is usually accompanied by an explanation of its derivation, so that related facts may often be derived by the reader.
Though this book does not provide the final word in any research area, for many research paths it will at least provide a good first step. The contents of the book have been selected according to a variety of criteria. The primary criterion was whether material has been of use or interest to us in our research; questions from students, the need to find clear explanations, and exceptional writings found in other textbooks and papers have also been determining factors. Some of the material has been included for its practicality, and some for its outstanding beauty.
In the ongoing debate regarding the teaching of mathematics to engineers, recent proposals suggest using "just in time" mathematics: provide the mathematical concept only when the need for
it arises in the solution of an engineering problem.
This approach has arisen as a response to the charge that mathematical pedagogy has been motivated by a "just in case" approach: we'll teach you all this stuff just in case you ever happen to need it. In reality, these approaches are neither fully desirable nor achievable, potentially lacking rigor and depth on the one hand, and motivation and insight on the other. As an alternative, we hope that the presentation in this book is "justified," so that the level of mathematics is suited to its application, and the applications are seen in conjunction with the concepts.
Programs
The algorithms found throughout the text, written in MATLAB, allow the reader to see how the concepts developed in the text might be implemented, allow easy exploration of the concepts (and, sometimes, of the limitations of the theory), and provide a useful library of core functionality for a variety of signal processing research applications. With thorough theoretical and applied discussion surrounding each algorithm, this is not simply a book of recipes; raw ingredients are provided to stir up some interesting stews!
In most cases, the algorithms themselves have not been presented in the text. Instead, an icon (as shown below) is used to indicate that an algorithm is to be found on the included CD-ROM (in some instances the algorithm consists of several related files).
In the interest of brevity, type-checking of arguments has not been incorporated into the functions. Otherwise, we believe that all of the code provided works, at least to produce the examples described in the book. Of course, information regarding program bugs, fixes, and improvements is always welcome. Nevertheless, we are required to make the standard disclaimer of warranty, which can be found on the last page of the book.
Readers are free to use the programs or any derivatives of them for any scientific purpose, with appropriate citation of this book. Updated versions of the programs, and other information, can be found at the following website:
www.prenhall.com/moon
Exercises
The exercises found at the end of each chapter are loosely divided into sections, but it may be necessary to draw from material in other sections (or even other chapters) in order to solve some of the problems.
There are relatively few merely numerical exercises. With the computer performing automated computations in many cases, simply running numbers doesn't provide an informative exercise. Readers are encouraged, of course, to play around with the algorithms to get a sense of how they work. Insight frequently can be gained on some difficult problems by trying several related numerical approaches.
The intent of the exercises is to engage the reader in the development of the theory in the book. Many of the exercises require derivations of results presented in the chapters, or proofs of some of the lemmas and theorems; other exercises require programming an extension or modification of a MATLAB algorithm presented in the chapter; and still others lead the student through a step-by-step process leading to some significant result (for example, a derivation of Gaussian quadrature or linear prediction theory, extension of inverses of Toeplitz matrices, or another derivation of the Kalman filter). As students work through these exercises, they should develop skill in organizing their thinking (which can help them to approach other problems) as well as acquire background in a variety of important topics.
Most of the exercises require a fair degree of insight and effort to solve; students should plan on being challenged. Wherever possible, students are encouraged to interact with the computer for computational assistance, insight, and feedback.
A solutions manual is available to instructors who have adopted the book for classroom use. Not only are solutions provided but, in many cases, MATLAB and
MATHEMATICA