ARKIV FÖR MATEMATIK Band 3 nr 40

Communicated 24 April 1957 by LENNART CARLESON

Metric criteria of normality for complex matrices of order less than 5

By EDGAR ASPLUND (1)
I. Introduction
We denote a (finite-dimensional) complex Hilbert space by $F$. Its elements (vectors) are denoted $f, g$, and the scalar product of $f, g \in F$ is written $(f, g)$. The norm of $f \in F$ is $(f, f)^{1/2} = \|f\|$. Elements (matrices) of the algebra $B(F)$ of endomorphisms on $F$ are denoted by capital letters other than $B$ and $F$. The norm of $A \in B(F)$ is defined by $\|A\| = \sup \|Af\| \cdot \|f\|^{-1}$. The adjoint $A^*$ of $A \in B(F)$ is defined by
$$(Af, g) = (f, A^*g)$$
for all $f, g \in F$. An element $A$ of $B(F)$ is called normal if it commutes with its adjoint: $A^*A = AA^*$.
As is well known, $A \in B(F)$ is normal if and only if it can be written as a sum
$$A = \sum_{k=1}^{m} \lambda_k E_k, \qquad (I.1)$$
where the $\lambda_k$ are complex scalars and the $E_k \in B(F)$ satisfy the conditions
$$\sum_{k=1}^{m} E_k = I; \qquad E_j E_k = 0, \quad j \neq k; \qquad E_k = E_k^* = E_k^2. \qquad (I.2)$$
The set $\mathrm{sp}\,A = \{\lambda_k \mid E_k \neq 0\}$ is called the spectrum of $A$. From eqs. (I.1) and (I.2) it is easy to conclude that for all polynomials $p(t)$ in one variable $t$ with complex coefficients one has
$$\|p(A)\| = \max_{\lambda \in \mathrm{sp}\,A} |p(\lambda)|. \qquad (I.3)$$
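Eq. (I.3) is easy to check numerically. The following NumPy sketch is not part of the original text; the particular eigenvalues and the polynomial are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a normal matrix A = U diag(eigs) U* with a random unitary U.
eigs = np.array([2.0, -1.0 + 1.0j, 3.0j])
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
A = Q @ np.diag(eigs) @ Q.conj().T
assert np.allclose(A.conj().T @ A, A @ A.conj().T)  # A is normal

# An arbitrary polynomial p(t) = t^2 + (1 + 2j) t - 3, applied to A and to scalars.
p_mat = A @ A + (1 + 2j) * A - 3 * np.eye(3)
p_scal = eigs**2 + (1 + 2j) * eigs - 3

# Eq. (I.3): the operator norm of p(A) is the largest |p(lambda)| on sp A.
assert np.isclose(np.linalg.norm(p_mat, 2), np.abs(p_scal).max())
```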
According to a theorem of v. Neumann [1], the following converse of (I.3) holds true. If $\Gamma$ is a finite subset of the complex plane and
(1) This article was written at the Department of Mathematics, Royal Institute of Technology, Stockholm, while the author was holding a scholarship from the State Council for Technical Research.
$$\|p(A)\| = \max_{\lambda \in \Gamma} |p(\lambda)| \qquad (I.4)$$
for all polynomials $p(t)$ in one variable, then $A$ is normal and $\mathrm{sp}\,A \subset \Gamma$.
We call eq. (I.4) a metric criterion of normality. The following criterion
$$\|(p(A))^2\| = \|p(A)\|^2 \quad \text{for all polynomials } p(t) \qquad (I.5)$$
is equivalent (1) with (I.4), because (I.5) implies
$$\|p(A)\| = \lim_{n \to \infty} \|(p(A))^n\|^{1/n} = \max_{\mu \in \mathrm{sp}(p(A))} |\mu| = \max_{\lambda \in \mathrm{sp}\,A} |p(\lambda)|$$
by theorems of Gelfand and Dunford. Actually, as every polynomial in $A$ can be replaced by its residue modulo the minimal polynomial of $A$, a sufficient condition that $A$ shall be normal is that (I.5) shall hold for every polynomial $p(t)$ of degree less than that of the minimal polynomial of $A$.
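The Gelfand limit invoked here can be watched numerically. In this sketch (an illustration with a particular non-normal matrix, not taken from the paper) $\|M^n\|^{1/n}$ descends toward the spectral radius even though $\|M\|$ itself is much larger.

```python
import numpy as np

# A non-normal matrix with spectrum {1} but operator norm about 10.
M = np.array([[1.0, 10.0], [0.0, 1.0]])

def gelfand(n):
    # n-th root of the operator norm of M^n
    return np.linalg.norm(np.linalg.matrix_power(M, n), 2) ** (1.0 / n)

# ||M|| is about 10, but ||M^n||^(1/n) tends to the spectral radius 1.
assert gelfand(1) > 10
assert gelfand(1000) < 1.1
```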
We will be mainly concerned in this article with a weakened form of (I.5), namely
$$\|A_\lambda^2\| = \|A_\lambda\|^2, \qquad A_\lambda = A - \lambda I, \qquad (I.6)$$
for all complex $\lambda$. It turns out that (I.6) implies normality only if $\dim F \leq 4$.
Moyls and Marcus [2] have given another criterion of normality, whose domain of applicability coincides with that of eq. (I.6). If
$$W(A) = \{\lambda \mid \lambda = (Af, f)(f, f)^{-1},\ f \in F\}$$
is the range of values of $A$, the condition of Moyls and Marcus reads: $W(A)$ is equal to the convex hull of $\mathrm{sp}\,A$. They prove that this condition implies that $A$ is normal if $\dim F \leq 4$ by representing $A$ as a triangular matrix (Schur's lemma). We give here in the last section another proof of their theorem which uses no special representation for $A$.
II. A characterization of normal matrices for dim F ≤ 4

II.1. Introductory remarks
According to the previous section, the condition
$$\|A_\lambda^2\| = \|A_\lambda\|^2, \qquad A_\lambda = A - \lambda I, \quad \text{for all complex } \lambda, \qquad (II.1.1)$$
would imply that $A$ is normal if $\dim F = 2$. Actually, the validity of eq. (II.1.1) as a criterion of normality for $A$ reaches further. It is valid for $\dim F = 4$, and if $\dim F$ equals 2 or 3, it is possible to restrict the variation of $\lambda$ and still have a sufficient condition that $A$ shall be normal. Thus, if $\dim F = 2$ and eq. (II.1.1) holds for one complex value $\lambda$ only, then $A$ is normal, and the same

(1) This equivalence was pointed out to us by Vidar Thomée.
conclusion holds if $\dim F = 3$ and eq. (II.1.1) is valid for all values $\lambda$ on some straight line in the complex plane.
To prove this we need two auxiliary theorems.
Theorem 1. The following two statements are equivalent:

1. $\|A^2\| = \|A\|^2$.
2. There is a vector $f \in F$, $f \neq 0$, such that $A^*Af = AA^*f = \|A\|^2 f$.
Proof. Suppose that statement 1 is true. As $\dim F < \infty$, there is at least one vector $g \in F$, $g \neq 0$, that satisfies
$$\|A^2 g\| = \|A\|^2 \|g\|. \qquad (II.1.2)$$
From (II.1.2) one obtains
$$\|A\|^2 \|g\| = \|A^2 g\| \leq \|A\| \|Ag\| \leq \|A\|^2 \|g\|. \qquad (II.1.3)$$
Obviously, equality must hold throughout in eq. (II.1.3). Thus the equation
$$\|Af\| = \|A\| \|f\| \qquad (II.1.4)$$
is satisfied by both $f = g$ and $f = Ag$. However, a necessary (and also sufficient) condition for $f \in F$ to satisfy eq. (II.1.4) is $A^*Af = \|A\|^2 f$. Using this fact, we verify statement 2 with $f = Ag$.

Conversely, 2 implies 1. For let $f \in F$ satisfy $A^*Af = AA^*f = \|A\|^2 f$. Then, if one puts $A^*f = g$,
$$\|A^2 g\| = (A^2 A^* f, A^2 A^* f)^{1/2} = (Af, Af)^{1/2} \|A\|^2 = \|A\|^2 (A^*Af, f)^{1/2} = \|A\|^2 (AA^*f, f)^{1/2} = \|A\|^2 (A^*f, A^*f)^{1/2} = \|A\|^2 \|g\|.$$
This proves theorem 1.
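Theorem 1 is easy to probe numerically. In this sketch (the matrices are illustrative choices, not from the text) statement 1 holds for a diagonal matrix, whose witness $f$ in statement 2 is a standard basis vector, while both statements fail for a nilpotent Jordan block.

```python
import numpy as np

def opnorm(M):
    return np.linalg.norm(M, 2)  # operator (spectral) norm

# A diagonal (hence normal) matrix: statement 1 holds ...
A = np.diag([3.0 + 0j, 1j, -2.0])
assert np.isclose(opnorm(A @ A), opnorm(A) ** 2)

# ... and the witness of statement 2 is the basis vector of the top eigenvalue.
f = np.array([1.0, 0.0, 0.0], dtype=complex)
assert np.allclose(A.conj().T @ A @ f, opnorm(A) ** 2 * f)
assert np.allclose(A @ A.conj().T @ f, opnorm(A) ** 2 * f)

# For the nilpotent Jordan block both fail: ||J^2|| = 0 < 1 = ||J||^2.
J = np.array([[0.0, 1.0], [0.0, 0.0]])
assert opnorm(J @ J) < opnorm(J) ** 2
```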
T h e o r e m 2. A n y vector it~ which satisfies A~' Aa/~ = A~ A~' it~ = II A~II ~ l is in the null space oit A* A - A A*. To prove A normal, one need only exhibit (dim F - 1) line- arly independent vectors lying in the null space oit A * A - A A * .
Prooit. I f A% A~ fa = A~ A% it~- II A,~ I] 2/a, a simple c o m p u t a t i o n shows: t h a t ( A * A - A A * ) f ~ = O . The trace of A A - A A * is, however, zero. Thus if A * A - A A* has zero as a (dim F - 1 ) - t u p l e eigenvalue, the remaining eigenvalue m u s t be zero too. This coricludes the proof of theorem 2.
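The trace identity behind theorem 2 — $\operatorname{tr}(A^*A - AA^*) = 0$ for every matrix — can be seen in a short check (the matrix below is an arbitrary non-normal example, not from the text):

```python
import numpy as np

# For any A, tr(A*A) = tr(AA*) = sum of |a_jk|^2, so C = A*A - AA* is a
# traceless self-adjoint matrix; once all but one of its eigenvalues vanish,
# the last one must vanish too, which is the point of theorem 2.
A = np.array([[1.0, 2.0], [3.0, 4.0 + 1.0j]])
C = A.conj().T @ A - A @ A.conj().T
assert np.allclose(C, C.conj().T)   # self-adjoint
assert abs(np.trace(C)) < 1e-12     # traceless
assert not np.allclose(C, 0)        # this particular A is not normal
```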
II.2. The main theorem

We are now ready to prove our main theorem.
Theorem 3. If $\Lambda$ is a subset of the complex plane and
$$\|A_\lambda^2\| = \|A_\lambda\|^2, \qquad A_\lambda = A - \lambda I,$$
for all $\lambda \in \Lambda$, then $A \in B(F)$ is normal

1. trivially, if $\dim F = 1$;
2. if $\dim F = 2$ and $\Lambda$ is any point;
3. if $\dim F = 3$ and $\Lambda$ is any straight line;
4. if $\dim F = 4$ and $\Lambda$ is the whole complex plane.
Proof. Statement 2 is proved directly by using theorem 1 and theorem 2. To prove statement 3 we have to exhibit two linearly independent vectors $f_\lambda$. It turns out that this may be accomplished by using vectors $f_\omega$ belonging to infinite values of $\lambda$. These are defined in the following way. Suppose $f_\lambda$, $\|f_\lambda\| = 1$, satisfies $A_\lambda^* A_\lambda f_\lambda = A_\lambda A_\lambda^* f_\lambda = \|A_\lambda\|^2 f_\lambda$, i.e.
$$A^*A f_\lambda - (\bar\lambda A f_\lambda + \lambda A^* f_\lambda) + |\lambda|^2 f_\lambda = \|A_\lambda\|^2 f_\lambda,$$
$$AA^* f_\lambda - (\bar\lambda A f_\lambda + \lambda A^* f_\lambda) + |\lambda|^2 f_\lambda = \|A_\lambda\|^2 f_\lambda. \qquad (II.2.1)$$
We rewrite eqs. (II.2.1) in the following way, using the abbreviations
$$\lambda\,|\lambda|^{-1} = \omega, \qquad \bar\omega A + \omega A^* = A_\omega:$$
$$A_\omega f_\lambda - |\lambda|^{-1} A^*A f_\lambda = |\lambda|^{-1}(|\lambda|^2 - \|A_\lambda\|^2) f_\lambda,$$
$$A_\omega f_\lambda - |\lambda|^{-1} AA^* f_\lambda = |\lambda|^{-1}(|\lambda|^2 - \|A_\lambda\|^2) f_\lambda. \qquad (II.2.2)$$
Taking the difference of the two eqs. (II.2.1) we get $(A^*A - AA^*) f_\lambda = 0$.
If now $\lambda$ tends to infinity in such a way that $\omega$ approaches a limit, it is possible to pick out a convergent sequence $f_{\lambda_n}$, whose limit $f_\omega$ is an eigenvector of $A_\omega$:
$$A_\omega f_\omega = m_\omega f_\omega, \qquad (II.2.3)$$
and which also by continuity has the property
$$(A^*A - AA^*) f_\omega = 0. \qquad (II.2.4)$$
Moreover, $m_\omega$ is the smallest eigenvalue of the self-adjoint matrix $A_\omega$. We demonstrate this by proving that
$$A_\omega - (m_\omega - \varepsilon)I$$
is a positive self-adjoint matrix for an arbitrary positive $\varepsilon$ (the matrix $A$ is said to be positive if it is self-adjoint and $(Af, f) \geq 0$ for every vector $f$). Namely, this matrix is the sum of the three matrices
$$A_\omega - |\lambda|^{-1} A^*A - |\lambda|^{-1}(|\lambda|^2 - \|A_\lambda\|^2) I, \qquad |\lambda|^{-1} A^*A, \qquad \bigl(|\lambda|^{-1}(|\lambda|^2 - \|A_\lambda\|^2) - m_\omega + \varepsilon\bigr) I,$$
the first of which is positive by the definition of $\|A_\lambda\|$, and the second trivially. The third will be positive for all sufficiently large $\lambda = \lambda_n$ corresponding to the convergent sequence $f_{\lambda_n}$.
We are thus able to obtain two eigenvectors $f_\omega$ and $f_{-\omega}$, corresponding to the eigenvalues $m_\omega$ and $m_{-\omega}$ of $A_\omega$ and $A_{-\omega} = -A_\omega$ respectively. But $f_{-\omega}$, belonging to the smallest eigenvalue $m_{-\omega}$ of $-A_\omega$, is obviously an eigenvector belonging to the largest eigenvalue $M_\omega = -m_{-\omega}$ of $A_\omega$. If $m_\omega \neq M_\omega$, then for $\dim F = 3$ we have satisfied the requirements of theorem 2 and statement 3 is proved. If $m_\omega = M_\omega$, then $A_\omega = m_\omega I$ and we have
$$A^* = \bar\omega m_\omega I - \bar\omega^2 A,$$
which is enough for normality in any case, since $A^*$ is then a polynomial in $A$ and so commutes with $A$.
When we start out to prove statement 4 we can thus assume the existence of $f_\omega$ and $f_{-\omega}$ satisfying
$$A_\omega f_\omega = m_\omega f_\omega, \qquad A_\omega f_{-\omega} = M_\omega f_{-\omega}, \qquad (II.2.5)$$
with $m_\omega < M_\omega$. As $\Lambda$ is now the whole complex plane, we can construct in the same way, for an $\omega' \neq \pm\omega$, eigenvectors $f_{\omega'}$, $f_{-\omega'}$ satisfying
$$A_{\omega'} f_{\omega'} = m_{\omega'} f_{\omega'}, \qquad A_{\omega'} f_{-\omega'} = M_{\omega'} f_{-\omega'}, \qquad (II.2.6)$$
with $m_{\omega'} \neq M_{\omega'}$. Now, either we have enough vectors for use in theorem 2 to prove $A$ normal, or else $f_\omega, f_{-\omega}$ and $f_{\omega'}, f_{-\omega'}$ span the same two-dimensional subspace $F_1 \subset F$. As $A_\omega$ and $A_{\omega'}$ are two independent linear homogeneous functions of $A$ and $A^*$, we conclude from eqs. (II.2.5) and (II.2.6) that this subspace is reduced by both $A$ and $A^*$. Thus $F_1$ is spanned by two eigenvectors of $A$ corresponding to different eigenvalues $\lambda_1$ and $\lambda_2$ ($A$ acts as a normal matrix on $F_1$ because $A^*A - AA^*$ annihilates $F_1$; therefore $\lambda_1 = \lambda_2$ would contradict $m_\omega < M_\omega$).

If we put $\omega'' = i(\lambda_1 - \lambda_2)\,|\lambda_1 - \lambda_2|^{-1}$, it is easy to verify that every vector of $F_1$ is an eigenvector of $A_{\omega''}$ with the same eigenvalue, namely $2\,\mathrm{Im}(\bar\lambda_1\lambda_2)\,|\lambda_1 - \lambda_2|^{-1}$. When we then repeat the construction of eigenvectors $f_{\omega''}$ and $f_{-\omega''}$, corresponding to the smallest and largest eigenvalue of $A_{\omega''}$ respectively, it is clear that at least one of $f_{\omega''}$ and $f_{-\omega''}$ does not belong to $F_1$ (i.e. the subspace generated by $f_\omega$ and $f_{-\omega}$), or else $A_{\omega''} = m_{\omega''} I$, i.e. $A^* = \bar\omega'' m_{\omega''} I - \bar\omega''^2 A$, and in either case $A$ must be normal. Thus all statements of theorem 3 are proved.

II.3. Counterexample for dim F = 5

We now construct a non-normal matrix $A$ of order 5 such that
$$\|A_\lambda^2\| = \|A_\lambda\|^2, \qquad A_\lambda = A - \lambda I, \qquad (II.3.1)$$
for all complex $\lambda$. Let $F$ be the direct sum of two mutually orthogonal subspaces $F_1$ and $F_2$, $\dim F_1 = 3$ and $\dim F_2 = 2$. Let $A$, $A_\lambda$ be represented by the block matrices
$$A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}, \qquad A_\lambda = \begin{pmatrix} A_1 - \lambda I_1 & 0 \\ 0 & A_2 - \lambda I_2 \end{pmatrix} = \begin{pmatrix} A_{1\lambda} & 0 \\ 0 & A_{2\lambda} \end{pmatrix}.$$
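The construction can be checked numerically. The sketch below (NumPy, not part of the original paper) uses the concrete blocks chosen in the text, namely a normal $A_1$ with eigenvalues $2, -1 \pm i\sqrt{3}$ and the nilpotent $2 \times 2$ block as $A_2$.

```python
import numpy as np

def opnorm(M):
    return np.linalg.norm(M, 2)  # operator (spectral) norm

# A1: normal with eigenvalues 2, -1 + i*sqrt(3), -1 - i*sqrt(3);
# A2: the non-normal nilpotent block of norm 1.
A1 = np.diag([2.0, -1.0 + 1j * np.sqrt(3), -1.0 - 1j * np.sqrt(3)])
A2 = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
A = np.block([[A1, np.zeros((3, 2))], [np.zeros((2, 3)), A2]])

# A is not normal ...
assert not np.allclose(A.conj().T @ A, A @ A.conj().T)

# ... yet ||A_lambda^2|| = ||A_lambda||^2 at every sample point lambda.
rng = np.random.default_rng(1)
samples = [0.0, 1.0 + 1.0j, -3.0, 2.0j] + list(
    rng.standard_normal(20) + 1j * rng.standard_normal(20))
for lam in samples:
    Alam = A - lam * np.eye(5)
    assert np.isclose(opnorm(Alam @ Alam), opnorm(Alam) ** 2)
```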
From the definition of the norm of $A$ it is obvious that
$$\|A_\lambda\| = \max\{\|A_{1\lambda}\|, \|A_{2\lambda}\|\}.$$
We choose $A_1$ to be a normal matrix such that $\|A_{1\lambda}\| \geq 1 + |\lambda|$. This can be accomplished by choosing $2, -1 \pm i\sqrt{3}$ as the eigenvalues of $A_1$. Then we take $A_2$ to be a non-normal matrix of norm 1, e.g.
$$A_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},$$
which thus satisfies
$$\|A_{2\lambda}\| \leq 1 + |\lambda|.$$
Then
$$\|A_\lambda^2\| = \max\{\|A_{1\lambda}^2\|, \|A_{2\lambda}^2\|\} = \|A_{1\lambda}^2\| = \|A_{1\lambda}\|^2 = \|A_\lambda\|^2,$$
which proves the assertion of eq. (II.3.1).

III. Properties of eigenvalues which lie on the boundary of the range of values of a matrix
Moyls and Marcus [2] have proved that the eigenvectors corresponding to eigenvalues of $A \in B(F)$ lying on the boundary of the range of values $W(A) = \{\lambda \mid \lambda = (Af, f)(f, f)^{-1},\ f \in F\}$ are eigenvectors also of $A^*$. It then follows, in the same way as in the proof of our theorem 3, that if only one (simple) eigenvalue of $A$ lies in the interior of $W(A)$ (this must always be the case when $\dim F \leq 4$ and $W(A)$ is the convex hull of $\mathrm{sp}\,A$), then $A$ is normal. Moyls and Marcus prove their result by representing $A$ as a triangular matrix by means of Schur's lemma, but the theorem is really a consequence of a simple property of the boundary of $W(A)$ and can be proved without using any special representation.
Theorem 4. If $f \in F$ is an eigenvector of $A \in B(F)$ corresponding to an eigenvalue $\lambda$ which lies on the boundary of the range of values of $A$, then $f$ is also an eigenvector of $A^*$.
We prove theorem 4 by a variational method. Put $f_\mu = f + \mu g$ and determine
$$d\lambda = \{d[(Af_\mu, f_\mu)(f_\mu, f_\mu)^{-1}]\}_{\mu=0}$$
$$= \{d[(\lambda f + \mu Ag, f + \mu g)(f + \mu g, f + \mu g)^{-1}]\}_{\mu=0}$$
$$= [d\mu\,(Ag, f) + \lambda\,d\bar\mu\,(f, g)](f, f)^{-1} - \lambda (f, f)^{-1}[d\mu\,(g, f) + d\bar\mu\,(f, g)]$$
$$= d\mu\,[(Ag, f) - \lambda (g, f)](f, f)^{-1}$$
$$= d\mu\,(g, A^*f - \bar\lambda f)(f, f)^{-1}.$$
Since, however, $d\lambda$ is restricted, because $\lambda$ lies on the boundary of $W(A)$, but $d\mu$ is not, we must have
$$(g, A^*f - \bar\lambda f) = 0$$
for all $g \in F$, which proves the theorem.
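Theorem 4 can be illustrated on a block-diagonal example (the matrix is an illustrative choice, not from the text): the eigenvalue 5 is an extreme point of $W(A)$, hence on its boundary, and its eigenvector is an eigenvector of $A^*$ as well, while the eigenvalue 0 of the non-normal block lies in the interior of $W(A)$ and has no such property.

```python
import numpy as np

# Direct sum of the 1x1 matrix (5) and a nilpotent Jordan block. W(A) is the
# convex hull of {5} and the disk of radius 1/2 about 0, so 5 is a boundary
# point of W(A) while 0 lies in its interior.
A = np.zeros((3, 3), dtype=complex)
A[0, 0] = 5.0
A[1, 2] = 1.0

# Eigenvector for the boundary eigenvalue 5: also an eigenvector of A*.
f = np.array([1.0, 0.0, 0.0], dtype=complex)
assert np.allclose(A @ f, 5.0 * f)
assert np.allclose(A.conj().T @ f, 5.0 * f)

# Eigenvector for the interior eigenvalue 0: not an eigenvector of A*.
g = np.array([0.0, 1.0, 0.0], dtype=complex)
assert np.allclose(A @ g, 0.0 * g)
assert not np.allclose(A.conj().T @ g, 0.0 * g)
```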
REFERENCES

1. von Neumann, J., Eine Spektraltheorie für allgemeine Operatoren eines unitären Raumes, Math. Nachr. 4, 258-281 (1951).
2. Moyls, B. N., and Marcus, M. D., Field convexity of a square matrix, Proc. Amer. Math. Soc. 6, 981-983 (1955).
Printed 25 June 1957
Uppsala 1957. Almqvist & Wiksells Boktryckeri AB