

Thus we get
$$\bigl|\mu_{\alpha+k_+}(t) - \mu_\alpha(t)\bigr| \le 2kN^{-2/3}\widetilde\psi^{-1} \qquad (6.36)$$
for all $t \in [0,1]$. Choosing $C_2 := \widetilde C_2 + 1$, $C_3 := \widetilde C_3 + 1$ completes the proof of Theorem 2.7 (recall the definition (6.10)).

7. Distribution of the outliers: proof of Theorem 2.14

7.1. Reduction to the law of $G_{\mathbf v(i)\mathbf v(i)}(\theta(d_i))$. The following proposition reduces the problem to analysing a single explicit random variable.

Proposition 7.1. There is a constant $C_2$, depending on $\zeta$, such that the following holds. Suppose that
$$|d_i| \le \Sigma - 1, \qquad |d_i| - 1 \ge \varphi^{C_2}N^{-1/3}$$
for all $i = 1, \dots, k$. Suppose moreover that (2.24) holds for all $i \in \mathcal O$. Recall the definitions (2.16) and

Before proving Proposition 7.1, we record the following auxiliary result.

Lemma 7.2. Let $C_1$ denote the constant from Theorem 2.3. For any $x$ satisfying $2 + \varphi^{C_1}N^{-2/3} \le |x| \le \Sigma$ we have
$$\bigl|\partial_xG_{\mathbf v\mathbf v}(x) - \partial_xm(x)\bigr| \le \varphi^{C_\zeta}N^{-1/3}\kappa_x^{-1} \qquad (7.1)$$
with $\zeta$-high probability. More generally, we have, for any normalized $\mathbf v, \mathbf w \in \mathbb C^N$,
$$\bigl|\partial_xG_{\mathbf v\mathbf w}(x) - \partial_xm(x)\langle\mathbf v, \mathbf w\rangle\bigr| \le \varphi^{C_\zeta}N^{-1/3}\kappa_x^{-1} \qquad (7.2)$$
with $\zeta$-high probability.

Proof. By symmetry, we may assume that $x > 0$. Moreover, (7.2) follows from (7.1) and polarization.

We therefore prove (7.1) for $x > 0$. We have
$$\partial_xG_{\mathbf v\mathbf v}(x) = \sum_\alpha \frac{|\langle\mathbf u_\alpha, \mathbf v\rangle|^2}{(\lambda_\alpha - x)^2}$$

with ζ-high probability, where in the last step we used Theorem 3.7. (In the proof of Theorem 2.3, the constant C1 was chosen large enough for this application of Theorem 3.7; see (5.6).) A similar calculation using the definition (2.4) yields

Therefore we get, using Theorem 2.3 and Lemma 3.2,

a bound on $\bigl|\partial_xG_{\mathbf v\mathbf v}(x) - \partial_xm(x)\bigr|$ with $\zeta$-high probability, depending on the auxiliary parameter $\eta$. Choosing $\eta := N^{-1/6}\kappa^{3/4}$ yields the claim.

Proof of Proposition 7.1. We only prove the claim for the case $d_i > 1$; the case $d_i < -1$ is handled similarly.

For $2 + \varphi^{C_1}N^{-2/3} \le x \le \Sigma$, where $C_1$ is the constant from Theorem 2.3, we define the $k \times k$ Hermitian matrices $A(x)$ and $\widetilde A(x)$ through
$$A_{ij}(x) := G_{\mathbf v(i)\mathbf v(j)}(x) - m(x)\delta_{ij} + d_i^{-1}\delta_{ij}, \qquad \widetilde A_{ij}(x) := \delta_{ij}\bigl(G_{\mathbf v(i)\mathbf v(i)}(x) - m(x) + d_i^{-1}\bigr).$$
(Here we subtract $m(x)\mathbb 1$ so as to ensure that $\partial_xA(x)$ is well-behaved; see below.) We denote the ordered eigenvalues of $A(x)$ and $\widetilde A(x)$ by $a_1(x) \le \cdots \le a_k(x)$ and $\widetilde a_1(x) \le \cdots \le \widetilde a_k(x)$ respectively.

For the rest of the proof we fix $i \in \mathcal O$ satisfying $d_i > 1$. We abbreviate $\theta_i := \theta(d_i)$. We begin by comparing the eigenvalues of $\widetilde A(\theta_i)$ and $D^{-1}$. Define the eigenvalue index $r \equiv r(i) = 1, \dots, k$ through
$$\widetilde a_r(x) = \frac{1}{d_i} + G_{\mathbf v(i)\mathbf v(i)}(x) - m(x). \qquad (7.3)$$
In particular,
$$\widetilde a_r(\theta_i) = G_{\mathbf v(i)\mathbf v(i)}(\theta_i) + \frac{2}{d_i}.$$
Theorem 2.3 implies that

$$\bigl|G_{\mathbf v(j)\mathbf v(j)}(\theta_i) - m(\theta_i)\bigr| \le \varphi^{C_\zeta}N^{-1/2}(d_i-1)^{-1/2} \qquad (7.4)$$
with $\zeta$-high probability for $j = 1, \dots, k$. In particular,
$$\Bigl|\widetilde a_r(\theta_i) - \frac{1}{d_i}\Bigr| \le \varphi^{C_\zeta}N^{-1/2}(d_i-1)^{-1/2}$$

with $\zeta$-high probability. Moreover, (7.4) and the condition (2.24) yield, for $j \ne i$,
$$\bigl|G_{\mathbf v(j)\mathbf v(j)}(\theta_i) - m(\theta_i)\bigr| \le \frac{|d_i - d_j|}{2d_id_j} \qquad (7.5)$$
with $\zeta$-high probability, provided $C_2$ is chosen large enough. We therefore conclude that
$$\min_{j\ne r}\bigl|\widetilde a_j(\theta_i) - \widetilde a_r(\theta_i)\bigr| \ge \varphi^{C_2-1}N^{-1/2}(d_i-1)^{-1/2} \qquad (7.6)$$
with $\zeta$-high probability, provided $C_2$ is large enough.

Next, we compare the eigenvalues of $A(\theta_i)$ and $\widetilde A(\theta_i)$ using second-order perturbation theory (the first-order correction vanishes by definition of $\widetilde A$ and $A$). Theorem 2.3 yields
$$\bigl\|A(\theta_i) - \widetilde A(\theta_i)\bigr\| \le \varphi^{C_\zeta}N^{-1/2}(d_i-1)^{-1/2}$$
with $\zeta$-high probability. Therefore (7.6) and nondegenerate second-order perturbation theory yield, for large enough $C_2$,
$$a_r(\theta_i) = \widetilde a_r(\theta_i) + O\Biggl(\frac{\varphi^{C_\zeta}N^{-1}(d_i-1)^{-1}}{\min_{j\ne r}|\widetilde a_j(\theta_i) - \widetilde a_r(\theta_i)|}\Biggr) = \widetilde a_r(\theta_i) + O\bigl(\varphi^{C_\zeta-C_2}N^{-1/2}(d_i-1)^{-1/2}\bigr) \qquad (7.7)$$
with $\zeta$-high probability.

Next, we analyse $A(x)$ and make the link to $\mu_{\alpha(i)}$. From Lemma 7.2 we find
$$\bigl\|\partial_xA(x)\bigr\| \le \varphi^{C_\zeta}N^{-1/3}\kappa_x^{-1}$$
with $\zeta$-high probability. In particular, we have for all $j = 1, \dots, k$ that
$$|a_j(x) - a_j(y)| \le \varphi^{C_\zeta}N^{-1/3}\bigl(\kappa_x^{-1} + \kappa_y^{-1}\bigr)|x - y| \qquad (7.8)$$
with $\zeta$-high probability, provided that $2 + \varphi^{C_1}N^{-2/3} \le x, y \le \Sigma$.

Recall the definition (2.17) of $\alpha(i)$. From Lemma 6.1 and Theorem 3.7, we know that $\mu_{\alpha(i)}$ is characterized by the property that there is a $q \equiv q(i) \in \{1, \dots, k\}$ such that
$$a_q(\mu_{\alpha(i)}) = -m(\mu_{\alpha(i)}).$$
By Theorem 2.7 we have
$$|\mu_{\alpha(i)} - \theta_i| \le \varphi^{C_3}N^{-1/2}(d_i-1)^{1/2} \qquad (7.9)$$
with $\zeta$-high probability. Provided $C_2$ is large enough (depending on $C_3$), it is easy to see from (7.9) that
$$\mu_{\alpha(i)} - 2 \asymp \theta_i - 2 \asymp (d_i-1)^2 \qquad (7.10)$$
with $\zeta$-high probability. Thus we find, using (7.8), (7.9), and (7.10), that for large enough $C_2$ we have
$$m(\mu_{\alpha(i)}) = -a_q(\theta_i) + O\bigl(\varphi^{C_\zeta}N^{-5/6}(d_i-1)^{-3/2}\bigr) \qquad (7.11)$$
with $\zeta$-high probability. (Here we absorbed the constant $C_3$ into $C_\zeta$.)

We now prove that $q = r$ with $\zeta$-high probability provided $C_2$ is large enough. Assume by contradiction that $q \ne r$. Then we get, using Theorem 2.3 and the condition (2.24), that
$$\Bigl|a_q(\theta_i) - \frac{1}{d_i}\Bigr| \ge \varphi^{C_2-1}N^{-1/2}(d_i-1)^{-1/2} \qquad (7.12)$$
with $\zeta$-high probability. Moreover, (7.8), (7.9), and (7.10) yield
$$a_q(\theta_i) = a_q(\mu_{\alpha(i)}) + O\bigl(\varphi^{C_\zeta}N^{-5/6}(d_i-1)^{-3/2}\bigr) = -m(\mu_{\alpha(i)}) + O\bigl(\varphi^{C_\zeta}N^{-5/6}(d_i-1)^{-3/2}\bigr) = \frac{1}{d_i} + O\bigl(\varphi^{C_\zeta}N^{-1/2}(d_i-1)^{-1/2} + \varphi^{C_\zeta}N^{-5/6}(d_i-1)^{-3/2}\bigr)$$
with $\zeta$-high probability, where in the last step we used (6.5). Together with (7.12), this yields the desired contradiction provided $C_2$ is large enough. Hence $q = r$.

Putting (7.3), (7.11), and (7.7) together, we get
$$m(\mu_{\alpha(i)}) = -G_{\mathbf v(i)\mathbf v(i)}(\theta_i) - \frac{2}{d_i} + O\bigl(\varphi^{C_\zeta}N^{-5/6}(d_i-1)^{-3/2} + \varphi^{C_\zeta-C_2}N^{-1/2}(d_i-1)^{-1/2}\bigr)$$
with $\zeta$-high probability. Thus we find that, for all $x$ between $\theta_i$ and $\mu_{\alpha(i)}$, we have
$$m'(x) = m'(\theta_i) + O\bigl(\varphi^{C_3}N^{-1/2}(d_i-1)^{-5/2}\bigr) = m'(\theta_i)\bigl(1 + O(\varphi^{-1})\bigr)$$
with $\zeta$-high probability, where we used (6.5) and (7.9). Using (6.2), (7.10), and (6.5), we conclude that
$$\mu_{\alpha(i)} - \theta_i = -\bigl(1 + O(\varphi^{-1})\bigr)\frac{G_{\mathbf v(i)\mathbf v(i)}(\theta_i) + d_i^{-1}}{m'(\theta_i)} + O\bigl(\varphi^{C_\zeta}N^{-5/6}(d_i-1)^{-1/2} + \varphi^{C_\zeta-C_2}N^{-1/2}(d_i-1)^{1/2}\bigr)$$
with $\zeta$-high probability. The claim now follows for large enough $C_2$, using the identity (6.2).

7.2. The GOE/GUE case. By Proposition 7.1, it is enough to analyse the random variable
$$X := N^{1/2}(|d|+1)(|d|-1)^{1/2}\Bigl(G_{\mathbf v\mathbf v}(\theta) + \frac{1}{d}\Bigr), \qquad (7.13)$$
where $\mathbf v \in \mathbb C^N$ is normalized, $d$ satisfies
$$1 + \varphi^{C_2}N^{-1/3} \le |d| \le \Sigma - 1, \qquad (7.14)$$
and we abbreviated $\theta \equiv \theta(d)$. For definiteness, we choose $d > 1$ in the following.
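For orientation, the centering in (7.13) can be understood from the standard semicircle identities. This sketch assumes the usual conventions $\theta(d) = d + d^{-1}$ for $d > 1$ and $m$ the semicircle Stieltjes transform, which are not restated in this section:

```latex
% m solves the self-consistent equation m(z)^2 + z\,m(z) + 1 = 0, i.e. m(z) = -1/(z + m(z)).
% At z = \theta(d) = d + d^{-1}, the value m = -d^{-1} satisfies
\Bigl(-\frac{1}{d}\Bigr)^2 + \Bigl(d + \frac{1}{d}\Bigr)\Bigl(-\frac{1}{d}\Bigr) + 1
  \;=\; \frac{1}{d^2} - 1 - \frac{1}{d^2} + 1 \;=\; 0
% so m(\theta(d)) = -d^{-1}, and G_{vv}(\theta) + d^{-1} = G_{vv}(\theta) - m(\theta) measures the
% fluctuation of G_{vv} around its deterministic limit; X rescales it to order one.
```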

The following notion of convergence of random variables is convenient for our needs.

Definition 7.3. Two sequences of random variables, $\{A_N\}$ and $\{B_N\}$, are asymptotically equal in distribution, denoted $A_N \stackrel{d}{\sim} B_N$, if they are tight and satisfy
$$\lim_{N\to\infty}\bigl(\mathbb E f(A_N) - \mathbb E f(B_N)\bigr) = 0 \qquad (7.15)$$
for all bounded and continuous $f$.

Remark 7.4. Definition 7.3 extends the notion of convergence in distribution, in the sense that $\mathbb E f(A_N)$ need not have a limit as $N \to \infty$.

Remark 7.5. In order to show that $A_N \stackrel{d}{\sim} B_N$, it suffices to establish the tightness of either $\{A_N\}$ or $\{B_N\}$ and to verify (7.15) for all $f \in C_c^\infty(\mathbb R)$. Indeed, if $\{A_N\}$ is tight then so is $\{B_N\}$, by (7.15). By tightness of $A_N$ and $B_N$, we may replace in (7.15) the bounded and continuous $f$ with a compactly supported continuous function $g$. Next, we can approximate $g$ uniformly with $C_c^\infty$-functions.

Remark 7.6. Clearly, $A_N \stackrel{d}{\sim} B_N$ if $A_N \stackrel{d}{=} B_N$ for all $N$.

Lemma 7.7. Let $A_N \stackrel{d}{\sim} B_N$ and let $R_N$ satisfy $\lim_N\mathbb P(|R_N| \le \varepsilon_N) = 1$, where $\{\varepsilon_N\}$ is a positive null sequence. Then $A_N \stackrel{d}{\sim} B_N + R_N$.

Proof. By Remark 7.5, it suffices to prove (7.15) for $f \in C^1(\mathbb R)$ such that $f$ and $f'$ are bounded. Then
$$\mathbb E f(A_N) - \mathbb E f(B_N + R_N) = \bigl(\mathbb E f(A_N) - \mathbb E f(B_N)\bigr) + \bigl(\mathbb E f(B_N) - \mathbb E f(B_N + R_N)\bigr) = o(1) + \mathbb E\bigl[\mathbf 1(|R_N| \le \varepsilon_N)\bigl(f(B_N) - f(B_N + R_N)\bigr)\bigr] = o(1),$$
where in the last step we used the boundedness of $f'$.

Lemma 7.8. Let $\{A_N\}$, $\{A_N'\}$, $\{B_N\}$, and $\{B_N'\}$ be sequences of random variables. Suppose that $A_N \stackrel{d}{\sim} A_N'$, $B_N \stackrel{d}{\sim} B_N'$, $A_N$ and $B_N$ are independent, and $A_N'$ and $B_N'$ are independent. Then
$$A_N + B_N \stackrel{d}{\sim} A_N' + B_N'.$$

Proof. Without loss of generality, we may assume that $A_N, B_N, A_N', B_N'$ are independent (after replacing $A_N'$ and $B_N'$ with new random variables without changing their laws). Then for any $\lambda \in \mathbb R$ we have
$$\mathbb E e^{\mathrm i\lambda(A_N+B_N)} - \mathbb E e^{\mathrm i\lambda(A_N'+B_N')} = \mathbb E\Bigl[e^{\mathrm i\lambda A_N}\bigl(e^{\mathrm i\lambda B_N} - e^{\mathrm i\lambda B_N'}\bigr) + \bigl(e^{\mathrm i\lambda A_N} - e^{\mathrm i\lambda A_N'}\bigr)e^{\mathrm i\lambda B_N'}\Bigr] = \mathbb E e^{\mathrm i\lambda A_N}\,\mathbb E\bigl(e^{\mathrm i\lambda B_N} - e^{\mathrm i\lambda B_N'}\bigr) + \mathbb E\bigl(e^{\mathrm i\lambda A_N} - e^{\mathrm i\lambda A_N'}\bigr)\,\mathbb E e^{\mathrm i\lambda B_N'} \longrightarrow 0$$
as $N \to \infty$.

Next, we observe that $A_N + B_N$ and $A_N' + B_N'$ are tight. Therefore, recalling Remark 7.5, we find that it suffices to prove
$$\mathbb E f(A_N + B_N) - \mathbb E f(A_N' + B_N') \longrightarrow 0$$
for $f \in C_c^\infty$. Denoting by $\hat f$ the Fourier transform of $f$, we find
$$\mathbb E f(A_N + B_N) - \mathbb E f(A_N' + B_N') = \int \mathrm d\lambda\,\hat f(\lambda)\Bigl[\mathbb E e^{\mathrm i\lambda(A_N+B_N)} - \mathbb E e^{\mathrm i\lambda(A_N'+B_N')}\Bigr] \longrightarrow 0$$
by dominated convergence.

Proposition 7.9. Let $H$ be a GOE/GUE matrix. Assume that $d$ satisfies (7.14). Then for large enough $C_2$ we have
$$X \stackrel{d}{\sim} \mathcal N\biggl(0, \frac{2(d+1)}{\beta d^2}\biggr).$$

Proof. By unitary invariance, we have $G_{\mathbf v\mathbf v} \stackrel{d}{=} G_{11}$, where $\stackrel{d}{=}$ denotes equality in distribution. In order to handle the exceptional low-probability events, we add a small imaginary part to the spectral parameter, $z := \theta + \mathrm iN^{-4}$. Throughout the following we abbreviate $G \equiv G(z)$ and $m \equiv m(z)$. Writing $\mathbf a := (h_{12}, h_{13}, \dots, h_{1N})$, we get from Schur's formula and (2.5) that
$$G_{11} = \frac{1}{h_{11} - z - \langle\mathbf a, G^{(1)}\mathbf a\rangle} = \frac{1}{-m - z + h_{11} - \bigl(\langle\mathbf a, G^{(1)}\mathbf a\rangle - m\bigr)} = m - m^2h_{11} + m^2\bigl(\langle\mathbf a, G^{(1)}\mathbf a\rangle - m\bigr) + O\bigl(|h_{11}|^2\bigr) + O\Bigl(\bigl|\langle\mathbf a, G^{(1)}\mathbf a\rangle - m\bigr|^2\Bigr) \qquad (7.16)$$
with $\zeta$-high probability. Again by unitary invariance, we have $\langle\mathbf a, G^{(1)}\mathbf a\rangle \stackrel{d}{=} \|\mathbf a\|^2G^{(1)}_{22}$. Moreover, both sides are independent of $h_{11}$, so that
$$-m^2h_{11} + m^2\bigl(\langle\mathbf a, G^{(1)}\mathbf a\rangle - m\bigr) \stackrel{d}{=} -m^2h_{11} + m^2\bigl(\|\mathbf a\|^2G^{(1)}_{22} - m\bigr). \qquad (7.17)$$

In order to estimate the error term in (7.16), we write
$$\|\mathbf a\|^2G^{(1)}_{22} - m = \bigl(\|\mathbf a\|^2 - 1\bigr)G^{(1)}_{22} + \bigl(G^{(1)}_{22} - m\bigr). \qquad (7.18)$$
Using (3.6) to estimate $G^{(1)}_{22} - G_{22}$, as well as Theorem 2.3, Lemma 3.5, and Lemma 3.2, we therefore find that
$$\bigl|\|\mathbf a\|^2G^{(1)}_{22} - m\bigr| \le \varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2} \qquad (7.19)$$
with $\zeta$-high probability. Moreover, we have the trivial bound $\mathbb E\bigl|\|\mathbf a\|^2G^{(1)}_{22} - m\bigr|^k \le (kN)^{Ck}$ for $k \in \mathbb N$. From (7.16), (7.17), (7.18), and (7.19), we conclude that there exist random variables $\widetilde R_1$ and $\widetilde R_2$ satisfying
$$|\widetilde R_1| + |\widetilde R_2| \le \varphi^{C_\zeta}N^{-1}(d-1)^{-1} \qquad (7.20)$$
with $\zeta$-high probability, the rough bound
$$\mathbb E\bigl(|\widetilde R_1| + |\widetilde R_2|\bigr)^k \le (kN)^{Ck}, \qquad (7.21)$$
and
$$G^{(2)}_{11} - m + \widetilde R_1 = -m^2h_{11} + m^2\bigl(\langle\mathbf a, G^{(1)}\mathbf a\rangle - m\bigr) \stackrel{d}{=} -m^2h_{11} + m^2\bigl(\|\mathbf a\|^2G^{(1)}_{22} - m\bigr) = -m^2h_{11} + m^3\bigl(\|\mathbf a\|^2 - 1\bigr) + m^2\bigl(G^{(1)}_{22} - m\bigr) + \widetilde R_2.$$
Defining
$$Y_1 := N^{1/2}(d+1)(d-1)^{1/2}\,\mathrm{Re}\bigl(G^{(2)}_{11} - m\bigr), \qquad Y_2 := N^{1/2}(d+1)(d-1)^{1/2}\,\mathrm{Re}\bigl(G^{(1)}_{22} - m\bigr),$$
$$W := N^{1/2}\,\mathrm{Re}\bigl(-m^2h_{11} + m^3(\|\mathbf a\|^2 - 1)\bigr), \qquad R_i := N^{1/2}(d+1)(d-1)^{1/2}\,\mathrm{Re}\,\widetilde R_i \quad (i = 1, 2),$$
we therefore get
$$Y_1 + R_1 \stackrel{d}{=} (d+1)(d-1)^{1/2}W + m^2Y_2 + R_2. \qquad (7.22)$$
In order to infer the distribution of $Y_1$ from (7.22), we observe that the random variables $Y_2$ and $W$ are independent. Also, $Y_1 \stackrel{d}{=} Y_2$. Recalling Theorem 2.3 and (3.6), we find the bounds
$$|Y_i| \le \varphi^{C_\zeta}, \qquad |R_i| \le \varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2} \quad (i = 1, 2) \qquad (7.23)$$
with $\zeta$-high probability, and the rough bounds
$$|Y_i| \le N^2, \qquad \mathbb E|R_i|^k \le (kN)^{Ck} \quad (i = 1, 2). \qquad (7.24)$$
Moreover, by the Central Limit Theorem
$$\biggl(\frac{2(d^2+1)}{\beta d^6}\biggr)^{-1/2}W \stackrel{d}{\sim} \mathcal N(0, 1), \qquad (7.25)$$
where we used (6.2).

Next, let $B$ and $Z_2$ be independent random variables whose laws are given by
$$B \stackrel{d}{=} \mathcal N\biggl(0, \frac{2(d^2+1)}{\beta d^6}\biggr), \qquad Z_2 \stackrel{d}{=} \mathcal N(0, \xi^2),$$
where we introduced
$$\xi^2 \equiv \xi_N^2 := \frac{d^4(d^2-1)(d+1)}{d^4-1}\cdot\frac{2(d^2+1)}{\beta d^6} = \frac{2(d+1)}{\beta d^2}.$$
Defining
$$Z_1 := (d+1)(d-1)^{1/2}B + d^{-2}Z_2, \qquad (7.26)$$
we find that $Z_1 \stackrel{d}{=} Z_2$. Moreover, a standard moment calculation and the definition of $W$ yield
$$\lim_{N\to\infty}\bigl(\mathbb E W^k - \mathbb E B^k\bigr) = 0; \qquad (7.27)$$
as usual, only the pairings in the moment expansion of $\mathbb E W^k$ survive the limit $N \to \infty$. (See also (7.25), which however cannot be used to deduce (7.27) directly.)
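The closed form of $\xi^2$ can be checked directly from the self-consistency requirement $Z_1 \stackrel{d}{=} Z_2$ imposed on (7.26); the following variance computation uses only the stated laws of $B$ and $Z_2$:

```latex
% Variance bookkeeping for (7.26): Z_1 = (d+1)(d-1)^{1/2} B + d^{-2} Z_2 with B, Z_2 independent,
% Var(B) = 2(d^2+1)/(\beta d^6) and Var(Z_2) = \xi^2. Imposing Var(Z_1) = \xi^2:
\xi^2 \;=\; (d+1)^2(d-1)\,\frac{2(d^2+1)}{\beta d^6} \;+\; d^{-4}\xi^2
\quad\Longrightarrow\quad
\xi^2 \;=\; \frac{d^4}{d^4-1}\,(d+1)^2(d-1)\,\frac{2(d^2+1)}{\beta d^6}
\;=\; \frac{2(d+1)}{\beta d^2}
% using d^4 - 1 = (d-1)(d+1)(d^2+1).
```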

We now compare the distributions of $Y_1$ and $Z_1$ by computing moments. Note that the family $\{\mathbb E Z_1^k\}_{N\in\mathbb N}$ is bounded for each $k \in \mathbb N$. We claim that
$$\lim_{N\to\infty}\bigl(\mathbb E Y_1^k - \mathbb E Z_1^k\bigr) = 0 \qquad (7.28)$$
for all $k \in \mathbb N$. (This will imply that $Y_1 \stackrel{d}{\sim} Z_1$.) We shall prove (7.28) by induction on $k$. Taking the expectation of (7.22) yields
$$\mathbb E Y_1 = m^2\,\mathbb E Y_1 + O\bigl(\varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2}\bigr),$$
where we used (7.23), (7.24), and $\mathbb E W = O(N^{-1/2})$. Therefore
$$|\mathbb E Y_1| \le C\varphi^{C_\zeta}N^{-1/2}(d-1)^{-3/2} = o(1),$$
provided $C_2$ in (7.14) is large enough. Here we used that
$$m(z) = -d^{-1} + O(N^{-3}), \qquad (7.29)$$
as follows from the definition of $z = \theta + \mathrm iN^{-4}$, (5.5), Lemma 3.2, and (6.2). Therefore (7.28) for $k = 1$ follows using $\mathbb E Z_1 = 0$.
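The base case $k = 1$ rests on a simple solving step, sketched here under the assumption (from (7.29)) that $m^2 = d^{-2} + O(N^{-3})$ and that $d$ stays in the bounded range (7.14):

```latex
% Solving \mathbb{E}Y_1 = m^2\,\mathbb{E}Y_1 + O(\varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2}) for \mathbb{E}Y_1:
(1 - m^2)\,\mathbb{E}Y_1 = O\bigl(\varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2}\bigr),
\qquad
1 - m^2 = 1 - d^{-2} + O(N^{-3}) \asymp d - 1
% so |\mathbb{E}Y_1| \le C\varphi^{C_\zeta}N^{-1/2}(d-1)^{-3/2}; this is o(1) because (7.14)
% gives d - 1 \ge \varphi^{C_2}N^{-1/3}, whence N^{-1/2}(d-1)^{-3/2} \le \varphi^{-3C_2/2}.
```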

For the induction step, we assume that (7.28) holds for all $k' \le k - 1$. From (7.22) we find
$$\mathbb E Y_1^k + \sum_{l=1}^{k}\binom{k}{l}\mathbb E\bigl(R_1^lY_1^{k-l}\bigr) = \mathbb E\bigl((d+1)(d-1)^{1/2}W + m^2Y_2\bigr)^k + \sum_{l=1}^{k}\binom{k}{l}\mathbb E\Bigl(R_2^l\bigl((d+1)(d-1)^{1/2}W + m^2Y_2\bigr)^{k-l}\Bigr). \qquad (7.30)$$
We estimate the summands on the left-hand side by
$$\bigl|\mathbb E\bigl(R_1^lY_1^{k-l}\bigr)\bigr| \le N^C\exp(-\varphi^C) + \bigl(\varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2}\bigr)^l\,\mathbb E|Y_1|^{k-l} \le C\bigl(\varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2}\bigr)^l\varphi^{C_\zeta} \le \varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2},$$
where in the first step we used (7.23) and (7.24), in the second step the estimate $\mathbb E|Y_1|^{k-l} \le \varphi^{C_\zeta}$, as follows from the induction assumption (7.28) applied to even moments (recall that $Y_1$ is real) as well as (7.23) and (7.24), and in the third step the fact that $l \ge 1$. Note that the constant $C_\zeta$ is independent of $k$. A similar estimate applies to the summands on the right-hand side of (7.30). Thus (7.30) yields

$$\mathbb E Y_1^k = \mathbb E\bigl((d+1)(d-1)^{1/2}W + m^2Y_2\bigr)^k + O\bigl(\varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2}\bigr),$$
where in the second step we used the induction assumption and the estimate $\mathbb E W = O(N^{-1/2})$. Therefore we get, using the independence of $W$ and $Y_2$,
$$\mathbb E Y_1^k = \sum_{l=0}^{k}\binom{k}{l}(d+1)^l(d-1)^{l/2}m^{2(k-l)}\,\mathbb E W^l\,\mathbb E Y_2^{k-l} + O\bigl(\varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2}\bigr). \qquad (7.31)$$
In order to conclude the proof of (7.28), we deduce from (7.26) that
$$\mathbb E Z_1^k = \sum_{l=0}^{k}\binom{k}{l}(d+1)^l(d-1)^{l/2}d^{-2(k-l)}\,\mathbb E B^l\,\mathbb E Z_2^{k-l}. \qquad (7.32)$$
Using the induction assumption (7.28) for $k' = k - l$, (7.29), and the condition $l \ge 2$, we get from (7.31), (7.32), and (7.27) that
$$\lim_{N\to\infty}\bigl(\mathbb E Y_1^k - \mathbb E Z_1^k\bigr) = 0$$
for large enough $C_2$. This concludes the proof of (7.28).

Next, by definition we have $\xi^{-1}Z_1 \stackrel{d}{=} \mathcal N(0, 1)$. Moreover, we have that $\xi \in [c, C]$ for some positive constants $c$ and $C$. Hence (7.28) yields
$$\lim_{N\to\infty}\bigl(\mathbb E f(\xi^{-1}Y_1) - \mathbb E f(\xi^{-1}Z_1)\bigr) = 0 \qquad (7.33)$$
for any continuous bounded function $f$. Next, we estimate
$$\bigl|G_{11}(\theta) - G^{(2)}_{11}(z)\bigr| \le \bigl|G_{11}(\theta) - G_{11}(z)\bigr| + \bigl|G_{11}(z) - G^{(2)}_{11}(z)\bigr| \le \varphi^{C_\zeta}N^{-1}(d-1)^{-1}$$
with $\zeta$-high probability, where in the second step we used Lemma 7.2, (5.5), and Lemma 3.2 to estimate the first term, and Theorem 2.3 and (6.1) to estimate the second term. Therefore
$$X \stackrel{d}{=} N^{1/2}(d+1)(d-1)^{1/2}\bigl(G_{11}(\theta) + d^{-1}\bigr) = Y_1 + O\bigl(\varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2}\bigr) = Y_1 + o(1)$$
with $\zeta$-high probability, where in the second step we used (7.29). Therefore (7.33), the fact that $Z \stackrel{d}{=} Z_1$ where $Z \stackrel{d}{=} \mathcal N(0, \xi^2)$, and dominated convergence yield
$$\lim_{N\to\infty}\bigl(\mathbb E f(\xi^{-1}X) - \mathbb E f(\xi^{-1}Z)\bigr) = 0. \qquad (7.34)$$
The claim now follows from Lemma 7.10 below.

Lemma 7.10. Let $\{\xi_N\}$ be a bounded deterministic sequence. Let $A, A_1, A_2, \dots$ be random variables such that $A_N$ converges weakly to $A$. Then we have, for any bounded continuous function $f$,
$$\mathbb E f(\xi_NA_N) - \mathbb E f(\xi_NA) \longrightarrow 0$$
as $N \to \infty$.

Proof. By Skorokhod's representation theorem, there exist new random variables $\widetilde A, \widetilde A_1, \widetilde A_2, \dots$ such that $A \stackrel{d}{=} \widetilde A$, $A_N \stackrel{d}{=} \widetilde A_N$ for all $N \in \mathbb N$, and $\widetilde A_N \to \widetilde A$ almost surely. Let $\omega$ be such that $\widetilde A_N(\omega) \to \widetilde A(\omega)$. By assumption on $\xi_N$, we find that there exists a $C \equiv C(\omega)$ such that $\xi_N\widetilde A_N(\omega) \in [-C, C]$ and $\xi_N\widetilde A(\omega) \in [-C, C]$ for all $N \in \mathbb N$. Since $f$ is uniformly continuous on $[-C, C]$, we find that
$$\lim_{N\to\infty}\bigl|f(\xi_N\widetilde A_N(\omega)) - f(\xi_N\widetilde A(\omega))\bigr| = 0.$$
The claim now follows by dominated convergence.

7.3. The almost-GOE/GUE case. As it turns out, replacing the matrix element $h_{ij}$ with a Gaussian in the Green function comparison step below (Section 7.4) is only possible if $|v_i| \le \varphi^{-D}$ and $|v_j| \le \varphi^{-D}$, for some large enough constant $D > 0$. If this assumption is not satisfied, we first have to replace $h_{ij}$ with a Gaussian using a different method, which effectively keeps track of the fluctuations of $G_{\mathbf v\mathbf v}$ resulting from large components of $\mathbf v$. Thus we shall proceed in two steps:

(i) We compare the original Wigner matrix $H$ with $\widehat H$, a Wigner matrix obtained from $H$ by replacing the $(i,j)$-th entry of $H$ with a Gaussian whenever $|v_i| \le \varphi^{-D}$ and $|v_j| \le \varphi^{-D}$.

(ii) We compare the matrix $\widehat H$ to a Gaussian matrix.

Step (ii) is performed in this section. To simplify notation, we write $H$ instead of $\widehat H$ throughout this section. Step (i) is performed using Green function comparison in Section 7.4 below.

The following shorthand will prove useful.

Definition 7.11. Let $\{\sigma_N\}$ be a bounded positive sequence. If $A_N$ and $B_N$ are independent random variables with $B_N \stackrel{d}{\sim} \mathcal N(0, \sigma_N^2)$, and if $S_N \stackrel{d}{\sim} A_N + B_N$, then we write $S_N \stackrel{d}{\sim} A_N + \mathcal N(0, \sigma_N^2)$.

For the following we write
$$X = \nu N^{1/2}\Bigl(G_{\mathbf v\mathbf v}(\theta) + \frac{1}{d}\Bigr), \qquad \nu \equiv \nu_N := (d+1)(d-1)^{1/2}.$$

Proposition 7.12. Fix $D > 0$. Let $\mathbf v \in \mathbb C^N$ be normalized and $H$ be a Wigner matrix such that $h_{ij}$ is Gaussian whenever $|v_i| \le \varphi^{-D}$ and $|v_j| \le \varphi^{-D}$. Then we have
$$X \stackrel{d}{\sim} -\nu N^{1/2}d^{-2}\langle\mathbf v, H\mathbf v\rangle + \mathcal N\biggl(0, \frac{2(d+1)}{\beta d^4} + \frac{4\nu^2Q(\mathbf v)}{d^5} + \frac{\nu^2R(\mathbf v)}{d^6}\biggr),$$
where $Q(\mathbf v)$ and $R(\mathbf v)$ were defined in (2.22).

Proof. As before, we consistently drop the spectral parameter $z = \theta$ from our notation.

Let $M \in \mathbb N$ denote the number of entries of $\mathbf v$ satisfying $|v_i| > \varphi^{-D}$. Since $\mathbf v$ is normalized, we have $M \le \varphi^{2D}$. To simplify notation, we assume (after a suitable permutation of the rows and columns of $H$) that the entries of $\mathbf v$ satisfy $|v_i| > \varphi^{-D}$ for $i \le M$ and $|v_i| \le \varphi^{-D}$ for $i > M$. Split $\mathbf v = \binom{\mathbf u}{\mathbf w}$, where $\mathbf u \in \mathbb C^M$ and $\mathbf w \in \mathbb C^{N-M}$. (Throughout the following we assume that $\mathbf w \ne 0$; the case $\mathbf w = 0$ may be easily handled by approximation with nonzero $\mathbf w$.) We also split
$$H = \begin{pmatrix} A & B^*\\ B & H_0 \end{pmatrix},$$
which defines the blocks $A$, $B$, and $H_0$ using self-explanatory notation. Note that, by definition, $\|\mathbf x\| = \|\mathbf v\| = 1$.

Next, we claim that
$$(F^*F)_{ij} = \delta_{ij} + O\bigl(\varphi^{C_\zeta}N^{-1/2}\bigr) \qquad (7.36)$$
with $\zeta$-high probability. In order to prove (7.36), write
$$F^*F = \begin{pmatrix} B^*\widetilde S^*\widetilde SB & B^*\widetilde S^*\mathbf a\\ \mathbf a^*\widetilde SB & \mathbf a^*\mathbf a \end{pmatrix}.$$
We consider four cases. First, if $1 \le i \ne j \le M$ we find using (3.15) that $|(F^*F)_{ij}| \le \varphi^{C_\zeta}N^{-1/2}$ with $\zeta$-high probability. Second, if $1 \le i \le M$ we find using (3.13) and (3.14) that $|(F^*F)_{ii} - 1| \le \varphi^{C_\zeta}N^{-1/2}$ with $\zeta$-high probability. The remaining two cases are handled similarly. This completes the proof of (7.36).

Next, abbreviate $G_1(z) := (H_1 - z)^{-1}$. Since $N^{1/2}(N-M-1)^{-1/2}H_1$ is an $(N-M-1)\times(N-M-1)$ GOE/GUE matrix, we find from (7.36), Theorem 2.3, and Lemma 3.2 that (7.37) holds with $\zeta$-high probability. Therefore Schur's formula yields an expansion of $\Gamma$ around $m$ with $\zeta$-high probability, where in the second step we expanded using (2.5), and estimated the error term using (7.37) as well as the bounds $M \le \varphi^{C_\zeta}$ and $|E_{ij}| \le \varphi^{C_\zeta}N^{-1/2}$. Recalling that $\|\mathbf x\| = 1$, we find (7.39).

Applying (3.12) to $\langle\mathbf w, B\mathbf u\rangle = \sum_{i,j}\bar w_iu_jB_{ij}$ (with $N$ in (3.12) replaced by $M(N-M)$), we find
$$\bigl|\langle\mathbf w, B\mathbf u\rangle\bigr|^2 \le \varphi^{C_\zeta}N^{-1}$$
with $\zeta$-high probability. Similarly, using (3.13) and (3.14) we find that
$$\|B\mathbf u\|^2 = \|\mathbf u\|^2 + O\bigl(\varphi^{C_\zeta}N^{-1/2}\bigr), \qquad \|\widetilde SB\mathbf u\|^2 = \|\mathbf u\|^2 + O\bigl(\varphi^{C_\zeta}N^{-1/2}\bigr)$$
with $\zeta$-high probability, using (3.15) that
$$\bigl|\langle\widetilde SB\mathbf u, \mathbf a\rangle\bigr| \le \varphi^{C_\zeta}N^{-1/2}$$
with $\zeta$-high probability, and using (3.13) that
$$\|\mathbf a\|^2 = 1 + O\bigl(\varphi^{C_\zeta}N^{-1/2}\bigr)$$

with $\zeta$-high probability. Using $\|\mathbf u\| \ge \varphi^{-D}$ (by definition of $\mathbf u$), we therefore conclude that
$$\|F\mathbf x\|^2 = \|B\mathbf u\|^2 + 2\,\mathrm{Re}\,\frac{\|\mathbf u\|\|\mathbf w\|}{\|\widetilde SB\mathbf u\|\|\mathbf a\|}\bigl\langle\widetilde SB\mathbf u, \mathbf a\bigr\rangle + \|\mathbf w\|^2\|\mathbf a\|^2 + O\bigl(\varphi^{C_\zeta}N^{-1}\bigr) = 1 + O\bigl(\varphi^{C_\zeta}N^{-1/2}\bigr) \qquad (7.40)$$
with $\zeta$-high probability. Using Theorem 2.3 applied to $G_1$ (recall that $F$ and $H_1$ are independent), we therefore get from (7.39) that
$$\Gamma - m = -m^2\bigl(\langle\mathbf u, A\mathbf u\rangle + \|\mathbf w\|^2g + 2\,\mathrm{Re}\langle\mathbf w, B\mathbf u\rangle\bigr) + m^2\biggl(\frac{1}{\|F\mathbf x\|^2}\langle F\mathbf x, G_1F\mathbf x\rangle - m\biggr) + m^3\biggl(\|B\mathbf u\|^2 - \|\mathbf u\|^2 + 2\,\mathrm{Re}\,\frac{\|\mathbf u\|\|\mathbf w\|}{\|\widetilde SB\mathbf u\|\|\mathbf a\|}\bigl\langle\widetilde SB\mathbf u, \mathbf a\bigr\rangle + \|\mathbf w\|^2\bigl(\|\mathbf a\|^2 - 1\bigr)\biggr) + O\bigl(\varphi^{C_\zeta}N^{-1}(d-1)^{-1}\bigr) \qquad (7.41)$$
with $\zeta$-high probability. We write this as
$$\Gamma - m = \Gamma_1 + \cdots + \Gamma_6 + O\bigl(\varphi^{C_\zeta}N^{-1}(d-1)^{-1}\bigr) \qquad (7.42)$$
with $\zeta$-high probability, where
$$\Gamma_1 := -m^2\langle\mathbf u, A\mathbf u\rangle, \qquad \Gamma_2 := -m^2\|\mathbf w\|^2g, \qquad \Gamma_3 := m^2\biggl(\frac{1}{\|F\mathbf x\|^2}\langle F\mathbf x, G_1F\mathbf x\rangle - m\biggr),$$
$$\Gamma_4 := -2m^2\,\mathrm{Re}\langle\mathbf w, B\mathbf u\rangle + m^3\bigl(\|B\mathbf u\|^2 - \|\mathbf u\|^2\bigr), \qquad \Gamma_5 := 2m^3\,\mathrm{Re}\,\frac{\|\mathbf u\|\|\mathbf w\|}{\|\widetilde SB\mathbf u\|\|\mathbf a\|}\bigl\langle\widetilde SB\mathbf u, \mathbf a\bigr\rangle, \qquad \Gamma_6 := m^3\|\mathbf w\|^2\bigl(\|\mathbf a\|^2 - 1\bigr).$$

We now claim that $\Gamma_1, \dots, \Gamma_6$ are independent. In order to prove this, let $f_1, \dots, f_6$ be indicator functions of Borel sets in $\mathbb R$. Write $\mathbf a = a\boldsymbol\omega$ in polar coordinates, where $a \ge 0$ and $\boldsymbol\omega \in S^{N-M-2}$. Since $\mathbf a$ is Gaussian, $a$ and $\boldsymbol\omega$ are independent. Denote by $\rho_1, \dots, \rho_6$ the laws of $A, B, g, a, \boldsymbol\omega, H_1$ respectively. Then the expectation $\mathbb E\bigl[f_1(\Gamma_1)\cdots f_6(\Gamma_6)\bigr]$ factorizes into a product of six integrals, where the second equality follows by definition of the $\Gamma$'s, and the third from the invariance of the law of $\boldsymbol\omega$ under rotations (applied to $\Gamma_5$) and from the invariance of the law of $H_1$ under orthogonal/unitary conjugations (applied to $\Gamma_3$). This proves the independence of $\Gamma_1, \dots, \Gamma_6$. The variance of the term in parentheses is computed by an explicit calculation,

where we abbreviated
$$Q(\mathbf w, \mathbf u) := N^{-1/2}\,\mathrm{Re}\sum_{i,j}\bar w_iM^{(3)}_{ij}u_j|u_j|^2, \qquad R(\mathbf u) := \frac{1}{N}\sum_{i,j}\bigl(M^{(4)}_{ij} - 4 + \beta\bigr)|u_j|^4,$$
and used that $3 - \beta = 2\beta^{-1}$ for $\beta = 1, 2$. Since $\|\mathbf u\| \le 1$ and $\|\mathbf w\| \le 1$, we find that $Q(\mathbf w, \mathbf u) \le C$ and $R(\mathbf u) \le C$ for some positive constant $C$. Next, using $\Gamma_5 = 2m^3\|\mathbf w\|\,\mathrm{Re}\bigl\langle\widetilde SB\mathbf u, \mathbf a\bigr\rangle + O\bigl(\varphi^{C_\zeta}N^{-1}\bigr)$ with $\zeta$-high probability and
$$\mathbb E\bigl(2\,\mathrm{Re}\bigl\langle\widetilde SB\mathbf u, \mathbf a\bigr\rangle\bigr)^2 = \frac{4}{\beta N^2}(N-M-1)\|\mathbf u\|^2,$$
we find from the Central Limit Theorem and Lemma 7.10 that
$$\nu N^{1/2}\Gamma_5 \stackrel{d}{\sim} \mathcal N\bigl(0, 4\nu^2\beta^{-1}m^6\|\mathbf u\|^2\|\mathbf w\|^2\bigr). \qquad (7.46)$$
Finally, we have $\|\mathbf a\|^2 - 1 = \|\mathbf a\|^2 - \mathbb E\|\mathbf a\|^2 + O(M/N)$ and $\mathbb E\bigl(|a_i|^2 - \mathbb E|a_i|^2\bigr)^2 = 2\beta^{-1}N^{-2}$. Thus we conclude from the Central Limit Theorem and Lemma 7.10 that
$$\nu N^{1/2}\Gamma_6 \stackrel{d}{\sim} \mathcal N\bigl(0, 2\nu^2\beta^{-1}m^6\|\mathbf w\|^4\bigr). \qquad (7.47)$$

Next, (7.43)–(7.47) imply that $\nu N^{1/2}\Gamma_2, \dots, \nu N^{1/2}\Gamma_6$ are tight (as $N$-dependent random variables). Moreover, an easy variance calculation shows that $\nu N^{1/2}\Gamma_1$ is also tight. Therefore we get from (7.35), (7.42), (7.43)–(7.47), Lemma 7.7, and Lemma 7.8 that (recall the notation from Definition 7.11)
$$X \stackrel{d}{\sim} -\nu N^{1/2}m^2\langle\mathbf u, A\mathbf u\rangle + \mathcal N(0, V_1),$$
where
$$V_1 := \frac{2(d+1)}{\beta d^6} + \frac{2\nu^2}{\beta d^4}\bigl(\|\mathbf w\|^4 + 2\|\mathbf u\|^2\|\mathbf w\|^2\bigr) + \frac{4\nu^2}{d^5}Q(\mathbf w, \mathbf u) + \frac{\nu^2}{d^6}R(\mathbf u) + \frac{2\nu^2}{\beta d^6}\bigl(\|\mathbf u\|^4 + 2\|\mathbf u\|^2\|\mathbf w\|^2 + \|\mathbf w\|^4\bigr).$$
Here we used (6.2).

Next, from
$$\langle\mathbf v, H\mathbf v\rangle = \langle\mathbf u, A\mathbf u\rangle + 2\,\mathrm{Re}\langle\mathbf w, B\mathbf u\rangle + \langle\mathbf w, H_0\mathbf w\rangle,$$
the Central Limit Theorem, Lemma 7.10, and Lemma 7.8 we find
$$\nu N^{1/2}\langle\mathbf v, H\mathbf v\rangle \stackrel{d}{\sim} \nu N^{1/2}\langle\mathbf u, A\mathbf u\rangle + \mathcal N\biggl(0, \frac{2\nu^2}{\beta}\bigl(1 - \|\mathbf u\|^4\bigr)\biggr). \qquad (7.48)$$
Moreover, using that the dimension $M$ of $\mathbf u$ satisfies $M \le \varphi^{2D}$ and the fact that $\max_i|w_i| \le \varphi^{-D}$, we find
$$Q(\mathbf w, \mathbf u) = Q(\mathbf v) + O(\varphi^{-D}), \qquad R(\mathbf u) = R(\mathbf v) + O(\varphi^{-2D}).$$
Therefore we get, using Lemma 7.8 and recalling that $1 = \|\mathbf v\|^2 = \|\mathbf u\|^2 + \|\mathbf w\|^2$,
$$X \stackrel{d}{\sim} -\nu N^{1/2}d^{-2}\langle\mathbf v, H\mathbf v\rangle + \mathcal N\biggl(0, \frac{2(d+1)}{\beta d^6} + \frac{4\nu^2}{d^5}Q(\mathbf v) + \frac{\nu^2}{d^6}R(\mathbf v) + \frac{2\nu^2}{\beta d^6}\biggr).$$
This concludes the proof.

7.4. Conclusion of the proof of Theorem 2.14. In this section we compute the distribution of $G_{\mathbf v\mathbf v}(\theta) - m(\theta)$ for a general Wigner matrix $H$, and hence complete the proof of Theorem 2.14. We use the Green function comparison method from the proof of Lemma 3.9.

Let $H = (h_{ij}) = (N^{-1/2}W_{ij})$ be an arbitrary real symmetric / Hermitian Wigner matrix, $V = (N^{-1/2}V_{ij})$ a GOE/GUE matrix independent of $H$, and $\mathbf v \in \mathbb C^N$ be normalized. For $D > 0$ define the subset
$$I_D := \bigl\{i = 1, \dots, N : |v_i| \le \varphi^{-D}\bigr\}.$$
Define a new Wigner matrix $\widehat H = (\widehat h_{ij}) = (N^{-1/2}\widehat W_{ij})$ through
$$\widehat W_{ij} := \begin{cases} V_{ij} & \text{if } i \in I_D \text{ and } j \in I_D,\\ W_{ij} & \text{otherwise.}\end{cases}$$
Thus $\widehat H$ satisfies the assumptions of Proposition 7.12. Let
$$J_D := \bigl\{1 \le i \le j \le N : i \in I_D \text{ and } j \in I_D\bigr\}$$
be the set of matrix indices to be replaced. Similarly to (3.21), we choose a bijective map $\phi: J_D \to \{1, \dots, \gamma_{\max}(D)\}$ and denote by $H^\gamma = (h^\gamma_{ij})$ the matrix defined by
$$h^\gamma_{ij} := \begin{cases} N^{-1/2}W_{ij} & \text{if } \phi(i, j) \le \gamma,\\ N^{-1/2}\widehat W_{ij} & \text{otherwise.}\end{cases}$$
In particular, $H^0 = \widehat H$ and $H^{\gamma_{\max}(D)} = H$. Let now $(a, b) \in J_D$ satisfy $\phi(a, b) = \gamma$. Similarly to (3.22), we write
$$H^{\gamma-1} = Q + N^{-1/2}V, \qquad \text{where } V := V_{ab}E^{(ab)} + \mathbf 1(a \ne b)V_{ba}E^{(ba)},$$
and
$$H^\gamma = Q + N^{-1/2}W, \qquad \text{where } W := W_{ab}E^{(ab)} + \mathbf 1(a \ne b)W_{ba}E^{(ba)}.$$
In order to avoid singular behaviour on exceptional low-probability events, we add a small imaginary part to the spectral parameter $\theta$, and set $z := \theta + \mathrm iN^{-4}$. Abbreviate
$$x := \nu N^{1/2}\,\mathrm{Re}\bigl(G_{\mathbf v\mathbf v}(z) - m(z)\bigr). \qquad (7.49)$$
Thus we have the rough bound $|x| \le N^4$, which we shall tacitly use in the following. We use the notation (3.23), which gives rise to the quantities $x_R$, $x_S$, $x_T$, defined through (7.49) with $G$ replaced by $R$, $S$, $T$ respectively. We may now state the main comparison estimate.

Lemma 7.13. Provided $D$ is a large enough constant, the following holds. Let $f \in C^3(\mathbb R)$ be bounded with bounded derivatives and $q \equiv q_N$ be an arbitrary deterministic real sequence. Then
$$\mathbb E f(x_T + q) = \mathbb E f(x_R + q) + Y_{ab}\,\mathbb E f'(x_R + q) + A_{ab} + O\bigl(\varphi^{-1}\widehat E_{ab}\bigr), \qquad (7.50)$$
$$\mathbb E f(x_S + q) = \mathbb E f(x_R + q) + A_{ab} + O\bigl(\varphi^{-1}\widehat E_{ab}\bigr), \qquad (7.51)$$
where $A_{ab}$ satisfies $|A_{ab}| \le \varphi^{-1}$,
$$Y_{ab} := -\nu N^{-1}\,\mathrm{Re}\bigl(m^4M^{(3)}_{ab}\bar v_av_b + m^4M^{(3)}_{ba}\bar v_bv_a\bigr),$$
and
$$\widehat E_{ab} := \sum_{\sigma,\tau=0}^{2}N^{-2+\sigma/2+\tau/2}|v_a|^\sigma|v_b|^\tau + \mathbf 1(a = b)\sum_{\sigma=0}^{2}N^{-1+\sigma/2}|v_a|^\sigma.$$

Before proving Lemma 7.13, we show how it implies Theorem 2.14.

Proof of Theorem 2.14. Fix $D > 0$ large enough that the conclusion of Lemma 7.13 holds. By Remark 7.5, we may assume that $f \in C_c^\infty(\mathbb R)$. Let $\gamma = \phi(a, b)$. Since $|v_a| \le \varphi^{-D}$ and $|v_b| \le \varphi^{-D}$, we find
$$Y_{ab}^2 \le \varphi^{-1}\widehat E_{ab}. \qquad (7.52)$$
Applying (7.50) and (7.51) with $f$ replaced by $f'$ yields
$$Y_{ab}\,\mathbb E f'(x_T + q) = Y_{ab}\,\mathbb E f'(x_R + q) + Y_{ab}A_{ab} + O\bigl(\varphi^{-1}\widehat E_{ab}\bigr).$$
Subtracting this from (7.50) and using $|A_{ab}| \le \varphi^{-1}$ yields
$$\mathbb E f(x_T + q) = \mathbb E f(x_R + q) + Y_{ab}\,\mathbb E f'(x_T + q) + A_{ab} + O\bigl(\varphi^{-1}\widehat E_{ab} + \varphi^{-1}|Y_{ab}|\bigr).$$
Subtracting (7.51) yields
$$\mathbb E f(x_\gamma + q) = \mathbb E f(x_{\gamma-1} + q) + Y_{ab}\,\mathbb E f'(x_\gamma + q) + O\bigl(\varphi^{-1}\widehat E_{ab} + \varphi^{-1}|Y_{ab}|\bigr),$$
where we introduced the notation $x_\gamma := \nu N^{1/2}\,\mathrm{Re}\bigl(\bigl((H^\gamma - z)^{-1}\bigr)_{\mathbf v\mathbf v} - m(z)\bigr)$. Using (7.52) we therefore get
$$\mathbb E f(x_\gamma + q - Y_{ab}) = \mathbb E f(x_{\gamma-1} + q) + O\bigl(\varphi^{-1}\widehat E_{ab} + \varphi^{-1}|Y_{ab}|\bigr). \qquad (7.53)$$
We now iterate (7.53), starting at $\gamma = 1$ and $q = 0$. Using that $\sum_{a,b}\widehat E_{ab} \le C$ and $\sum_{a,b}|Y_{ab}| \le C$, we find after $\gamma_{\max}$ iterations of (7.53)
$$\mathbb E f\Biggl(x_{\gamma_{\max}(D)} - \sum_{\gamma=1}^{\gamma_{\max}(D)}Y_{\phi^{-1}(\gamma)}\Biggr) = \mathbb E f(x_0) + O(\varphi^{-1}).$$
Moreover, using $|v_a| \le \varphi^{-D}$ and $|v_b| \le \varphi^{-D}$, we find that
$$\sum_{\gamma=1}^{\gamma_{\max}(D)}Y_{\phi^{-1}(\gamma)} = -\nu N^{-1}\,\mathrm{Re}\sum_{a,b\in I_D}\mathbf 1(a \le b)\,m(z)^4\bigl(M^{(3)}_{ab}\bar v_av_b + M^{(3)}_{ba}\bar v_bv_a\bigr) = -\nu N^{-1}\,\mathrm{Re}\sum_{a,b=1}^{N}m(z)^4M^{(3)}_{ab}\bar v_av_b + O(\varphi^{-2D}).$$
Using Lemma 7.8 we find
$$\nu N^{1/2}\bigl(\bigl((H - z)^{-1}\bigr)_{\mathbf v\mathbf v} - m(z)\bigr) \stackrel{d}{\sim} \nu N^{1/2}\bigl(\bigl((\widehat H - z)^{-1}\bigr)_{\mathbf v\mathbf v} - m(z)\bigr) - \nu N^{-1}\,\mathrm{Re}\sum_{a,b=1}^{N}m(z)^4M^{(3)}_{ab}\bar v_av_b.$$
Using Lemma 7.2, it is now easy to remove the imaginary part $N^{-4}$ of $z$ to get
$$\nu N^{1/2}\bigl(\bigl((H - \theta)^{-1}\bigr)_{\mathbf v\mathbf v} + d^{-1}\bigr) \stackrel{d}{\sim} \nu N^{1/2}\bigl(\bigl((\widehat H - \theta)^{-1}\bigr)_{\mathbf v\mathbf v} + d^{-1}\bigr) - \frac{\nu S(\mathbf v)}{d^4}.$$
Since $\widehat H$ satisfies the assumptions of Proposition 7.12, we find
$$\nu N^{1/2}\bigl(\bigl((H - \theta)^{-1}\bigr)_{\mathbf v\mathbf v} + d^{-1}\bigr) \stackrel{d}{\sim} -\frac{\nu N^{1/2}\langle\mathbf v, H\mathbf v\rangle}{d^2} - \frac{\nu S(\mathbf v)}{d^4} + \mathcal N\biggl(0, \frac{2(d+1)}{\beta d^4} + \frac{4\nu^2Q(\mathbf v)}{d^5} + \frac{\nu^2R(\mathbf v)}{d^6}\biggr),$$
using the notation of Definition 7.11. Now Theorem 2.14 follows from Proposition 7.1 and Lemma 7.7.

Proof of Lemma 7.13. As before, we consistently drop the spectral parameter $z = \theta + \mathrm iN^{-4}$ from $G$ and $m$. We focus on (7.50). From Theorem 2.3, (3.29), and (3.28) (with $S$ replaced by $T$), we find
$$|T_{\mathbf va}| \le \varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2} + C|v_a|, \qquad |R_{\mathbf va}| \le \varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2} + C|v_a| + \varphi^{C_\zeta}N^{-1/2}|v_b| \qquad (7.54)$$
with $\zeta$-high probability, and similar results hold for $T_{a\mathbf v}$, $T_{\mathbf vb}$, $T_{b\mathbf v}$, $R_{a\mathbf v}$, $R_{\mathbf vb}$, and $R_{b\mathbf v}$. Similarly, from the first inequality of (3.30) (with $S$ replaced by $T$), we get
$$|T_{\mathbf v\mathbf v} - R_{\mathbf v\mathbf v}| \le \varphi^{C_\zeta}N^{-1/2}\Bigl(N^{-1}(d-1)^{-1} + N^{-1/2}(d-1)^{-1/2}\bigl(|v_a| + |v_b|\bigr) + |v_a||v_b| + N^{-1/2}\bigl(|v_a|^2 + |v_b|^2\bigr)\Bigr)$$
with $\zeta$-high probability. This yields
$$|x_T - x_R| \le \varphi^{\widetilde C_\zeta}\Bigl[N^{-1}(d-1)^{-1/2} + N^{-1/2}\bigl(|v_a| + |v_b|\bigr) + (d-1)^{1/2}|v_a||v_b| + N^{-1/2}(d-1)^{1/2}\bigl(|v_a|^2 + |v_b|^2\bigr)\Bigr] \qquad (7.55)$$
with $\zeta$-high probability for some constant $\widetilde C_\zeta$. Now choose $D \ge \widetilde C_\zeta + 1$. By definition of $J_D$, we have $|v_a| \le \varphi^{-D}$ and $|v_b| \le \varphi^{-D}$. Therefore
$$|x_T - x_R|^3 \le \varphi^{-1}\widehat E_{ab}$$
with $\zeta$-high probability. This yields
$$\mathbb E f(x_T + q) = \mathbb E f(x_R + q) + \mathbb E\bigl(f'(x_R + q)(x_T - x_R)\bigr) + \frac12\,\mathbb E\bigl(f''(x_R + q)(x_T - x_R)^2\bigr) + O\bigl(\varphi^{-1}\widehat E_{ab}\bigr). \qquad (7.56)$$
In order to analyse $x_T - x_R = \nu N^{1/2}\,\mathrm{Re}(T_{\mathbf v\mathbf v} - R_{\mathbf v\mathbf v})$, we write
$$x_T - x_R = y_1 + y_2 + y_3 + y_4,$$
where
$$y_k := \begin{cases} \nu N^{1/2-k/2}\,\mathrm{Re}\bigl((-RW)^kR\bigr)_{\mathbf v\mathbf v} & \text{if } k = 1, 2, 3,\\ \nu N^{-3/2}\,\mathrm{Re}\bigl((-RW)^4T\bigr)_{\mathbf v\mathbf v} & \text{if } k = 4.\end{cases}$$
Using (7.54), it is easy to check that $y_1$ is bounded by the right-hand side of (7.55), and that
$$|y_k| \le \varphi^{C_\zeta}N^{1/2-k/2}\Bigl(N^{-1}(d-1)^{-1/2} + (d-1)^{1/2}\bigl(|v_a|^2 + |v_b|^2\bigr)\Bigr) \qquad (k = 2, 3, 4) \qquad (7.57)$$
with $\zeta$-high probability. In particular,
$$x_T - x_R = y_1 + y_2 + y_3 + O\bigl(\varphi^{-1}\widehat E_{ab}\bigr)$$
with $\zeta$-high probability. Moreover, using $|v_a| \le \varphi^{-D}$, $|v_b| \le \varphi^{-D}$, (7.57) for $k = 2$, and the fact that $y_1$ is bounded by the right-hand side of (7.55), we find that
$$|y_1||y_2| \le \varphi^{-1}\widehat E_{ab}$$
with $\zeta$-high probability, provided $D$ is chosen large enough. Similarly, using (7.57) we find that $|y_k||y_{k'}| \le \varphi^{-1}\widehat E_{ab}$ for $k, k' \ge 2$ for large enough $D$. Thus we conclude from (7.56) that
$$\mathbb E f(x_T + q) = \mathbb E f(x_R + q) + \mathbb E\bigl(f'(x_R + q)y_3\bigr) + A_{ab} + O\bigl(\varphi^{-1}\widehat E_{ab}\bigr),$$
where
$$A_{ab} := \mathbb E\bigl(f'(x_R + q)(y_1 + y_2)\bigr) + \frac12\,\mathbb E\bigl(f''(x_R + q)y_1^2\bigr)$$
depends on the randomness only through $R$ and the first two moments of $W_{ab}$. Moreover, from (7.57) and the fact that $y_1$ is bounded by the right-hand side of (7.55), we conclude that $|A_{ab}| \le \varphi^{-1}$.

What remains is the analysis of the term $\mathbb E\bigl(f'(x_R + q)y_3\bigr)$. We shall prove that
$$\bigl|\mathbb E\bigl(f'(x_R + q)y_3\bigr) - Y_{ab}\,\mathbb E f'(x_R + q)\bigr| \le C\varphi^{-1}\widehat E_{ab}. \qquad (7.58)$$
If $a = b$, it is easy to see from (7.57) and the definition of $Y_{ab}$ that
$$|y_3| + |Y_{ab}| \le \varphi^{-1}\widehat E_{ab},$$
from which (7.58) follows.

Let us therefore assume that $a \ne b$. We multiply out the matrix product in $\bigl((-RW)^3R\bigr)_{\mathbf v\mathbf v}$ and regroup the resulting eight terms according to the number, $r$, of off-diagonal matrix elements ($R_{ab}$ or $R_{ba}$) of $R$. (By convention, the endpoint matrix elements $R_{\mathbf v\cdot}$ and $R_{\cdot\mathbf v}$ are not counted as off-diagonal.) This gives, in self-explanatory notation, $y_3 = \sum_{r=0}^{2}y_{3,r}$. Using Theorem 2.3 and (7.54), we find
$$|y_{3,1}| + |y_{3,2}| \le \varphi^{C_\zeta}N^{-3/2}\bigl(N^{-1/2}(d-1)^{-1/2} + |v_a| + |v_b|\bigr)^2 \le \varphi^{-1}\widehat E_{ab}$$
with $\zeta$-high probability. Therefore it suffices to prove that
$$\bigl|\mathbb E\bigl(f'(x_R + q)y_{3,0}\bigr) - Y_{ab}\,\mathbb E f'(x_R + q)\bigr| \le C\varphi^{-1}\widehat E_{ab} \qquad (7.59)$$
for $a \ne b$. By definition,
$$y_{3,0} = -\nu N^{-1}\,\mathrm{Re}\bigl(R_{\mathbf va}W_{ab}R_{bb}W_{ba}R_{aa}W_{ab}R_{b\mathbf v} + R_{\mathbf vb}W_{ba}R_{aa}W_{ab}R_{bb}W_{ba}R_{a\mathbf v}\bigr).$$
Using (7.54) and Theorem 2.3 we find
$$\Bigl|y_{3,0} + \nu N^{-1}\,\mathrm{Re}\Bigl(m^2|W_{ab}|^2\bigl(R_{\mathbf va}W_{ab}R_{b\mathbf v} + R_{\mathbf vb}W_{ba}R_{a\mathbf v}\bigr)\Bigr)\Bigr| \le \varphi^{-1}\widehat E_{ab}$$
with $\zeta$-high probability. We only deal with the first term of $y_{3,0}$; the second one is dealt with analogously. Recalling the definition of $Y_{ab}$, we conclude that, in order to establish (7.59), it suffices to prove (7.60) with $\zeta$-high probability; here we used that $R$ is independent of $W_{ab}$.

Setting $\mathbf u = (u_i)$ with $u_i := \mathbf 1(i \notin \{a, b\})v_i$ and recalling (3.9) and (3.10), we get (7.61); see (4.20). The second and third terms are estimated using (7.54) and Theorem 2.3:
$$|R_{aa} - m| + |R_{ba}| \le \varphi^{C_\zeta}N^{-1/2}(d-1)^{-1/2} \qquad (7.62)$$
with $\zeta$-high probability. Moreover, since $R^{(a)} = T^{(a)}$, we find from Lemma 3.12, Theorem 2.3, and (3.8) that (7.63) holds with $\zeta$-high probability. A similar estimate holds for $R_{b\mathbf u}$. Using (7.61), (7.62), (7.63), and (7.54) we get (7.64).

What remains is to estimate the right-hand side of (7.64). Defining $x_R^{(a)} := \nu N^{1/2}\,\mathrm{Re}\bigl(R^{(a)}_{\mathbf v\mathbf v} - m\bigr)$, we find from (3.8) and (7.54) that
$$\bigl|x_R - x_R^{(a)}\bigr| \le \varphi^{C_\zeta}\bigl(N^{-1/2}(d-1)^{-1/2} + N^{1/2}(d-1)^{1/2}|v_a|^2 + N^{-1/2}(d-1)^{1/2}|v_b|^2\bigr)$$
with $\zeta$-high probability. Using (7.63) and the fact that the derivative of $f$ is bounded, we may estimate the first term of (7.64); in the second step we used that $x_R^{(a)}$ is independent of the $a$-th column of $Q$ and that $\mathbb E_aR_{\mathbf ua} = 0$. The second term of (7.64) is estimated similarly. In order to estimate the third, we have to make $R_{b\mathbf u}$ independent of the $a$-th column of $Q$. (See the definition (3.22).) Using (3.8), $R^{(b)} = T^{(b)}$, (3.12), and (7.54), we obtain the required estimate with $\zeta$-high probability. Thus we may estimate the third term of (7.64), where in the second step we again used that $\mathbb E_aR_{\mathbf ua} = 0$. This concludes the proof of (7.60), and hence of (7.50).

The proof of (7.51) is almost identical to the proof of (7.50), except that $\mathbb E|V_{ab}|^2V_{ab} = 0$, so that the left-hand side of the analogue of (7.60) vanishes. Note that, by definition, $A_{ab}$ depends only on $R$ and on the first two moments of $W_{ab}$, which coincide with those of $V_{ab}$. Hence $A_{ab}$ is the same in (7.50) and (7.51). This concludes the proof.
