Random band matrices in the delocalized phase, III: Averaging fluctuations

F. Yang

University of Pennsylvania fyang75@wharton.upenn.edu

J. Yin

University of California, Los Angeles jyin@math.ucla.edu

We consider a general class of symmetric or Hermitian random band matrices H = (h_{xy})_{x,y ∈ ⟦1,N⟧^d} in any dimension d ≥ 1, where the entries are independent, centered random variables with variances s_{xy} = E|h_{xy}|². We assume that s_{xy} vanishes if |x − y| exceeds the band width W, and we are interested in the mesoscopic scale 1 ≪ W ≪ N. Define the generalized resolvent of H as G(H, Z) := (H − Z)^{−1}, where Z is a deterministic diagonal matrix with entries Z_{xx} ∈ C_+ for all x. Then we establish a precise high-probability bound on certain averages of polynomials of the resolvent entries. As an application of this fluctuation averaging result, we give a self-contained proof for the delocalization of random band matrices in dimensions d ≥ 2. More precisely, for any fixed d ≥ 2, we prove that the bulk eigenvectors of H are delocalized in a certain averaged sense if N ≤ W^{1+d/2}. This improves the corresponding results in [23] under the assumption N ≤ W^{1+d/(d+1)}, and in [11, 12] under the assumption N ≤ W^{1+d/6}. For 1D random band matrices, our fluctuation averaging result was used in [5, 6] to prove the delocalization conjecture and bulk universality for random band matrices with N ≪ W^{4/3}.

1 Introduction
1.1 Random band matrices
1.2 Averaging fluctuations
1.3 Basic ideas
2 Main results
2.1 The model
2.2 The main results
2.3 Proof of Theorem 2.7
3 Basic tools
4 Proof of Lemma 2.14
5 Graphical tools - Part I
5.1 Definition of Graph - Part 1
5.2 Proof of Lemma 5.1
5.3 Proof of Lemma 4.3
6 Graphical tools - Part II
6.1 Definition of Graph - Part 2
6.2 Proof of Lemma 2.15: step 1
6.3 Proof of Lemma 2.15: step 2
6.4 Proof of Lemma 2.15: step 3
6.5 Proof of Lemma 2.15: step 4
A Proof of (2.34)
B Proof of Lemma 6.15

The work of J. Yin is partially supported by the NSF grant DMS-1552192.

1 Introduction

1.1 Random band matrices. Random band matrices H = (h_{xy})_{x,y∈Γ} model interacting quantum systems on a large finite graph Γ of scale N with random transition amplitudes effective up to a scale of order W ≪ N. More precisely, we consider random band matrix ensembles with centered entries that are independent up to the symmetry condition h_{xy} = \bar h_{yx}. The variance s_{xy} := E|h_{xy}|² typically decays with the distance between x and y on a characteristic length scale W, called the band width of H. For the simplest one-dimensional model with graph Γ = {1, 2, ..., N} and h_{xy} = 0 for |x − y| > W, we have a band matrix in the usual sense: only the matrix entries in a narrow band of width 2W around the diagonal can be nonzero. In particular, if W = N/2 and all the variances are equal, we recover the famous Wigner matrix ensemble, which corresponds to a mean-field model.
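As a concrete illustration (ours, not from the paper), the following minimal numpy sketch samples a 1D real symmetric band matrix whose entries vanish for |x − y| > W; the precise variance normalization used later in the paper is ignored here.

```python
import numpy as np

def band_matrix(N, W, rng):
    """Sample a real symmetric N x N band matrix: independent centered
    entries (up to h_xy = h_yx) with h_xy = 0 whenever |x - y| > W."""
    H = rng.standard_normal((N, N))
    H = (H + H.T) / np.sqrt(2)            # symmetrize: h_xy = h_yx
    dist = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
    H[dist > W] = 0.0                      # impose the band structure
    return H

rng = np.random.default_rng(0)
N, W = 200, 10
H = band_matrix(N, W, rng)
assert np.allclose(H, H.T)                 # symmetry
i, j = np.nonzero(H)
assert np.abs(i - j).max() <= W            # all entries live inside the band
```

For W = N/2 (and equal variances on a torus) this construction degenerates to a mean-field Wigner-type matrix, matching the remark above.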

In this paper, we consider the case where Γ is a d-dimensional torus Z_N^d := {1, 2, ..., N}^d with d ≥ 1, so that the dimension of the matrix is N^d (with an arbitrary ordering of the lattice points). Typically, we take the band width W to be of mesoscopic scale, 1 ≪ W ≪ N. The band structure is imposed by requiring that the variance profile is given by

s_{xy} = \frac{1}{W^d}\, f\left(\frac{x-y}{W}\right), \qquad \text{with} \qquad \sum_x s_{xy} = \sum_y s_{xy} = 1, \quad (1.1)

for some non-negative symmetric function f that decays sufficiently fast at infinity. As W varies, the random band matrices naturally interpolate between two classes of quantum systems: the random Schrödinger operator with short-range transitions, such as the Anderson model [1], and mean-field random matrices, such as Wigner matrices [36]. A basic conjecture about random band matrices is that a sharp Anderson metal-insulator phase transition occurs at some critical band width W_c. More precisely, the eigenvectors of band matrices satisfy a localization-delocalization transition in the bulk of the spectrum [7, 8, 37], with a corresponding sharp transition for the eigenvalue distribution [21]:

• for W ≫ W_c, delocalization of eigenstates (i.e. conductor phase) and Gaussian orthogonal/unitary ensemble (GOE/GUE) spectral statistics hold;

• for W ≪ W_c, localization of eigenstates (i.e. insulator phase) holds and the eigenvalues converge to a Poisson point process.

Based on numerics [7, 8] and nonrigorous supersymmetric calculations [22], the transition is conjectured to occur at W_c ∼ √N in dimension d = 1. In higher dimensions, the critical band width is expected to be W_c ∼ √(log N) in d = 2 and W_c = O(1) in d ≥ 3. For more details about the conjectures, we refer the reader to [27, 34, 35]. The above features make random band matrices particularly attractive from the physical point of view as a model to study large quantum systems of high complexity.

So far, there have been many partial results concerning the localization-delocalization conjecture for band matrices. In d = 1 and for general distributions of the matrix entries, localization of eigenvectors was first proved for W ≪ N^{1/8} [27], and later improved to W ≪ N^{1/7} for band matrices with Gaussian entries [25].

The Green's function was controlled down to the scale Im z ≫ W^{−d} in [15, 18], implying a lower bound of order W for the localization length of all eigenvectors. For 1D random band matrices with generally distributed entries, the weak delocalization of eigenvectors in some averaged sense (see the definition in Theorem 2.7) was proved under W ≫ N^{6/7} in [11], W ≫ N^{4/5} in [14], and W ≫ N^{7/9} in [23]. The strong delocalization and bulk universality for 1D random band matrices were first rigorously proved in [4] for W = Ω(N), which was recently improved to W ≫ N^{3/4} in [6]. We mention also that at the edge of the spectrum, the transition of the eigenvalue statistics for 1D band matrices at the critical band width W_c ∼ N^{5/6} was understood in [33], thanks to the method of moments. For a special class of random band matrices, whose entries are Gaussian with some specific covariance profile, some powerful supersymmetry techniques can be used (see [10, 34] for overviews). With this method, precise estimates on the density of states [9] were first obtained for d = 3. Then random matrix local spectral statistics were proved for W = Ω(N) [31], and delocalization was obtained for all eigenvectors when W ≫ N^{6/7} and the first four moments of the matrix entries match the Gaussian ones [2] (these results assume complex entries and hold in part of the bulk). Moreover, a transition at the critical band width W_c ∼ N^{1/2} was proved in [28–30, 32], concerning the second-order correlation function of bulk eigenvalues.

The purpose of this paper is two-fold. First, we shall prove a precise fluctuation averaging estimate on certain averages of polynomials of the resolvent entries, which is used in [5] to establish the local law for the generalized resolvent of 1D random band matrices H. Second, using the fluctuation averaging estimate, we shall give a self-contained proof for the weak delocalization of random band matrices in dimensions d ≥ 2 under the assumption W ≫ N^{2/(d+2)}. This kind of weak delocalization of bulk eigenvectors was proved under the assumptions W ≫ N^{6/(d+6)} in [11, 12], W ≫ N^{(d+2)/(2d+2)} in [14], and W ≫ N^{(d+1)/(2d+1)} in [23]. (In [14], the authors claimed that they can prove the weak delocalization under the condition W ≫ N^{4/(d+4)}, which turns out to be wrong, as pointed out in [23].) One can see that our result strictly improves these previous results. Moreover, the exponent becomes closer to the conjectured sharp one (W_c = O(N^0) for d ≥ 3) as d increases. We remark that our proof can also be applied to 1D band matrices and gives weak delocalization of the eigenvectors under W ≫ N^{3/4}; however, it is strictly weaker than the result in [6], where the strong delocalization of the bulk eigenvectors was proved under the same assumption.

1.2 Averaging fluctuations. In this subsection, we explain why we need a good fluctuation averaging bound. Then, in Section 1.3, we briefly describe the main new ideas used in the proof of this bound. For the following discussion, we focus on the weak delocalization of bulk eigenvectors of random band matrices in dimensions d ≥ 2. For the application to 1D random band matrices in [5, 6], we shall use a slightly more general model, which will be defined in Section 2.1.

Our basic tool is the resolvent (Green’s function) defined as

G(z) = (H − z)^{−1}, \qquad z ∈ C_+ := \{z ∈ C : \operatorname{Im} z > 0\}. \quad (1.2)

The Green's function was shown in [15, 18] to satisfy, for any fixed ε > 0,

\max_{x,y} |G_{xy}(z) − m(z) δ_{xy}| ≤ \frac{W^ε}{\sqrt{W^d η}}, \qquad z = E + iη, \quad (1.3)

with high probability for all η ≫ W^{−d} (see Theorem 2.18), where m is the Stieltjes transform of Wigner's semicircle law

m(z) := \frac{−z + \sqrt{z^2 − 4}}{2} = \frac{1}{2π} \int_{−2}^{2} \frac{\sqrt{4 − ξ^2}}{ξ − z}\, dξ, \qquad z ∈ C_+. \quad (1.4)

The bound (1.3) already implies a lower bound of order W for the localization length, but it is not strong enough to give delocalization on any scale larger than W. The bound we need is that for any scale W ≪ l ≤ N,

\max_y\; η \sum_{x: |x−y| ≤ l} |G_{xy}(z)|^2 = o(1) \quad \text{with high probability} \quad (1.5)

for some W^{−d} ≪ η ≪ 1. To get an improvement over the estimate (1.3), as in [14, 23], we introduce the so-called T-matrix, whose entries

T_{xy} := \sum_w s_{xw} |G_{wy}|^2, \qquad s_{xw} := E|h_{xw}|^2, \quad (1.6)

are local averages of |G_{xy}|^2. The importance of T lies in the following facts:

(i) by a self-consistent equation estimate (see Lemma 2.19), we can bound

\max_{x,y} |G_{xy} − m δ_{xy}|^2 ≤ N^ε \max_{x,y} T_{xy} \quad (1.7)

with high probability for any small constant ε > 0;

(ii) for any scale l ≥ W, we can bound

\sum_{x: |x−y| ≤ l} |G_{xy}|^2 ≤ \sum_{x: |x−y| ≤ l + O(W)} T_{xy}. \quad (1.8)


The key tool to estimate the T-matrix is a self-bounded equation for T (see (2.18)):

T_{xy} = (1 + o(1)) \left(\frac{|m|^2 S}{1 − |m|^2 S}\right)_{xy} + \sum_{w ≠ y} \left(\frac{|m|^2 S}{1 − |m|^2 S}\right)_{xw} \left(|G_{wy}|^2 − |m|^2 T_{wy}\right), \quad (1.9)

where S = (s_{xy}) is the matrix of variances. One observation of [14] is that the behavior of T is essentially given by

Θ := \frac{|m|^2 S}{1 − |m|^2 S} = \sum_{k=1}^{∞} |m|^{2k} S^k, \quad (1.10)

and the second term in (1.9) can be regarded as an error under proper assumptions on W and η. By the translation invariance of S in (1.1), S^k can be understood through a k-step random walk \sum_{i=1}^{k} X_i on the torus {1, 2, ..., N}^d with single-step distribution P(X_1 = y − x) = s_{xy}. Also, with

|m(z)|^2 = 1 − αη + O(η^2), \qquad z = E + iη, \qquad α := \frac{4}{\sqrt{4 − E^2}},

we only need to keep the terms with k = O(η^{−1}) in (1.10). Then there is a natural threshold at η_c = W^2/N^2. For k ≪ η_c^{−1}, the k-step random walk is almost a random walk on the free space Z^d without boundary, and behaves diffusively by the CLT. This consideration gives the following diffusion approximation of T (see Appendix A):

T_{xy}(z) ≈ Θ_{xy}(z) ∼ \frac{1}{W^d + W^2 |x−y|^{d−2}}, \qquad \text{for } η = \operatorname{Im} z ≥ \frac{W^2}{N^2}. \quad (1.11)
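The geometric series (1.10) can be probed numerically. The sketch below (our own illustration, with d = 1 and an indicator-type profile f, not taken from the paper) builds a doubly stochastic band profile S on the torus and computes Θ = |m|²S(1 − |m|²S)^{−1}; since S is doubly stochastic, the row sums of Θ are exactly |m|²/(1 − |m|²), which is of order (αη)^{−1} for small η.

```python
import numpy as np

def m_sc(z):
    """Stieltjes transform of the semicircle law: the root of
    m^2 + z m + 1 = 0 with Im m > 0 (cf. (1.4))."""
    s = np.sqrt(z * z - 4)
    m = (-z + s) / 2
    return m if m.imag > 0 else (-z - s) / 2

N, W = 120, 6
z = 0.5 + 0.05j                            # z = E + i eta in the bulk

# Doubly stochastic band profile on the 1D torus, f = indicator as in (1.1)
x = np.arange(N)
d = np.abs(np.subtract.outer(x, x))
dist = np.minimum(d, N - d)                # torus distance
S = (dist <= W).astype(float) / (2 * W + 1)

a = abs(m_sc(z)) ** 2                      # |m(z)|^2 < 1 for eta > 0
Theta = a * S @ np.linalg.inv(np.eye(N) - a * S)

assert a < 1
# Row sums of Theta equal |m|^2 / (1 - |m|^2) since S is doubly stochastic
assert np.allclose(Theta.sum(axis=1), a / (1 - a))
assert np.all(Theta >= -1e-12)             # Theta is a series of nonnegative matrices
```

This only checks the summable mass of Θ, not the pointwise diffusive profile (1.11), which requires the CLT analysis of Appendix A.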

The main part of the proof is to estimate the error term in (1.9). It turns out that for x ≠ y, T_{xy} ≈ |m|^2 E_x |G_{xy}|^2, where E_x is the partial expectation with respect to the x-th row and column of H. Hence the error term is approximately a sum over fluctuations: \sum_w Θ_{xw} Q_w |G_{wy}|^2, where Q_w := 1 − E_w. The main difficulty is that |G_{xy}|^2 and |G_{x'y}|^2 for x ≠ x' are not independent; in fact, they are strongly correlated for η ≪ 1. Estimating the high moments of these sums requires an unwrapping of the hierarchical correlation structure among many resolvent entries. In [14] and [23], the authors adopted different strategies. For the proof in [14], a so-called fluctuation averaging mechanism from [13] was used. It relies on intricate resolvent expansions to explore the cancellation mechanism in sums of monomials of G entries. In [23], however, the authors performed a careful analysis of the error term in Fourier space, where certain cumulant expansions are used. In this paper, we will follow the line of [13, 14] and prove a much finer fluctuation averaging estimate, as we outline below.

Let b_x be any sequence of deterministic coefficients of order O(1). Suppose we have some initial (rough) estimates on the G entries: for some constant δ > 0 and deterministic parameters Φ and Γ, we have

\max_{x,y} |G_{xy} − δ_{xy} m| ≤ Φ, \qquad \max_y \sum_{x ∈ Z_N^d} \left(|G_{xy}|^2 + |G_{yx}|^2\right) ≤ Γ^2, \qquad W^{−d/2} ≤ Φ ≤ N^{−δ}, \quad Γ ≥ 1, \quad (1.12)

with high probability. The state-of-the-art fluctuation averaging estimate was proved in [13]:

\left|\sum_{x: x ≠ y} b_x \left(|G_{xy}|^2 − |m|^2 T_{xy}\right)\right| ≤ N^ε \left(N^{d/2} Φ^2 + N^d Φ^4\right), \qquad η ≫ W^{−d}, \quad (1.13)

with high probability for any constant ε > 0. In this paper, we claim that a much stronger estimate holds: with high probability,

\left|\sum_{x: x ≠ y} b_x \left(|G_{xy}|^2 − |m|^2 T_{xy}\right)\right| ≤ N^ε \left(1 + Γ^2 Φ^2\right), \qquad η ≫ W^{−d}. \quad (1.14)

Recall that Γ^2 ∼ η^{−1} by the following Ward identity for the resolvent entries:

\sum_x |G_{xy}|^2 = \sum_x |G_{yx}|^2 = \frac{\operatorname{Im} G_{yy}}{η}, \qquad z = E + iη, \quad (1.15)

which can be proved using the spectral decomposition of G. Together with the initial input Φ = W^ε (W^d η)^{−1/2} from (1.3), it is clear that (1.14) is better than (1.13) by a factor of W^d/N^d. For 1D random band matrices, this gain of a W/N factor is the most essential reason why one can reduce the condition W ≫ N^{4/5} in [14] to the condition W ≫ N^{3/4} in [5, 6]. For d ≥ 2, (1.14) allows us to establish the weak delocalization of random band matrices under the assumption W ≫ N^{2/(d+2)}, as we now explain.
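Ward's identity (1.15) is an exact consequence of the spectral decomposition of G, and can be checked numerically; a quick sketch (ours, with a Wigner-type matrix standing in for H):

```python
import numpy as np

rng = np.random.default_rng(1)
N, eta = 300, 0.1
H = rng.standard_normal((N, N))
H = (H + H.T) / np.sqrt(2 * N)             # a Wigner-type real symmetric matrix
z = 0.3 + 1j * eta
G = np.linalg.inv(H - z * np.eye(N))       # resolvent G(z) = (H - z)^{-1}

y = 7
lhs = np.sum(np.abs(G[:, y]) ** 2)         # sum_x |G_xy|^2
rhs = G[y, y].imag / eta                   # Im G_yy / eta
assert np.isclose(lhs, rhs)                # Ward's identity (1.15), exact
```

The identity holds for every realization of H and every η > 0, which is why Γ² ∼ η^{−1} is available as a deterministic input.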

With \max_{x,y} Θ_{xy} = O(W^{−d}) by (1.11), we obtain from (1.14) and (1.7) that with high probability,

\max_{x,y} |G_{xy} − m δ_{xy}|^2 ≤ N^{ε/2} \max_{x,y} T_{xy} ≤ \frac{N^ε}{W^d} \left(1 + η^{−1} Φ^2\right). \quad (1.16)

If we assume W ≫ N^{2/(d+2)}, then W^d η ≫ 1 for η = W^2/N^2, and the above bootstrapping estimate gives the improved estimate

\max_{x,y} |G_{xy} − m δ_{xy}|^2 ≤ \frac{N^ε}{W^d} \quad \text{with high probability.} \quad (1.17)
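The exponent count behind this bootstrap is elementary (our own unpacking, not spelled out in the text): for η = W²/N², each application of (1.16) contracts the fluctuation part of Φ² by the factor

```latex
\frac{N^{\varepsilon}}{W^{d}}\,\eta^{-1}
\;=\; \frac{N^{2+\varepsilon}}{W^{d+2}}
\;\ll\; 1
\quad\Longleftrightarrow\quad
W \;\gg\; N^{\frac{2+\varepsilon}{d+2}},
```

so iterating (1.16) drives Φ² down to the fixed point N^ε/W^d, which is exactly (1.17).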

It seems that this estimate is still not good enough to conclude (1.5). However, using (1.9) and (1.11), we can obtain that for some sequence of deterministic coefficients b_w of order O(1),

η \sum_{x: |x−y| ≤ l} T_{xy} = η \frac{l^2}{W^2} \left(O(1) + \sum_w b_w Q_w |G_{wy}|^2\right) ≤ \frac{N^ε l^2}{W^2} \left(η + Φ^2\right) = o(1),

if we take η = W^2/N^2 and Φ^2 = N^ε W^{−d} by (1.17). This gives (1.5) by (1.8), which implies delocalization on any scale l ≪ N.


1.3 Basic ideas. In this subsection, we discuss the basic ideas for the proof of (1.14). All the results and discussions in this paper apply to both real and complex band matrices. For simplicity of notation, we consider only the real symmetric case.

Our starting point is the same as the one in [13]: we try to bound the high moments of the left-hand side of (1.14): for any small constant ε > 0,

E \left|\sum_{x: x ≠ y} b_x \left(|G_{xy}|^2 − |m|^2 T_{xy}\right)\right|^p ≤ N^ε \left(Γ^2 Φ^2 + 1\right)^p, \qquad p ∈ 2N.

Then a standard application of Markov's inequality turns the high-moment estimates into a high-probability bound. Keeping track of all the correlations among a large product of resolvent entries with different indices is rather involved. For this purpose, a convenient graphical tool was developed in [13] to organize the calculation, where the indices are the vertices of the graphs and the G entries are represented by edges between vertices. In this paper, we shall extend the arguments in [13] and develop a graphical representation with more structural details. Also, as in [13], estimating the high moments requires an unwrapping of the hierarchical correlation structure among several resolvent entries, which will be performed using resolvent expansions in Lemma 5.4 and Lemma 6.1, such as

G_{xy} = G^{(α)}_{xy} + \frac{G_{xα} G_{αy}}{G_{αα}}, \quad α ∉ \{x, y\}, \qquad \text{or} \qquad G_{xy} = −G_{xx} \sum_α h_{xα} G^{(x)}_{αy}, \quad x ≠ y. \quad (1.18)

Here, for any a ∈ Z_N^d, G^{(a)} denotes the resolvent of the (N^d − 1) × (N^d − 1) minor of H obtained by removing the a-th row and column. The resolvent expansions are represented by graph expansions, i.e. expanding a graph into a linear combination of several new graphs. For example, applying the first expansion in (1.18) to the G_{xy} edge in a graph gives two new graphs, where one of them has the G_{xy} edge replaced by the two edges G_{xα} and G_{αy}. For the second expansion in (1.18), we will create a new vertex α in the graph, which is in the W-neighborhood of x.
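Both expansions in (1.18) are deterministic linear-algebra identities about minors of (H − z)^{−1} (Schur complement formulas); the first one can be verified numerically as follows (a sketch of ours, with a small Wigner-type matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 50
H = rng.standard_normal((N, N))
H = (H + H.T) / np.sqrt(2 * N)
z = 0.2 + 0.1j
G = np.linalg.inv(H - z * np.eye(N))

a = 0                                       # the removed index alpha
keep = np.arange(1, N)                      # minor: alpha-th row/column removed
Ga = np.linalg.inv(H[np.ix_(keep, keep)] - z * np.eye(N - 1))

x, y = 3, 8                                 # indices in the original matrix
xm, ym = x - 1, y - 1                       # their positions in the minor
# First identity in (1.18): G_xy = G^{(a)}_xy + G_xa G_ay / G_aa
assert np.isclose(G[x, y], Ga[xm, ym] + G[x, a] * G[a, y] / G[a, a])
```

In the graphical language, the term G_{xα}G_{αy}/G_{αα} is exactly the replacement of one edge by a two-edge path through α, which is why path connectivity is preserved.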

We rename y as the star index ? on the left-hand side of (1.14). Then we rewrite the sum as

\sum_{x: x ≠ ?} b_x \left(|G_{x?}|^2 − |m|^2 T_{x?}\right) = \sum_{x: x ≠ ?} b_x \left(E_x |G_{x?}|^2 − |m|^2 T_{x?}\right) + \sum_{x: x ≠ ?} b_x Q_x |G_{x?}|^2. \quad (1.19)

In the rest of this section, we focus on the second sum and briefly discuss the basic idea of the proof of the following moment estimate: for any small constant ε > 0 and fixed p ∈ 2N,

E \left|\sum_{x: x ≠ ?} b_x Q_x |G_{x?}|^2\right|^p ≤ N^ε \left(Γ^2 Φ^2 + 1\right)^p. \quad (1.20)


The moments of the first sum on the right-hand side of (1.19) are in fact easier to bound. We remark that all the graphs used in this paper are rooted graphs, with the root representing the ? index.

Comparing (1.13) and (1.14), one can notice that we essentially replace the N^d Φ^2 factor with the Γ^2 factor in (1.14). The origin of these two factors is as follows. In the high-moment calculation, terms like

\sum_x c_x G_{xy_1} G_{xy_2} \quad \text{or} \quad \sum_x c_x G_{xy_1} \overline{G_{xy_2}}, \qquad c_x = O(1), \quad (1.21)

will appear in the expressions. The authors of [13] bounded them by N^d Φ^2, which is not good enough when we consider band matrices (although it is sharp for mean-field random matrices with W = N). Instead, we shall use the better estimate

\sum_x |G_{xy_1} G_{xy_2}| \lesssim Γ^2 \quad (1.22)

by the second estimate in (1.12) and the Cauchy-Schwarz inequality. This is the very origin of the Γ^2 factor in (1.20). In the rest of this subsection, we discuss the main difficulties and our new ideas to resolve them. In particular, the graphical tool plays an essential role in our approach.
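The estimate (1.22) is nothing but Cauchy-Schwarz combined with the definition of Γ in (1.12); a numerical sanity check (ours):

```python
import numpy as np

rng = np.random.default_rng(3)
N, eta = 200, 0.1
H = rng.standard_normal((N, N))
H = (H + H.T) / np.sqrt(2 * N)
G = np.linalg.inv(H - (0.1 + 1j * eta) * np.eye(N))

# Gamma^2 from (1.12): max_y sum_x (|G_xy|^2 + |G_yx|^2)
Gamma2 = np.max(np.sum(np.abs(G) ** 2 + np.abs(G.T) ** 2, axis=0))

y1, y2 = 4, 90
lhs = np.sum(np.abs(G[:, y1] * G[:, y2]))  # sum_x |G_{x y1} G_{x y2}|
# Cauchy-Schwarz: lhs <= (sum|G_{x y1}|^2)^{1/2} (sum|G_{x y2}|^2)^{1/2} <= Gamma^2
assert lhs <= Gamma2 + 1e-12
```

Note that the bound holds for every pair (y_1, y_2), which is what allows the nested summation arguments in Section 1.3.1 below.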

1.3.1 The nested property


In order to apply the bound Γ^2 to expressions as in (1.22), the order of summation is important. For example, using (1.22), we can bound the following sum as

\sum_{x_1, x_2, x_3} \left|G_{x_1 ?} G_{x_1 x_2} G_{x_1 x_3} G_{x_2 ?}^2 G_{x_2 x_3}\right| = \sum_{x_2} |G_{x_2 ?}|^2 \sum_{x_1} |G_{x_1 ?} G_{x_1 x_2}| \sum_{x_3} |G_{x_1 x_3} G_{x_2 x_3}| \lesssim Γ^2 \sum_{x_2} |G_{x_2 ?}|^2 \sum_{x_1} |G_{x_1 ?} G_{x_1 x_2}| \lesssim Γ^4 \sum_{x_2} |G_{x_2 ?}|^2 \lesssim Γ^6. \quad (1.23)

However, in some cases, we may not be able to find a summation order that yields enough Γ factors. For example, the following sum is also an average of a product of 6 resolvent entries, but we can only get

\sum_{x_1, x_2, x_3} \left|G_{x_1 ?} G_{x_1 x_2} G_{x_1 x_3} G_{x_2 ?} G_{x_2 x_3} G_{x_3 ?}\right| = \sum_{x_3} |G_{x_3 ?}| \sum_{x_2} |G_{x_2 ?} G_{x_2 x_3}| \sum_{x_1} |G_{x_1 ?} G_{x_1 x_2} G_{x_1 x_3}| \lesssim Γ^2 Φ \sum_{x_3} |G_{x_3 ?}| \sum_{x_2} |G_{x_2 ?} G_{x_2 x_3}| \lesssim Γ^4 Φ \sum_{x_3} |G_{x_3 ?}| \lesssim Γ^5 \left(N^{d/2} Φ\right), \quad (1.24)

using (1.22) and (1.12), where one Γ factor is replaced by an N^{d/2} Φ factor. (Note that we get the same bound if we sum over x_2 or x_3 first.) This example shows that, in general, we are not guaranteed to get enough Γ factors in the high-moment estimate if the indices of some expression do not satisfy the following well-nested property. Given an average of a certain product of resolvent entries over free indices x_1, ..., x_p, we shall say that these indices are well-nested if there exists a partial order ? ≺ x_{i_1} ≺ ··· ≺ x_{i_p} such that for each 1 ≤ k ≤ p, there exist at least two resolvent entries with pairs of indices (x_{i_k}, x_{α_k}) and (x_{i_k}, x_{β_k}) such that x_{α_k}, x_{β_k} ≺ x_{i_k}. Note that if the indices are well-nested, then one can sum according to the order x_{i_p} → ··· → x_{i_1} to get a Γ^{2p} factor. In our proof, we always start with expressions with well-nested indices. However, after several resolvent expansions, such an expression will be written as a linear combination of much more complicated averages of monomials of resolvent entries. It is often very hard to check that the indices in the new expressions are also well-nested. This is one of the main difficulties in our proof.

To resolve the above difficulty, we look for a property that guarantees well-nested summation indices and, at the same time, is robust under resolvent expansions. In the graphical language, the well-nested property of indices translates into a structural property of the graphs, which we shall call the ordered nested property. Suppose we want to estimate the p-th moment in (1.20). After some (necessary) resolvent expansions, we will have graphs containing vertices {x_1, ..., x_p, ?}. Roughly speaking, a graph G has the ordered nested property if its vertices {x_1, ..., x_p, ?} can be partially ordered in a way

? ≺ x_{i_1} ≺ x_{i_2} ≺ ··· ≺ x_{i_p} \quad (1.25)

such that each vertex x_{i_k}, 1 ≤ k ≤ p, has at least two edges connecting it to the preceding atoms (here we say a precedes b if a ≺ b). For example, the left graph in Fig. 1, corresponding to (1.23), has the ordered nested property, while the right graph in Fig. 1, corresponding to (1.24), does not.

(7)

x

1

x

3

STAR

x

2

x

1

x

3

STAR

x

2

Figure 1: The left graph represents (1.23) and satisfies the ordered nested property with the order?≺x2

x1≺x3. The right graph represents (1.24) and does not satisfy the ordered nested property. {fig1pdf}
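The ordered nested property can be checked by brute force on small examples. The sketch below (ours) encodes the two graphs of Fig. 1 as edge multisets ("STAR" plays the role of ?; the squared entry |G_{x_2 ?}|² in (1.23) contributes two parallel edges) and searches over all orders (1.25):

```python
from itertools import permutations

def ordered_nested(vertices, edges):
    """Check whether some order STAR < x_{i_1} < ... < x_{i_p} exists such
    that every vertex has >= 2 edges (with multiplicity) to preceding ones."""
    for order in permutations(vertices):
        placed = {"STAR"}
        ok = True
        for v in order:
            back = sum(1 for e in edges if v in e and (set(e) - {v}) <= placed)
            if back < 2:
                ok = False
                break
            placed.add(v)
        if ok:
            return True
    return False

# Left graph of Fig. 1, i.e. (1.23): |G_{x2 STAR}|^2 gives two parallel edges
left = [("x1", "STAR"), ("x1", "x2"), ("x1", "x3"),
        ("x2", "STAR"), ("x2", "STAR"), ("x2", "x3")]
# Right graph of Fig. 1, i.e. (1.24): six distinct entries, one edge each
right = [("x1", "STAR"), ("x1", "x2"), ("x1", "x3"),
         ("x2", "STAR"), ("x2", "x3"), ("x3", "STAR")]

verts = ["x1", "x2", "x3"]
assert ordered_nested(verts, left)
assert not ordered_nested(verts, right)
```

The search confirms the caption of Fig. 1: the left graph admits the order ? ≺ x_2 ≺ x_1 ≺ x_3, while in the right graph no vertex has two edges to ?, so no valid first vertex, hence no valid order, exists.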

Suppose a graph satisfies the ordered nested property with (1.25). Then one can sum over the vertices according to the order \sum_{x_{i_1}} \sum_{x_{i_2}} ··· \sum_{x_{i_p}}. If the graph contains 2p + s edges, then 2p of them will be used in the above sum to give a Γ^{2p} factor, while the remaining s edges will be bounded by Φ^s. However, the ordered nested property is hard to track under graph expansions, especially because the order of the vertices can change completely after each expansion. Fortunately, we find that the ordered nested property is implied by a stronger but more trackable structural property of graphs, which we shall call the independently path-connected (IPC) nested property. A graph G with vertices {x_1, ..., x_p, ?} is said to satisfy the IPC nested property (or to have the IPC nested structure) if for each vertex there are at least 2 separate paths connecting it to ?, and the edges used in these 2p paths are all distinct. One can show with the pigeonhole principle that a graph with the IPC nested structure always satisfies the ordered nested property. For example, the graphs in Fig. 1 do not satisfy the IPC nested property. On the other hand, the graphs in Fig. 2 have IPC nested structures, and one can see that the vertices can be ordered as ? ≺ x_2 ≺ x_1.

Figure 2: The left graph represents |G_{x_1 ?}|^2 |G_{x_2 ?}|^2. We apply the first resolvent expansion in (1.18) to G_{x_1 ?} and draw one of the new graphs on the right, where the G_{x_1 ?} edge is replaced by the two edges G_{x_1 x_2} and G_{x_2 ?}, which still constitute a path from x_1 to ?. Here we omitted the G^{−1}_{x_2 x_2} factor in the second graph.

In the proof, we always start with graphs with IPC nested structures. The main reason we introduce this stronger concept is that, compared with the ordered nested property, it is much easier to check that the IPC nested property is preserved under resolvent expansions. Here, preservation means that if the original graph has the IPC nested structure, then all the new graphs appearing in the resolvent expansions also have IPC nested structures. This follows from a simple observation: in resolvent expansions, we always replace an edge between two vertices, say α and β, with a path between the same two vertices α and β. In particular, the path connectivity from any vertex to the ? vertex is unchanged. Hence we are almost guaranteed to have the IPC nested property (which implies the ordered nested property) at each step of our proof. However, we need to be very careful during the proof, since graph operations other than resolvent expansions may break the IPC nested structure, and this brings many technical difficulties to our proof, as we will see in Section 1.3.3.

1.3.2 Two-level structures


In estimating the p-th moment in (1.20), the initial graph will contain p free indices, say {x_1, ..., x_p}. However, in some resolvent expansions, we will add new vertices to the new graphs, such as the new vertex α in the second expansion in (1.18). Moreover, these new indices lie within the W-neighborhoods of the free indices. Thus, in general, we shall bound averages of products of the form

\prod_{i=1}^{p} G_{α_1^{(i)} β_1^{(i)}} ··· G_{α_{k_i}^{(i)} β_{k_i}^{(i)}}, \qquad \max_{1 ≤ k ≤ k_i} \left|α_k^{(i)} − x_i\right| = O(W), \quad 1 ≤ i ≤ p,

up to the choice of the charges of the resolvent entries. (Here the charge of a resolvent entry indicates whether it is a G factor or a \bar G factor.) Unfortunately, the introduction of new indices breaks the connected paths from the free vertices to the ? vertex. Hence we lose the IPC nested property of the free vertices {x_1, ..., x_p}, which, as we discussed above, helps us get enough Γ factors.

To handle this problem, we introduce the random variables (Ψ_{xy})_{x,y ∈ Z_N^d}; see Definition 3.4. They are roughly defined as the local L^2-averages of the G entries with indices within W-neighborhoods of (x, y):

|Ψ_{xy}|^2 := \sum_{\max\{|x'−x|, |y'−y|\} ≤ N^τ W} \frac{1}{W^{2d}} \left(|G_{x'y'}|^2 + |G_{y'x'}|^2\right),

for a small constant τ > 0. It is easy to see that under (1.12),

|Ψ_{xy}| \lesssim N^{Cτ} Φ, \qquad \sum_x |Ψ_{xy}|^2 \lesssim N^{Cτ} Γ^2, \quad (1.26)

for some constant C > 0 depending only on d,

with high probability. The importance of the Ψ variables is that they provide local uniform bounds on the off-diagonal G entries: for any free vertices x_i and x_j,

\max_{\max\{|α − x_i|, |β − x_j|\} ≤ (\log N)^C W} \mathbf{1}(α ≠ β)\, |G_{αβ}| ≤ Ψ_{x_i x_j} \quad (1.27)

with high probability. This follows from a standard large deviation estimate; see the proof of (3.15). It motivates us to organize the graphs according to certain subclasses of vertices. More specifically, we shall call the indices atoms, where the ? index is called the ? atom and the free indices {x_1, ..., x_p} are called free atoms. We then group each free atom x_i and the atoms within its W-neighborhood into a subclass called a molecule, denoted by [x_i]. (More precisely, an atom α belongs to the molecule [x_i] only if α can only take values subject to the condition |α − x_i| ≤ N^τ W. Note that even if an atom β is not in the molecule [x_i], some of its values can still lie in the W-neighborhood of x_i.) Here we are using the words "atom" and "molecule" in a figurative way. We now have a two-level structure for a given graph, that is, the structure on the atomic level and the one on the molecular level (i.e., on the graph where each molecule is regarded as one vertex). We have the following simple observations:

• although the graphs can keep expanding with new atoms added in, the graphs on the molecular level always remain simple, with the ? atom and the p molecules [x_i], i = 1, ..., p;

• by (1.27), all the off-diagonal edges with one end in molecule [x_i] and the other end in molecule [x_j] can be bounded by the same Ψ_{x_i x_j} variable;

• the path connectivity from any molecule to the ? vertex on the molecular level is preserved under resolvent expansions (since in each expansion, we replace an edge between atoms, say α and β, with a new path between two atoms in the same molecules as α and β).

These facts, together with (1.26), make the molecular graphs and the Ψ variables particularly suitable for defining the IPC nested property. That is, for a general graph, we say that it satisfies the IPC nested property if the molecular graph with vertices [x_i], i = 1, ..., p, has this property. For this reason, we shall say that the IPC nested structure is an inter-molecule structure. For example, the molecular graph in Fig. 3 satisfies the IPC nested property. Now, following the arguments in Section 1.3.1, as long as we keep the IPC nested structure of the molecular graphs, we can bound the inter-molecule edges by Ψ variables, sum over the free indices according to the nested order, and apply the second bound in (1.26) to get the desired factor Γ^{2p} in the p-th moment estimate.

Given the above definition, it is easy to check that the IPC nested property of the molecular graphs is preserved under resolvent expansions. Moreover, this "two-level structure" point of view will also facilitate the rest of our proof. In fact, besides the Γ^2 factors from the IPC nested structure, we still need to extract enough Φ factors. Roughly speaking, we will adopt the idea in [13], which has led to the two extra Φ factors in (1.13) besides the factor N^d Φ^2. The approach in [13] allows one to divide the graph into smaller subgraphs and bound each part separately. This is possible because only the total number of off-diagonal edges (i.e. the Φ factors) in the graph matters there. But the same approach cannot be applied to our proof, because we need to maintain the IPC nested structure of the graph as a whole. As a result, some manipulations of the graphs in [13] that can destroy the IPC nested structure are not allowed. Instead, we shall organize our proof according to the two-level structure: the inter-molecule structure and the inner-molecule structures (i.e. the subgraphs inside the molecules). In the proof, the inter-molecule structure is only allowed to be changed through resolvent expansions, since we need to keep the IPC nested property. We will show that the inter-molecule structures of the graphs only provide a Γ^2 Φ factor in (1.14). The rest of the Φ factors will come from graph operations which may change the inner-molecule structures but preserve the IPC nested structures. This will be discussed in detail in the next subsection.

Figure 3: Given a graph with two free atoms x_1, x_2 and an atom β in the W-neighborhood of x_2, we perform the second resolvent expansion in (1.18) on the edge G_{x_2 ?} and get the middle graph, where we create a new atom α in the W-neighborhood of x_2. We group x_2, α and β into a single molecule [x_2] := {x_2, α, β}, i.e. the part inside the pentagon. The middle graph has a two-level structure: we draw the molecular graph with molecules [x_1] and [x_2] on the top, and the structure inside the molecule [x_2] (i.e. the inner-molecule structure) on the bottom. Again we have omitted some details in the graphs, such as the G_{xx} and H factors.

1.3.3 The role of the Q_x's


In this subsection, we discuss the basic idea used to obtain the Φ^2 factor. So far in the discussion, we have ignored the Q_x's in (1.20). In fact, to bound the left-hand side of (1.20), we need to estimate averages of the following form:

E \sum_{\mathbf{x}} c_{\mathbf{x}}\, \mathcal{G}(\mathbf{x}), \qquad \mathbf{x} := (x_1, ..., x_p), \quad c_{\mathbf{x}} = O(1), \quad \mathcal{G}(\mathbf{x}) := \prod_{i=1}^{p} Q_{x_i}(\mathcal{G}_{x_i}), \quad (1.28)

where \mathcal{G}_{x_i} denotes the part of the expression obtained from the resolvent expansions of |G_{x_i ?}|^2. We will use colors to represent the Q_x's in graphs, i.e. we associate with all the components in \mathcal{G}_{x_i} a color called "Q_{x_i}". To avoid ambiguity in the graphical expressions, we require that every component of the graph has a unique color, in the sense that every component belongs to at most one Q_x group. In Fig. 4, we give an example of a colorful graph.

The idea of using averaging over Q_x(·) terms to get an extra Φ factor is central in [20] and was subsequently used in the proofs of fluctuation averaging results in many other works, e.g. [15, 17, 19, 26]. In these papers, the authors studied the specific quantity \sum_x b_x Q_x(G_{xx}^{−1}), but we can apply the same idea to \sum_{x ≠ ?} b_x Q_x |G_{x?}|^2. Roughly speaking, we can write the expectation of the product in \mathcal{G}(\mathbf{x}) as

E\left[\mathcal{A}_i\, Q_{x_i}(\mathcal{G}_{x_i})\right] = E\left[\left(\mathcal{A}_i − \mathcal{A}_i^{(x_i)}\right) Q_{x_i}(\mathcal{G}_{x_i})\right],

where Ai is the expression outside Qxi, A(xi i) is any expression that is independent of the xi-th row and column of H, and we have used ExiQxi(·) = 0 for the equality. It turns out that if Ai does not contain

(10)

x

1

x

2

STAR

Qk2 Qk1

x

1

x

2

STAR

Figure 4: We color the graphs in Fig. 2 with colorsQx1 (red) andQx2(green). The left graph now represents Qx1(|Gx1?|2)Qx2(|Gx2?|2), and the right graph represents Qx1(Gx1x2Gx2?G−1x

2x2G(xx12?))Qx2 |Gx2?|2 , where we did not draw theG−1x

2x2 factor. {fig4pdf}

thexi index, then it is weakly correlated with thexi-th row and column ofH, and we can choseA(xi i)such that the typical size of (Ai− A(xi i)) is smaller than Ai by a Φ factor. If Ai contains the xi atom, then it already contains sufficiently many off-diagonal edges, i.e. Φ factors, as we need. We can perform the above operations to all the free indicesxi, 16i6p, and obtain an extra Φp factor. As an example, forp= 2 and x16=x2, we can use the first resolvent expansion in (1.18) to write

$$\mathbb{E}\, Q_{x_1}\big(|G_{x_1\star}|^2\big)\, Q_{x_2}\big(|G_{x_2\star}|^2\big) \tag{1.29}$$
$$= \mathbb{E}\, Q_{x_1}\bigg[\Big(G_{x_1\star}^{(x_2)} + \frac{G_{x_1 x_2} G_{x_2 \star}}{G_{x_2 x_2}}\Big)\overline{\Big(G_{x_1\star}^{(x_2)} + \frac{G_{x_1 x_2} G_{x_2 \star}}{G_{x_2 x_2}}\Big)}\bigg]\, Q_{x_2}\big(|G_{x_2\star}|^2\big). \tag{1.30}$$

Thus for $\mathcal{A}_2 = Q_{x_1}\big(|G_{x_1\star}|^2\big)$, we can choose $\mathcal{A}_2^{(x_2)} = Q_{x_1}\big(|G_{x_1\star}^{(x_2)}|^2\big)$ so that $(\mathcal{A}_2 - \mathcal{A}_2^{(x_2)})$ contains at least one more off-diagonal edge of order $\Phi$ (see the right graph of Fig. 4). In the actual proof, instead of using the free indices, we will use the concept of free molecules, but the main ideas are the same.
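The expansion displayed in (1.30) is an instance of the standard resolvent minor (Schur complement) identity $G_{ij} = G^{(k)}_{ij} + G_{ik} G_{kj} / G_{kk}$ for $i, j \ne k$, where $G^{(k)}$ is the resolvent of $H$ with the $k$-th row and column removed. A minimal numerical check of this identity, using a small Wigner-type test matrix of our own choosing rather than the band model of this paper:

```python
import numpy as np

# Numerical check of the standard resolvent minor identity
#   G_ij = G^{(k)}_ij + G_ik * G_kj / G_kk,   for i, j != k,
# where G^{(k)} is the resolvent of H with the k-th row and column removed.
rng = np.random.default_rng(2)
n, k, z = 20, 7, 0.2 + 0.3j

A = rng.standard_normal((n, n))
H = (A + A.T) / np.sqrt(2 * n)                 # small symmetric test matrix
G = np.linalg.inv(H - z * np.eye(n))           # resolvent G = (H - z)^{-1}

keep = [i for i in range(n) if i != k]
Gk = np.linalg.inv(H[np.ix_(keep, keep)] - z * np.eye(n - 1))  # minor resolvent

for a, i in enumerate(keep):
    for b, j in enumerate(keep):
        assert abs(G[i, j] - (Gk[a, b] + G[i, k] * G[k, j] / G[k, k])) < 1e-10
```

Iterating this identity is what produces the $G^{(x_2)}_{x_1\star}$ and $G_{x_1 x_2} G_{x_2 \star} / G_{x_2 x_2}$ terms above.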

The origin of the second $\Phi$ factor is more subtle, and it was first identified in [13]. Roughly speaking, it comes from averages of the following form in (1.28):

$$\sum_{\alpha} b_\alpha\, G_{\alpha\beta_1} G_{\alpha\beta_2}, \qquad b_\alpha = \mathrm{O}(W^{-d})\, \mathbf{1}(|\alpha - x_i| \lesssim W), \tag{1.31}$$

where $\beta_{1,2}$ are atoms outside the molecule $[x_i]$. The key observation is that $G_{\alpha\beta_1} G_{\alpha\beta_2}$ satisfies the self-consistent equation

$$G_{\alpha\beta_1} G_{\alpha\beta_2} = \sum_{\gamma} (1 - m^2 S)^{-1}_{\alpha\gamma}\big[Q_\gamma(G_{\gamma\beta_1} G_{\gamma\beta_2}) + \mathcal{E}_\gamma\big], \tag{1.32}$$

where $\mathcal{E}_\gamma$ denotes the error term for each $\gamma$, which is smaller than the main term by a $\Phi$ factor. For the main terms, we get an average of the form

$$\sum_{\gamma} c'_\gamma\, Q_\gamma(G_{\gamma\beta_1} G_{\gamma\beta_2}), \qquad c'_\gamma = \mathrm{O}(W^{-d})\, \mathbf{1}(|\gamma - x_i| \lesssim W), \tag{1.33}$$

which leads to another $\Phi$ factor by the argument in the previous paragraph. The above argument works for the following reasons:

(1) the entries $(1 - m^2 S)^{-1}_{\alpha\gamma}$ are negligible for $|\alpha - \gamma| > (\log N)^2 W$ (see (2.15)), so we can include $\gamma$ into the molecule $[x_i]$ such that the IPC nested structure of the graph is unchanged after replacing (1.31) with (1.33);

(2) replacing (1.31) with the $\mathcal{E}_\gamma$ part also preserves the IPC nested structure;

(3) each free molecule $[x_i]$ contains at least one atom $\alpha$ that is connected with two edges of the form (1.31).

Here (1) and (2) ensure the IPC nested structure of the new graphs, and (3) shows that we can get enough $\Phi$ factors from the free molecules. However, we still have the following technical issues, which make the above argument the trickiest part of our proof.


(i) We always start with a colorful graph. However, for the above arguments to work, the two edges $G_{\alpha\beta_1} G_{\alpha\beta_2}$ need to be colorless. Thus we first need to remove all the colors (i.e. the $Q_x$'s) from the graphs, i.e. we write a colorful graph as a linear combination of colorless graphs.

(ii) The atom $\alpha$ connected with the two edges $G_{\alpha\beta_1} G_{\alpha\beta_2}$ may also be connected with other edges. Thus we need to perform some operations to get a new graph which contains a (possibly different) atom $\alpha'$ that is connected with only two edges $G_{\alpha'\beta_1} G_{\alpha'\beta_2}$ and is in the same molecule as $\alpha$. We shall call such an atom a simple charged atom.

(iii) The simple charged atoms in different molecules may share edges. Hence we have to handle them one by one, not as a whole. Moreover, each time we apply the previous argument from (1.31) to (1.33), we need to repeat the processes in (i) and (ii) again.

It is worth mentioning that the operations in (i) and (ii), although they can be very complicated, are easily checked to preserve the IPC nested structures of the graphs.

Finally, we remark that the above argument cannot be applied to terms of the form $G_{\alpha\beta_1} \overline{G_{\alpha\beta_2}}$: the matrix $(1 - m^2 S)^{-1}$ in (1.32) is well-behaved due to the nonzero imaginary part of $m$, while the matrix $(1 - |m|^2 S)^{-1}$ arising in the case of $G_{\alpha\beta_1} \overline{G_{\alpha\beta_2}}$ is not, since $|m| = 1 - \mathrm{O}(\eta)$.

1.3.4 Summary of the proof

Following the above discussions, our main proof consists of the following four steps.

Step 0: Develop a graphical tool which extends the previous ones used in e.g. [13, 15]. This is the content of Section 5.1 and Section 6.1.

Step 1: Starting with the graphs in the high moment calculation, we perform graph expansions, identify the IPC nested structures, and obtain the first $\Phi$ factor. This is the content of Sections 5.2-5.3 and Section 6.2. This step, although it contains the main new ideas of this paper as discussed in Sections 1.3.1 and 1.3.2, is actually the relatively easier part of our proof.

Step 2: Remove the colors as discussed in the above item (i). This is the content of Section 6.3.

Step 3: Create simple charged atoms as discussed in the above item (ii). This is the content of Section 6.4.

Step 4: Deal with simple charged atoms using (1.32). This is the content of Section 6.5.

The rest of this paper is organized as follows. In Section 2, we introduce our model and present the main result, Theorem 2.11, whose proof is mainly based on two fluctuation averaging lemmas: Lemma 2.14 (for the first term on the right-hand side of (1.19)) and Lemma 2.15 (for the second term on the right-hand side of (1.19)). Then, using Theorem 2.11, we prove the weak delocalization of random band matrices in dimensions $d \ge 2$ in Section 2.2.1. In Section 3, we introduce the notations and collect some tools that will be used in the proofs of Lemma 2.14 and Lemma 2.15. In Section 4, we reduce Lemma 2.14 to another fluctuation averaging lemma, Lemma 4.3, which has a similar form to Lemma 2.15. Sections 5 and 6 carry out the above Steps 0-4 of the main proof.

Conventions. The fundamental large parameter isN and we regardW as a parameter depending onN. All quantities that are not explicitly constant may depend onN, and we usually omitN from our notations.

We useCto denote a generic large positive constant, which may depend on fixed parameters and whose value may change from one line to the next. Similarly, we usec,εorδto denote a generic small positive constant.

If a constant depends on a quantity $a$, we use $C_a$ or $c_a$ to indicate this dependence. Also, in the lemmas and theorems of this paper, we often use the notations $\tau, D$ when we want to state that the conclusions hold for any fixed small constant $\tau > 0$ and large constant $D > 0$. For two quantities $A_N$ and $B_N > 0$ depending on $N$, we use the notations $A_N = \mathrm{O}(B_N)$ and $A_N \sim B_N$ to mean $|A_N| \le C B_N$ and $C^{-1} B_N \le |A_N| \le C B_N$, respectively, for some constant $C > 0$. We use $A_N = \mathrm{o}(B_N)$ to mean $|A_N| \le c_N B_N$ for some positive sequence $c_N \downarrow 0$ as $N \to \infty$. For any matrix $A$, we use the notations

$$\|A\| := \|A\|_{l^2 \to l^2}, \qquad \|A\|_{\max} := \max_{i,j} |A_{ij}|, \qquad \|A\|_{\min} := \min_{i,j} |A_{ij}|.$$

In particular, for a vector $v$, we shall also use the notation $\|v\|_\infty \equiv \|v\|_{\max}$.

Acknowledgements. The second author would like to thank Benedek Valkó and L. Fu for fruitful discussions and valuable suggestions.


2 Main results


2.1 The model. All the results in this paper apply to both real symmetric and complex Hermitian random band matrices. For definiteness of notation, we only consider the real symmetric case. We always assume that $N, W$ are integers satisfying

$$N^{c} \le W \le N \tag{2.1}$$

for some constant $c > 0$. Moreover, all the statements in this paper only hold for sufficiently large $N$, and we will not repeat this everywhere.

We define the $d$-dimensional discrete torus

$$\mathbb{Z}_N^d := \big(\mathbb{Z} \cap (-N/2, N/2]\big)^d.$$

The $d$-dimensional random band matrix is indexed by the lattice points, where we fix an arbitrary ordering of $\mathbb{Z}_N^d$. For any $x \in \mathbb{Z}^d$, we always identify it with its canonical representative

$$[x]_N := (x + N\mathbb{Z}^d) \cap \mathbb{Z}_N^d. \tag{2.2}$$

Moreover, for simplicity, we will always use the $l^\infty$ norm on the $\mathbb{Z}_N^d$ lattice:

$$|x - y| \equiv |[x - y]_N| := \max_{1 \le i \le d} |[x_i - y_i]_N|, \qquad x, y \in \mathbb{Z}_N^d. \tag{2.3}$$

Keeping the application of our results to [5, 6] in mind, we shall use a slightly more general model than the one in the introduction. Let $H = (H_{xy})$ be an $N^d \times N^d$ real symmetric random matrix with centered entries that are independent up to the symmetry constraint. We assume that the variances $\mathbb{E}|H_{xy}|^2 = s_{xy}$ satisfy

$$\frac{c_s}{W^d} \cdot \mathbf{1}_{|x-y| \le c_s W} \le s_{xy} \le \frac{C_s}{W^d} \cdot \mathbf{1}_{|x-y| \le C_s W}, \qquad x, y \in \mathbb{Z}_N^d, \tag{2.4}$$

for some constants $c_s, C_s > 0$. Then $H$ is a random band matrix with band width of order $W$. Moreover, up to a rescaling, we assume that

$$1 - \zeta \le \sum_{y \in \mathbb{Z}_N^d} s_{xy} \le 1 + \zeta, \qquad x \in \mathbb{Z}_N^d, \tag{2.5}$$

for some $\zeta \in [0, 1)$.
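As a concrete illustration, the following sketch constructs such a band matrix in the simplest case: $d = 1$, Gaussian entries, and the flat profile $s_{xy} = (2W+1)^{-1}\,\mathbf{1}_{|x-y| \le W}$, which satisfies (2.4)-(2.5) with $\zeta = 0$. The helper names `band_variance_profile` and `sample_band_matrix` are ours, not the paper's:

```python
import numpy as np

def band_variance_profile(N, W):
    """Flat profile s_{xy} = 1_{|x-y| <= W} / (2W+1) on the torus Z_N (d = 1).

    One concrete choice satisfying (2.4)-(2.5) with zeta = 0: each row sums to 1.
    """
    x = np.arange(N)
    dist = np.abs(x[:, None] - x[None, :])
    dist = np.minimum(dist, N - dist)          # periodic distance, cf. (2.3)
    return (dist <= W) / (2 * W + 1)

def sample_band_matrix(N, W, rng):
    """Sample a real symmetric H with E H_{xy} = 0 and E H_{xy}^2 = s_{xy}."""
    S = band_variance_profile(N, W)
    U = np.triu(rng.standard_normal((N, N)) * np.sqrt(S), k=1)   # strict upper part
    diag = rng.standard_normal(N) * np.sqrt(np.diag(S))
    return U + U.T + np.diag(diag)

H = sample_band_matrix(64, 4, np.random.default_rng(0))
```

By construction $H$ is symmetric, vanishes outside the band $|x - y| \le W$, and its variance profile has row sums exactly $1$.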


Assumption 2.1 (Band matrix $H_N$). Let $H \equiv H_N$ be an $N^d \times N^d$ real symmetric random matrix whose entries $(H_{xy} : x, y \in \mathbb{Z}_N^d)$ are independent random variables satisfying

$$\mathbb{E} H_{xy} = 0, \quad \mathbb{E}|H_{xy}|^2 = s_{xy}, \qquad x, y \in \mathbb{Z}_N^d, \tag{2.6}$$

where the variances $s_{xy}$ satisfy (2.4) and (2.5). Then we say that $H$ is a random band matrix with (typical) band width $W \equiv W_N$. Moreover, we define the $N^d \times N^d$ symmetric matrix of variances $S \equiv S_\zeta := (s_{xy})_{x, y \in \mathbb{Z}_N^d}$. We assume that the random variables $H_{xy}$ have arbitrarily high moments, in the sense that for any fixed $p \in \mathbb{N}$, there is a constant $\mu_p > 0$ such that

$$\big(\mathbb{E}|H_{xy}|^p\big)^{1/p} \le \mu_p\, s_{xy}^{1/2}, \qquad x, y \in \mathbb{Z}_N^d, \tag{2.7}$$

for all $N$. Our results in this paper will depend on the parameters $C_s$ and $\mu_p$, but we will not track the dependence on these parameters in the proof.

An important type of band matrices satisfying the above assumptions is the periodic random band matrices studied in e.g. [13–15], where the variances are given by (1.1).

Assumption 2.2 (Periodic band matrix $H_N$). We say that $H \equiv H_N$ is a periodic random band matrix with band width $W \equiv W_N$ if it satisfies Assumption 2.1 and (1.1).

Again with the applications in [5, 6] in mind, we state our results for the following generalized resolvent (or generalized Green's function) of $H$.


Definition 2.3 (Generalized resolvent). Given a sequence of spectral parameters $z_x \in \mathbb{C}_+$, $x \in \mathbb{Z}_N^d$, we define the generalized resolvent (or generalized Green's function) $G(H, Z)$ as

$$G(H, Z) = (H - Z)^{-1}, \qquad Z_{xy} = \delta_{xy} z_x. \tag{2.8}$$

If $z_x = z$ for all $x$, then we recover the normal Green's function $G(z)$ as in (1.2). The key point of the generalized resolvent is the freedom to choose different $z_x$. In particular, the following choice of $Z$ is used in [5, 6] for 1D random band matrices:

$$z_x = z \cdot \mathbf{1}_{1 \le x \le W} + \tilde{z} \cdot \mathbf{1}_{x > W}$$

for some $z, \tilde{z} \in \mathbb{C}_+$ with $\operatorname{Im} \tilde{z} < \operatorname{Im} z$.
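A minimal numerical sketch of Definition 2.3 in a toy setup of our own (dimension $d = 1$, Gaussian band entries, and 0-based indices, so the condition $1 \le x \le W$ becomes `x < W` below):

```python
import numpy as np

rng = np.random.default_rng(1)
N, W = 32, 4
x = np.arange(N)
dist = np.minimum(np.abs(x[:, None] - x[None, :]), N - np.abs(x[:, None] - x[None, :]))
A = rng.standard_normal((N, N)) * (dist <= W) / np.sqrt(2 * W + 1)
H = (A + A.T) / np.sqrt(2)                    # a symmetric band matrix, band width W

# The 1D choice of Z from the display above: z_x = z on the first W indices,
# z~ elsewhere, with Im z~ < Im z (0-based indexing here).
z, z_tilde = 0.3 + 0.5j, 0.3 + 0.1j
Z = np.diag(np.where(x < W, z, z_tilde))

G = np.linalg.inv(H - Z)                      # generalized resolvent (2.8)
assert np.allclose((H - Z) @ G, np.eye(N))    # defining property of the resolvent
```

Since $H - Z$ is complex symmetric (not Hermitian), its inverse $G$ is also symmetric, and each diagonal spectral parameter can be chosen independently.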

For $z_x$'s with fixed imaginary parts, one can show that $(G_{xx}(H, Z))_{x \in \mathbb{Z}_N^d}$ asymptotically satisfies the following system of self-consistent equations for $(M_x)_{x \in \mathbb{Z}_N^d} \equiv (M_x(S, Z))_{x \in \mathbb{Z}_N^d}$:

$$M_x^{-1} = -z_x - \sum_{y \in \mathbb{Z}_N^d} s_{xy} M_y. \tag{2.9}$$

If $\zeta$ is small and the $z_x$'s are close to some $z \in \mathbb{C}_+$, then the above equations are perturbations of the self-consistent equation for $m(z)$ defined in (1.4):

$$m^{-1}(z) = -z - m(z).$$

In particular, the following Lemma 2.4 shows that the solution $(M_x)_{x \in \mathbb{Z}_N^d}$ exists and is unique as long as $\zeta$ and $\max_x |z_x - z|$ are small enough. It was proved in Lemma 1.3 of [5].

Lemma 2.4. Suppose $z \in \mathbb{C}_+$ satisfies $|\operatorname{Re} z| \le 2 - \kappa$ and $|z| \le \kappa^{-1}$ for some (small) constant $\kappa > 0$. Then there exist constants $c_0, C_0 > 0$ such that the following statements hold.

• (Existence) If

$$\zeta + \max_x |z_x - z| \le c_0, \tag{2.10}$$

then there exists $(M_x(S, Z))_{x \in \mathbb{Z}_N^d}$ that solves (2.9) and satisfies

$$\max_x |M_x(S, Z) - m(z)| \le C_0 \Big(\zeta + \max_x |z_x - z|\Big). \tag{2.11}$$

• (Uniqueness) For parameters satisfying (2.10), the solution $(M_x(S, Z))_{x \in \mathbb{Z}_N^d}$ is unique under the condition

$$\max_x |M_x(S, Z) - m(z)| \le c_0. \tag{2.12}$$

In the rest of this paper, we always assume that (2.10) holds for a sufficiently small constant $c_0 > 0$. In particular, for $z \in \mathbb{C}_+$ with $|\operatorname{Re} z| \le 2 - \kappa$ and $|z| \le \kappa^{-1}$, we have $\operatorname{Im} m(z) \ge c$ for some constant $c > 0$ depending on $\kappa$. Thus we can choose $c_0$ small enough such that

$$\operatorname{Im} M_x(S, Z) \ge c/2, \qquad x \in \mathbb{Z}_N^d. \tag{2.13}$$

Let $M \equiv M(S, Z)$ denote the diagonal matrix with entries $M_{xy} := M_x \delta_{xy}$. With (2.13), we obtain the following lemma, which was proved as Lemma 2.7 in [5].

Lemma 2.5. Suppose $z \in \mathbb{C}_+$ satisfies $|\operatorname{Re} z| \le 2 - \kappa$ and $|z| \le \kappa^{-1}$ for some (small) constant $\kappa > 0$. Suppose (2.10) and (2.11) hold for some small enough constant $c_0 > 0$. Then there exist constants $c_1, C_1 > 0$ such that

$$\big\|(1 - M^2 S)^{-1}\big\|_{l^\infty \to l^\infty} < C_1, \tag{2.14}$$

and

$$\Big|\big[(1 - M^2 S)^{-1}\big]_{xy} - \delta_{xy}\Big| \le \begin{cases} C_1 W^{-d}, & \text{if } |x - y| \le (\log N)^2 W, \\ N^{-c_1 \log N}, & \text{if } |x - y| > (\log N)^2 W. \end{cases} \tag{2.15}$$
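A hedged numerical illustration of the decay (2.15) in a 1D toy case of our own: taking all $z_x = z$, the solution of (2.9) is $M_x = m(z)$ for every $x$, so $1 - M^2 S$ reduces to $1 - m(z)^2 S$; the flat band profile below is an assumption, not the paper's general setup.

```python
import numpy as np

N, W = 200, 5
x = np.arange(N)
dist = np.minimum(np.abs(x[:, None] - x[None, :]), N - np.abs(x[:, None] - x[None, :]))
S = (dist <= W) / (2 * W + 1)                 # flat band profile; row sums are 1

z = 1j                                        # a bulk point: Re z = 0, |z| = 1
m = (-z + np.sqrt(z**2 - 4)) / 2              # solves m^{-1} = -z - m, with Im m > 0
assert m.imag > 0 and abs(m) < 1              # so the Neumann series for the inverse converges

T = np.linalg.inv(np.eye(N) - m**2 * S)       # the matrix (1 - m^2 S)^{-1}

near = abs(T[0, 1])                           # |x - y| = 1: of size O(W^{-d}), here d = 1
far = abs(T[0, N // 2])                       # |x - y| = N/2 >> W: exponentially small
assert far < 1e-6 < near
```

The mechanism is the one behind point (1) in Section 1.3.3: since $\|m^2 S\|_{l^\infty \to l^\infty} = |m|^2 < 1$ and $S$ has band width $W$, the entry $[(1 - m^2 S)^{-1}]_{xy}$ only receives contributions from terms $(m^2 S)^k$ with $k \ge |x - y| / W$, each of size at most $|m|^{2k}$.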
