HAL Id: hal-01337957
https://hal.archives-ouvertes.fr/hal-01337957
Preprint submitted on 27 Jun 2016
An Optimal Schwarz Preconditioner for a Class of Parallel Adaptive Finite Elements
Sébastien Loisel, Hieu Nguyen
To cite this version:
Sébastien Loisel, Hieu Nguyen. An Optimal Schwarz Preconditioner for a Class of Parallel Adaptive Finite Elements. 2016. ⟨hal-01337957⟩
Sébastien Loisel$^{a}$, Hieu Nguyen$^{a,1,*}$
$^{a}$Department of Mathematics, Heriot-Watt University, Riccarton, Edinburgh, EH14 4AS, United Kingdom
Abstract
A Schwarz-type preconditioner is formulated for a class of parallel adaptive finite elements where the local meshes cover the whole domain. With this preconditioner, the convergence rate of Krylov methods is shown to depend only on the ratio of the second largest and smallest eigenvalues of the preconditioned system. These eigenvalues can be bounded independently of the mesh sizes and the number of subdomains, which proves the proposed preconditioner is optimal. Numerical results are provided to support the theoretical findings.
Keywords: Domain decomposition, preconditioner, Bank-Holst paradigm, two-grid discretizations, parallel adaptivity
2010 MSC: 65N55, 65N22, 65F08
1. Introduction
The adaptive finite element method (AFEM) has been a very popular method for solving partial differential equations in science and engineering [2]. AFEM automatically refines or coarsens meshes to adapt to the computed solutions, thus offering great reliability, robustness and efficiency. Recently, there has been a great demand to use AFEM on parallel distributed supercomputers with many processors to tackle large-scale problems. To improve the scalability of AFEM on supercomputers, it is usually combined with a domain decomposition method (DDM). In DDM, the domain is partitioned into a number of subdomains and smaller problems on these subdomains are solved in parallel to determine the overall solution [30, 34].
Combining AFEM with DDM, however, introduces challenges that are not present in the traditional version of AFEM. One notable challenge is that AFEM builds its meshes gradually, and global or near-neighbour information is usually needed. This information can be approximate solutions, error estimates on intermediate meshes, or mesh information used in adaptive meshing procedures. Since communication costs are high on distributed supercomputers, one wants to communicate as little as possible. This can be achieved when each processor has a mesh of the whole domain and its adaptive enrichment is performed almost independently of those of the other processors. In general, the adaptive enrichment on each processor focuses mainly on its own subdomain. Consequently, after the adaptive enrichment phase, each processor has a composite mesh of the whole domain, which is fine in its subdomain and much coarser elsewhere. The final global mesh is the union of the refined submeshes provided by the processors. Figure 1 shows an example of the meshes before and after adaptive enrichment, and the final global mesh.
The idea of using local meshes of the whole domain was first introduced by Mitchell for a parallel multigrid method [28]. It was then developed further into parallel adaptive algorithms. Notable ones include the Bank-Holst algorithm [10, 11] and the local and parallel algorithms based on two-grid discretizations [38, 39, 24]. Several variants of these algorithms are studied in [12, 8, 36, 19, 40]. The two algorithms and their variants have been demonstrated to work well for many problems in both science and engineering [10, 29, 5, 6, 11, 3, 4, 36, 15, 33, 17, 32].
Figure 1: A coarse mesh with its partition (left), a local mesh on a processor after adaptive enrichment (middle), and the global fine mesh (right).
∗Corresponding author
Email addresses: [email protected] (Sébastien Loisel), [email protected] (Hieu Nguyen)
1 Current address: CIMNE - Centre Internacional de Metodes Numerics en Enginyeria, Universitat Politecnica de Catalunya, Barcelona, Spain
Different components contribute to their success. For discussions on how to obtain a suitable partition, where each subdomain contributes roughly the same amount of error, we refer to [10, 11]. For how to regularise the local meshes to make the global fine mesh conforming, we refer the reader to [16]. In this paper, we focus on solving the final global linear system. There is no restriction on the type of solver that can be used. However, it would be ideal if the solver could take advantage of the special formulation of the algorithms. In [14], Bank and Lu developed a dedicated domain decomposition solver for the Bank-Holst algorithm. The solver is empirically shown to be robust and efficient for many problems [11, 14, 15, 17]. However, its theoretical convergence can only be fully analysed for a special case where the global interface system is completely present on all processors [18]. For this to happen, all elements attached to the interface, including ones that are far away from the considered subdomain, are required to be refined to the same level as the corresponding elements in the global fine mesh. In addition, the global iteration matrix of the solver is not symmetric, even if all of the local matrices are symmetric. Consequently, conjugate gradient acceleration cannot be used.
In this paper, we propose a novel Additive Schwarz (AS) preconditioner that can be combined with Krylov methods, such as CG, to efficiently solve the global linear system in these parallel adaptive algorithms. Our preconditioner is formulated using the local meshes after adaptive enrichment. We recall that these are meshes of the whole domain. They are fine and identical to the global fine mesh in their corresponding subdomains, but generally much coarser elsewhere. If the adaptive meshes are nested, all the finite element spaces associated with the local meshes contain the coarse space associated with the starting coarse mesh. Therefore, there is no need to explicitly add a coarse space as in the traditional two-level AS. However, having the coarse space contained in every subspace introduces the number of subdomains as the largest eigenvalue, which might damage the scalability of the preconditioner. Fortunately, we can show that this largest eigenvalue is isolated and the convergence rate of the CG method can be bounded by a quantity that depends only on the ratio of the second largest eigenvalue and the smallest eigenvalue. This ratio is called the effective condition number. Our main theoretical results lie in the analysis of these eigenvalues.
The estimate for the second largest eigenvalue is obtained by establishing a comparison to the largest eigenvalue in a related AS method. Our estimate takes advantage of the strengthened Cauchy-Schwarz inequality for the hierarchical decomposition of local subspaces into a low frequency component and a high frequency component. For estimating the smallest eigenvalue, we follow the subspace correction framework proposed by Xu [37] and prove the existence of a stable decomposition associated with the local meshes. Since these meshes are generally very different from one another and from the global fine mesh outside of their associated subdomains, the classical analysis of the AS method (cf. [21, 34]) does not apply. Our analysis requires new sophisticated interpolation operators based on the work of Scott and Zhang [31]. These operators are defined in conjunction with a colouring scheme in order to construct the stable decomposition recursively.
In case exact solvers are employed on all local subspaces, our analysis of the eigenvalues shows that the effective condition number of the preconditioned system does not depend on the coarse mesh size $H$, the fine mesh size $h$ and the number of subdomains $N$; thus our method is optimal. Roughly speaking, the proposed method performs comparably to a traditional AS method with an extremely thick overlap ($\delta \approx H$). With proper programming, it delivers a superior rate of convergence while demanding about the same amount of computation as traditional AS methods with a small overlap.
In some aspects, our result is related to the work of Bank et al. [13]. However, our preconditioner is very different, as we use local subspaces associated with meshes of the whole domain and there is no explicit coarse component.
The rest of this paper is organised as follows. We first state the model problem and introduce key notation in section 2. The formulation of the preconditioner is presented in section 3. The analysis of the convergence of the CG method applied to the preconditioned system, as well as the estimates for the second largest and smallest eigenvalues, are carried out in section 4. In section 5, we present some numerical experiments to verify our theoretical results.
2. Preliminaries
For simplicity of exposition, we confine our discussion to Poisson's equation with a homogeneous Dirichlet condition:
\[
-\Delta u(x) = f(x) \ \text{in } \Omega, \qquad u(x) = 0 \ \text{on } \partial\Omega. \tag{1}
\]
Here $\Omega$ is a bounded domain with polygonal boundary in $\mathbb{R}^d$, $d = 2, 3$.
Let $\{\Omega_i\}_{i=1}^N$ be the subdomains in the partition of $\Omega$. We assume that this is a non-overlapping partition, namely $\bar\Omega = \cup_{i=1}^N \bar\Omega_i$ and $\Omega_i \cap \Omega_j = \emptyset$ if $i \neq j$.
In this study, we will use several finite element meshes. The mesh $\mathcal{T}_H$ of size $H$ is the shape regular and conforming coarse mesh provided to each processor at the beginning. We further assume that each $\Omega_i$ is a union of elements in $\mathcal{T}_H$. The meshes $\mathcal{T}_i$, $1 \le i \le N$, are the local meshes on each processor at the end of the adaptive enrichment phase. They are meshes of the whole domain which are fine, with elements of size $h \ll H$, within $\Omega_i$, but coarser and largely coincident with $\mathcal{T}_H$ elsewhere. The mesh $\mathcal{T}_i$ is required to be conforming inside $\bar\Omega_i$. However, it can have hanging nodes outside of $\bar\Omega_i$. In addition, we assume that the meshes $\mathcal{T}_i$ are aligned along their fine interfaces, namely if $\Omega_i$ and $\Omega_j$ are neighbouring subdomains then $\mathcal{T}_i$ and $\mathcal{T}_j$ match along the part of the interface shared by $\Omega_i$ and $\Omega_j$.
Denote by $\mathcal{T}_h$ the union of the meshes $\mathcal{T}_i$ restricted to $\bar\Omega_i$: $\mathcal{T}_h = \cup_{i=1}^N (\mathcal{T}_i|_{\bar\Omega_i})$. This mesh is the globally refined, shape regular and conforming mesh of size $h$ of $\Omega$. We assume that the following nesting property holds:
\[
\mathcal{T}_H \subset \mathcal{T}_i \subset \mathcal{T}_h, \quad \text{for } 1 \le i \le N.
\]
Now, we extend each $\Omega_i$ to a larger region $\Omega_i^\dagger$ so that all elements of $\mathcal{T}_i$ that are outside of $\Omega_i^\dagger$ belong to $\mathcal{T}_H$ (i.e. there is no refinement in $\mathcal{T}_i$ outside of $\Omega_i^\dagger$). We also require that $\partial\Omega_i^\dagger$ does not cut through any elements in $\mathcal{T}_h$ or any elements in $\mathcal{T}_i$. The extension can be obtained by repeatedly adding to $\Omega_i$ layers of elements in $\mathcal{T}_i$. Since the adaptive meshing on processor $i$ mainly focuses on the inside of the subdomain $\Omega_i$, we can assume that only a few layers of elements in $\mathcal{T}_H$ outside of $\Omega_i$ get refined in creating $\mathcal{T}_i$. More specifically, we assume that the width of the region $\Omega_i^\dagger \setminus \Omega_i$ is of size $H$ (in case there is barely any refinement outside $\Omega_i$, some elements in $\mathcal{T}_H|_{\Omega_i^c}$ might need to be included in $\Omega_i^\dagger$). Figure 2 shows an example of a subdomain $\Omega_i$ and its extension $\Omega_i^\dagger$. Lastly, we assume that the (overlapping) partition $\{\Omega_i^\dagger\}_{i=1}^N$ of $\Omega$ can be coloured using at most $N_c$ colours, in such a way that if $\Omega_i^\dagger$ and $\Omega_j^\dagger$ are of the same colour and $i \neq j$, then $\Omega_i^\dagger \cap \Omega_j^\dagger = \emptyset$.
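Let $V_0$, $V_i$, and $V_h$ be the linear finite element spaces (of piecewise linear polynomials) associated with $\mathcal{T}_H$, $\mathcal{T}_i$ and $\mathcal{T}_h$ respectively, i.e.
\begin{align*}
V_0 &= \{u_H(x) \in H_0^1(\Omega) \mid u_H(x)|_T \in P_1(T), \ \forall T \in \mathcal{T}_H\}, \\
V_i &= \{u_h(x) \in H_0^1(\Omega) \mid u_h(x)|_T \in P_1(T), \ \forall T \in \mathcal{T}_i\}, \\
V_h &= \{u_h(x) \in H_0^1(\Omega) \mid u_h(x)|_T \in P_1(T), \ \forall T \in \mathcal{T}_h\},
\end{align*}
where $P_1(T)$ is the set of linear polynomials defined on the element $T$.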
Figure 2: Subdomain $\Omega_i$ (left) and its extension $\Omega_i^\dagger$ (right) on their associated local mesh $\mathcal{T}_i$.
Also let $\{\psi_j(x)\}_{j=1}^{n}$ and $\{\psi_j^{(i)}(x)\}_{j=1}^{n_i}$ be the sets of linear nodal basis functions associated with $\mathcal{T}_h$ and $\mathcal{T}_i$, $i = 0, 1, \ldots, N$. Correspondingly, let $\{x_j\}_{j=1}^{n}$ and $\{x_j^{(i)}\}_{j=1}^{n_i}$ be the sets of nodal points of $\mathcal{T}_h$ and $\mathcal{T}_i$, $i = 0, 1, \ldots, N$. Here, for convenience, we use $\mathcal{T}_0$ to refer to $\mathcal{T}_H$.
The finite element approximation $u_h(x) \in V_h$ of $u(x)$ is the solution of the following problem: find $u_h(x) \in V_h$ such that
\[
a(u_h, v_h) = \int_\Omega f(x)\, v_h(x)\, dx, \quad \text{for all } v_h(x) \in V_h, \tag{2}
\]
where $a(u_h, v_h) = \int_\Omega (\nabla u_h \cdot \nabla v_h)\, dx$.
For $u_h(x) \in V_h$, denote by $u \in \mathbb{R}^n$ its coordinate vector, i.e., $u_h(x) = \sum_{j=1}^{n} u(j)\, \psi_j(x)$. Then the problem (2) becomes
\[
Au = f, \tag{3}
\]
where $A \in \mathbb{R}^{n \times n}$, $A(k, j) = a(\psi_j, \psi_k)$, and $f \in \mathbb{R}^n$, $f(k) = \int_\Omega f(x)\, \psi_k(x)\, dx$. Clearly, $A$ is symmetric positive definite and $a(u_h, v_h) = v^T A u =: (u, v)_A$.
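To make the algebraic system (3) concrete, the following minimal sketch assembles $A$ and $f$ for a one-dimensional analogue of (1) on $\Omega = (0, 1)$ with $f \equiv 1$. The 1D setting, the uniform mesh and the helper names `stiffness_1d` and `load_1d` are illustrative assumptions; they are not part of the 2D/3D setting considered in this paper.

```python
import numpy as np

def stiffness_1d(nodes):
    """P1 stiffness matrix A(k, j) = a(psi_j, psi_k) on the interior nodes of a 1D mesh."""
    inv_h = 1.0 / np.diff(nodes)          # reciprocal element lengths
    main = inv_h[:-1] + inv_h[1:]         # each interior node touches two elements
    off = -inv_h[1:-1]                    # coupling between neighbouring interior nodes
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

def load_1d(nodes, f_const=1.0):
    """Load vector f(k) = int f psi_k dx for a constant right-hand side."""
    h = np.diff(nodes)
    return f_const * 0.5 * (h[:-1] + h[1:])

nodes = np.linspace(0.0, 1.0, 17)         # assumed fine mesh with h = 1/16
A = stiffness_1d(nodes)
f = load_1d(nodes)
u = np.linalg.solve(A, f)                 # coordinate vector of u_h

# Exact solution of -u'' = 1, u(0) = u(1) = 0, sampled at the interior nodes.
u_exact = 0.5 * nodes[1:-1] * (1.0 - nodes[1:-1])
print(np.max(np.abs(u - u_exact)))
```

In 1D, with the load integrated exactly, the piecewise linear finite element solution coincides with the exact solution at the nodes, so the printed difference is at round-off level.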
3. Preconditioner formulation
We define $R_i^T \in \mathbb{R}^{n \times n_i}$ as follows:
\[
R_i^T =
\begin{bmatrix}
\psi_1^{(i)}(x_1) & \psi_2^{(i)}(x_1) & \cdots & \psi_{n_i}^{(i)}(x_1) \\
\psi_1^{(i)}(x_2) & \psi_2^{(i)}(x_2) & \cdots & \psi_{n_i}^{(i)}(x_2) \\
\vdots & \vdots & \cdots & \vdots \\
\psi_1^{(i)}(x_n) & \psi_2^{(i)}(x_n) & \cdots & \psi_{n_i}^{(i)}(x_n)
\end{bmatrix}. \tag{4}
\]
We note that $R_i^T$ is the matrix representation of the point-wise interpolation operator from $V_i$, the space on a coarser mesh with the basis $(\psi_1^{(i)}(x), \ldots, \psi_{n_i}^{(i)}(x))$, to $V_h$, the space on the fine mesh with the basis $(\psi_1(x), \ldots, \psi_n(x))$. Unlike in the traditional AS method, the matrix $R_i^T$ does not consist of just 0 and 1 entries. For the columns associated with the nodal points outside $\Omega_i$, there can be multiple nonzero entries that belong to $(0, 1)$. However, for the other columns (the majority), there is only one nonzero entry (1), and this entry corresponds to a nodal point inside $\Omega_i$.
Now we introduce the local stiffness matrix $A_i \in \mathbb{R}^{n_i \times n_i}$ associated with the bilinear form $a(\cdot, \cdot)$ restricted to the subspace $V_i$, as follows:
\[
A_i(k, j) = a(\psi_j^{(i)}, \psi_k^{(i)})
= a\Big( \sum_{l_1=1}^{n} R_i^T(l_1, j)\, \psi_{l_1}, \ \sum_{l_2=1}^{n} R_i^T(l_2, k)\, \psi_{l_2} \Big)
= \sum_{l_2, l_1 = 1}^{n} R_i(k, l_2)\, A(l_2, l_1)\, R_i^T(l_1, j).
\]
This implies that
\[
A_i = R_i A R_i^T. \tag{5}
\]
Clearly, $A_i$ is symmetric and positive definite.
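The relation (5) can be checked numerically on a small example. The sketch below builds $R_1^T$ from (4) for an assumed 1D configuration (coarse mesh with $H = 1/4$, fine mesh with $h = 1/16$, and a local mesh that is fine only in $\Omega_1 = (0, 1/2)$) and compares $R_1 A R_1^T$ with the stiffness matrix assembled directly on the local mesh. The helper names and the 1D geometry are hypothetical choices made only for illustration.

```python
import numpy as np

def stiffness_1d(nodes):
    """P1 stiffness matrix on the interior nodes of a 1D mesh."""
    inv_h = 1.0 / np.diff(nodes)
    off = -inv_h[1:-1]
    return np.diag(inv_h[:-1] + inv_h[1:]) + np.diag(off, 1) + np.diag(off, -1)

def interpolation_matrix(local_nodes, fine_nodes):
    """R_i^T as in (4): entry (l, j) is the local hat function psi_j^(i) at fine node x_l."""
    n_loc = len(local_nodes) - 2
    RT = np.zeros((len(fine_nodes) - 2, n_loc))
    for j in range(n_loc):
        hat = np.zeros(len(local_nodes))
        hat[j + 1] = 1.0                       # nodal values of psi_j^(i)
        RT[:, j] = np.interp(fine_nodes[1:-1], local_nodes, hat)
    return RT

fine = np.linspace(0.0, 1.0, 17)               # T_h, h = 1/16
coarse = np.linspace(0.0, 1.0, 5)              # T_H, H = 1/4
# Local mesh T_1: fine inside Omega_1 = (0, 1/2), coarse elsewhere.
local_1 = np.unique(np.concatenate([fine[fine <= 0.5], coarse[coarse >= 0.5]]))

A = stiffness_1d(fine)
RT1 = interpolation_matrix(local_1, fine)
A1_galerkin = RT1.T @ A @ RT1                  # R_1 A R_1^T, as in (5)
A1_direct = stiffness_1d(local_1)              # assembled directly on T_1
print(np.allclose(A1_galerkin, A1_direct))
```

The column of `RT1` belonging to the interface node $x = 1/2$ contains fractional entries, because the hat function of that node stretches over a coarse element on its right; columns of nodes strictly inside $\Omega_1$ contain a single entry equal to 1.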
Next we define $P_i = R_i^T A_i^{-1} R_i A$. Since $A P_i = P_i^T A$ and $P_i^2 = P_i$, we see that $P_i$ is an $A$-orthogonal projection onto the range of $R_i^T$. Since the columns of $R_i^T$ represent the basis functions of $V_i$, cf. (4), $P_i$ corresponds to a projection operator onto $V_i$.
Now we define our symmetric positive definite preconditioner
\[
P^{-1} = \sum_{i=1}^{N} R_i^T A_i^{-1} R_i.
\]
Then the preconditioned system can be written as
\[
P^{-1} A = \sum_{i=1}^{N} P_i = \sum_{i=1}^{N} R_i^T A_i^{-1} R_i A.
\]
Remark 1. Although the formulations of $P_i$ and $P^{-1}$ largely resemble those of the traditional AS methods, we emphasise that there is a fundamental difference in the subspaces $V_i$ in use. In the current approach, $V_i$ are the finite element spaces associated with local meshes ($\mathcal{T}_i$) of the whole domain $\Omega$, while in traditional AS methods, $V_i$ are finite element spaces associated with the fine meshes ($\mathcal{T}_h|_{\Omega_i^\dagger}$) of subdomains ($\Omega_i^\dagger$) slightly larger than $\Omega_i$ (see [34, p. 59]). In addition, in the current approach, the coarse space $V_0$ is contained in each $V_i$ and there is no explicit coarse component in $P^{-1}$. For more information about traditional AS methods, we refer the reader to [34, 30] and references therein.
Remark 2. An advantage of $P^{-1}$ over traditional AS preconditioners is that the local matrix $A_i$ can be assembled locally on each processor. Consequently, the global matrix $A$ does not need to be assembled (for use in (5)). This is valuable in real-life applications where the system size is large.
Remark 3. Each restriction matrix $R_i$ has more rows than its counterpart $\tilde R_i$ in the traditional two-level AS preconditioner $\tilde P_{AS}^{-1}$ associated with the partitioning $\{\Omega_i^\dagger\}_{i=1}^N$ and the coarse space $V_0$. In addition, the rows of $R_i$ associated with the coarse degrees of freedom (dofs) outside $\Omega_i^\dagger$ and the corresponding rows of $\tilde R_0$ in $\tilde P_{AS}^{-1}$ are exactly the same. This suggests an efficient way of computing $R_i$ as follows. Each processor independently computes the rows of $R_i$ associated with dofs in $\bar\Omega_i^\dagger$ and the part of $\tilde R_0$ associated with its subdomain. Then the complete $R_i$ can be obtained after an MPI_Alltoall communication that exchanges the information of $\tilde R_0$. With this implementation, the cost of evaluating $R_i$, $i = 1, 2, \ldots, N$, in $P^{-1}$ is comparable with the cost of computing $\tilde R_i$, $i = 0, 1, \ldots, N$, in $\tilde P_{AS}^{-1}$.
The preconditioner $P^{-1}$ can be used to accelerate Krylov methods for solving the system (3). Since $P^{-1}$ and $A$ are both symmetric positive definite, the obvious choice is CG, the conjugate gradient method [23, 35].
In the next section, we will study the convergence of the CG method preconditioned by the proposed preconditioner $P^{-1}$.
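As an illustration of how $P^{-1}$ enters a CG iteration, the sketch below applies $P^{-1} = \sum_i R_i^T A_i^{-1} R_i$ inside a textbook preconditioned CG loop. Everything specific here is an assumption made for the example: a 1D model problem on $(0, 1)$ with $f \equiv 1$, $N = 4$ subdomains, dense local solves, and the helper names `apply_preconditioner` and `pcg`. It is a serial toy version, not the parallel implementation discussed in Remark 3.

```python
import numpy as np

def stiffness_1d(nodes):
    """P1 stiffness matrix on the interior nodes of a 1D mesh."""
    inv_h = 1.0 / np.diff(nodes)
    off = -inv_h[1:-1]
    return np.diag(inv_h[:-1] + inv_h[1:]) + np.diag(off, 1) + np.diag(off, -1)

def interpolation_matrix(local_nodes, fine_nodes):
    """R_i^T as in (4), built by evaluating local hat functions at the fine nodes."""
    RT = np.zeros((len(fine_nodes) - 2, len(local_nodes) - 2))
    for j in range(RT.shape[1]):
        hat = np.zeros(len(local_nodes))
        hat[j + 1] = 1.0
        RT[:, j] = np.interp(fine_nodes[1:-1], local_nodes, hat)
    return RT

fine = np.linspace(0.0, 1.0, 17)       # T_h, h = 1/16
coarse = np.linspace(0.0, 1.0, 5)      # T_H, H = 1/4
N = 4                                  # Omega_i = ((i-1)/4, i/4)
A = stiffness_1d(fine)
h = np.diff(fine)
f = 0.5 * (h[:-1] + h[1:])             # load vector for f(x) = 1

local_data = []
for i in range(N):
    inside = fine[(fine >= coarse[i]) & (fine <= coarse[i + 1])]
    mesh = np.unique(np.concatenate([coarse, inside]))   # fine in Omega_i, coarse elsewhere
    local_data.append((interpolation_matrix(mesh, fine), stiffness_1d(mesh)))

def apply_preconditioner(r):
    """z = P^{-1} r: one independent local solve per subdomain, then a sum."""
    return sum(RT @ np.linalg.solve(Ai, RT.T @ r) for RT, Ai in local_data)

def pcg(A, f, apply_prec, tol=1e-10, maxiter=100):
    u = np.zeros_like(f)
    r = f - A @ u
    z = apply_prec(r)
    p = z.copy()
    rz = r @ z
    for k in range(1, maxiter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        u += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(f):
            return u, k
        z = apply_prec(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return u, maxiter

u, iterations = pcg(A, f, apply_preconditioner)
print(iterations, np.linalg.norm(A @ u - f))
```

In this 1D toy problem the coarse space and the fine "bubble" functions are $A$-orthogonal, so $P^{-1}A$ has only the two distinct eigenvalues $1$ and $N$ and CG terminates after very few iterations. This degenerate behaviour is a property of the 1D example and should not be expected in 2D or 3D.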
4. Convergence analysis
In the first phase of our analysis, we will formulate Euclidean orthogonal projections $Q_i$ corresponding to $P_i$ and study the spectrum of the preconditioned system $P^{-1}A = \sum_{i=1}^{N} P_i$ via that of $\sum_{i=1}^{N} Q_i$.
Let $\phi_1(x), \phi_2(x), \ldots, \phi_n(x)$ be an $a(\cdot, \cdot)$-orthonormal basis of $V_h$. Without loss of generality, we can assume that $\phi_1(x), \phi_2(x), \ldots, \phi_{n_0}(x)$ is an $a(\cdot, \cdot)$-orthonormal basis of $V_0$. Denote
\[
U =
\begin{bmatrix}
\phi_1(x_1) & \cdots & \phi_n(x_1) \\
\vdots & \cdots & \vdots \\
\phi_1(x_n) & \cdots & \phi_n(x_n)
\end{bmatrix},
\qquad
U_0 =
\begin{bmatrix}
\phi_1(x_1) & \cdots & \phi_{n_0}(x_1) \\
\vdots & \cdots & \vdots \\
\phi_1(x_n) & \cdots & \phi_{n_0}(x_n)
\end{bmatrix}.
\]
It follows that $U^T A U = I_n$ and $U_0^T A U_0 = I_{n_0}$.
Lemma 1. Let $Q_i = U^T A P_i U = U^{-1} P_i U$. Then $Q_i$ is a Euclidean orthogonal projection and it has the block diagonal structure $Q_i = \mathrm{diag}(I_{n_0}, \hat Q_i)$, where $\hat Q_i \in \mathbb{R}^{(n - n_0) \times (n - n_0)}$ is also a Euclidean orthogonal projection. In addition,
\[
\sigma(P^{-1}A) = \sigma\Big( \sum_{i=1}^{N} Q_i \Big) = \{N\} \cup \sigma\Big( \sum_{i=1}^{N} \hat Q_i \Big), \tag{6}
\]
where $\sigma(\cdot)$ denotes the spectrum.
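Proof. Since $Q_i^2 = Q_i$ and $Q_i^T = Q_i$, $Q_i$ is a Euclidean orthogonal projection. In addition, as $V_0 \subset V_i$ and the columns of $U_0$ and $R_i^T$ represent basis functions of $V_0$ and $V_i$ respectively, we see that $\mathrm{range}(U_0) \subset \mathrm{range}(R_i^T) = \mathrm{range}(P_i)$. Therefore, we can write $P_i U = P_i [U_0 \ \ast] = [P_i U_0 \ \ast] = [U_0 \ \ast]$ and
\[
Q_i = U^T A P_i U =
\begin{bmatrix} U_0^T \\ \ast \end{bmatrix} A\, [U_0 \ \ast]
=
\begin{bmatrix} U_0^T A U_0 & \ast \\ \ast & \ast \end{bmatrix}
=
\begin{bmatrix} I_{n_0} & Z_i \\ Z_i^T & \hat Q_i \end{bmatrix}.
\]
Since $Q_i^2 = Q_i$, comparing the leading blocks gives $I_{n_0} + Z_i Z_i^T = I_{n_0}$, so $Z_i Z_i^T = 0$, or $Z_i = 0$. Therefore, $Q_i = \mathrm{diag}(I_{n_0}, \hat Q_i)$. As $Q_i$ is a Euclidean orthogonal projection, $\hat Q_i$ is also a Euclidean orthogonal projection. The first part of (6) follows from the fact that
\[
\sum_{i=1}^{N} Q_i = U^{-1} \Big( \sum_{i=1}^{N} P_i \Big) U = U^{-1} (P^{-1}A) U.
\]
The second part of (6) is a consequence of $\sum_{i=1}^{N} Q_i = \mathrm{diag}\big(N I_{n_0}, \sum_{i=1}^{N} \hat Q_i\big)$.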
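Lemma 2. Let $\hat\lambda_{\min}$ and $\hat\lambda_{\max}$ be the smallest and largest eigenvalues of $\sum_{i=1}^{N} \hat Q_i$ respectively. Then
\[
\sigma(P^{-1}A) \subset [\hat\lambda_{\min}, \hat\lambda_{\max}] \cup \{N\}, \quad \text{where } 0 < \hat\lambda_{\min} \le \hat\lambda_{\max} \le N. \tag{7}
\]
Proof. Since $\hat Q_i$ is a projection, $\sigma(\hat Q_i) = \{0, 1\}$ and $\sigma\big(\sum_{i=1}^{N} \hat Q_i\big) \subset [0, N]$. Because $P^{-1}$ and $A$ are both positive definite, $\hat\lambda_{\min} > 0$. Then (7) follows from (6).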
Remark 4. The result presented in (7) indicates that $\hat\lambda_{\min}$ and $\hat\lambda_{\max}$ are actually the smallest and the second largest eigenvalues of the preconditioned system $P^{-1}A$. The eigenvalue $\hat\lambda_{\max}$ equals $N$ if and only if the local subspaces $V_i$ have a common subspace strictly larger than $V_0$. This only happens when $N$ is small and the local meshes are structured. In general, $N > \hat\lambda_{\max}$ and $N$ is an isolated eigenvalue of $P^{-1}A$.
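The spectral structure in (6) and (7) can be observed numerically. The following sketch, under the same hypothetical 1D assumptions as before (coarse mesh with $H = 1/4$, fine mesh with $h = 1/16$, $N = 4$ subdomains, dense linear algebra), forms $P^{-1}$ explicitly and computes the eigenvalues of $P^{-1}A$ through a symmetric similarity transform. The names `P_inv`, `mult_N`, `lam_hat_min` and `lam_hat_max` are illustrative.

```python
import numpy as np

def stiffness_1d(nodes):
    """P1 stiffness matrix on the interior nodes of a 1D mesh."""
    inv_h = 1.0 / np.diff(nodes)
    off = -inv_h[1:-1]
    return np.diag(inv_h[:-1] + inv_h[1:]) + np.diag(off, 1) + np.diag(off, -1)

def interpolation_matrix(local_nodes, fine_nodes):
    """R_i^T as in (4)."""
    RT = np.zeros((len(fine_nodes) - 2, len(local_nodes) - 2))
    for j in range(RT.shape[1]):
        hat = np.zeros(len(local_nodes))
        hat[j + 1] = 1.0
        RT[:, j] = np.interp(fine_nodes[1:-1], local_nodes, hat)
    return RT

fine = np.linspace(0.0, 1.0, 17)
coarse = np.linspace(0.0, 1.0, 5)
N = 4
A = stiffness_1d(fine)
n = A.shape[0]
n0 = len(coarse) - 2                             # dimension of the coarse space V_0

P_inv = np.zeros((n, n))
for i in range(N):
    inside = fine[(fine >= coarse[i]) & (fine <= coarse[i + 1])]
    mesh = np.unique(np.concatenate([coarse, inside]))
    RT = interpolation_matrix(mesh, fine)
    Ai = RT.T @ A @ RT                           # local matrix via (5)
    P_inv += RT @ np.linalg.solve(Ai, RT.T)      # R_i^T A_i^{-1} R_i

# With A = L L^T, the matrix L^T P^{-1} L is symmetric and similar to P^{-1} A.
L = np.linalg.cholesky(A)
eigs = np.linalg.eigvalsh(L.T @ P_inv @ L)       # ascending order
mult_N = int(np.sum(np.isclose(eigs, N)))        # multiplicity of the eigenvalue N
lam_hat_min = eigs[0]
lam_hat_max = eigs[n - mult_N - 1]               # largest eigenvalue below N
print(mult_N, lam_hat_min, lam_hat_max)
```

Here the eigenvalue $N$ appears with multiplicity $n_0$, as predicted by the block $N I_{n_0}$ in the proof of Lemma 1. Because the 1D hierarchical splitting is $A$-orthogonal, all remaining eigenvalues equal 1 in this example, so the effective condition number is 1; in 2D and 3D the remaining eigenvalues spread over an interval $[\hat\lambda_{\min}, \hat\lambda_{\max}]$.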
In the next step, we will take advantage of the special spectrum decomposition in (6) to study the convergence of the CG method applied to the preconditioned system $P^{-1}A$. But first, we quote from [1] the following result:
\[
\frac{\|e_k\|_A}{\|e_0\|_A}
= \inf_{q \in \mathbb{P}_k} \frac{\|q(P^{-1}A)\, e_0\|_A}{\|e_0\|_A}
\le \inf_{q \in \mathbb{P}_k} \ \max_{\lambda \in \sigma(P^{-1}A)} |q(\lambda)|. \tag{8}
\]
Here $e_k = u_k - u$ is the exact error at step $k$ of the CG method, $\sigma(P^{-1}A)$ denotes the spectrum of $P^{-1}A$, and $\mathbb{P}_k$ is the set of polynomials $q$ of degree $k$ or less, with $q(0) = 1$. More details about the CG method can be found in [35, 23] and the references therein.
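Theorem 3. The error of the CG method applied to equation (3) when it is left-preconditioned by $P^{-1}$ satisfies
\[
\frac{\|e_k\|_A}{\|e_0\|_A}
\le \frac{2 (N - \hat\lambda_{\min})}{N} \left( \frac{\sqrt{\hat\kappa} - 1}{\sqrt{\hat\kappa} + 1} \right)^{k-1}
< 2 \left( \frac{\sqrt{\hat\kappa} - 1}{\sqrt{\hat\kappa} + 1} \right)^{k-1}, \tag{9}
\]
where $\hat\kappa = \hat\lambda_{\max} / \hat\lambda_{\min}$ is called the effective condition number of $P^{-1}A$.
Proof. By (8) and (7), it is sufficient to find a polynomial $q(x) \in \mathbb{P}_k$ that vanishes at $x = N$ and whose maximum absolute value for $x \in [\hat\lambda_{\min}, \hat\lambda_{\max}]$ is bounded by the second quantity in (9). Consider the polynomial
\[
q(x) = \frac{T_{k-1}\!\left( \gamma - \dfrac{2x}{\hat\lambda_{\max} - \hat\lambda_{\min}} \right) (N - x)}{N\, T_{k-1}(\gamma)}, \tag{10}
\]
where $\gamma = (\hat\lambda_{\max} + \hat\lambda_{\min}) / (\hat\lambda_{\max} - \hat\lambda_{\min}) > 1$ and $T_{k-1}(x)$ is the Chebyshev polynomial of degree $k - 1$. More information about Chebyshev polynomials can be found in [27]. Clearly, $q$ has degree $k$, $q(0) = 1$ and $q(N) = 0$.
For $x \in [\hat\lambda_{\min}, \hat\lambda_{\max}]$, the quantity $\gamma - \frac{2x}{\hat\lambda_{\max} - \hat\lambda_{\min}}$ belongs to $[-1, 1]$ and $|N - x| \le N - \hat\lambda_{\min}$. It follows that
\[
\left| T_{k-1}\!\left( \gamma - \frac{2x}{\hat\lambda_{\max} - \hat\lambda_{\min}} \right) (N - x) \right| \le N - \hat\lambda_{\min}. \tag{11}
\]
We use the standard estimate for $T_{k-1}(\gamma)$:
\[
T_{k-1}(\gamma) = \frac{1}{2} \left[ \left( \frac{\sqrt{\hat\kappa} + 1}{\sqrt{\hat\kappa} - 1} \right)^{k-1} + \left( \frac{\sqrt{\hat\kappa} + 1}{\sqrt{\hat\kappa} - 1} \right)^{-(k-1)} \right]
\ge \frac{1}{2} \left( \frac{\sqrt{\hat\kappa} + 1}{\sqrt{\hat\kappa} - 1} \right)^{k-1}. \tag{12}
\]
More details can be found in [35, p. 300]. The inequalities (9) then follow immediately from (11) and (12).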
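The properties of the polynomial (10) used in the proof can be verified numerically. The sketch below evaluates $q$ for assumed sample values of $N$, $\hat\lambda_{\min}$, $\hat\lambda_{\max}$ and $k$ (these numbers are hypothetical and chosen only for illustration) and compares its maximum on $[\hat\lambda_{\min}, \hat\lambda_{\max}]$ with the first bound in (9).

```python
import numpy as np
from numpy.polynomial.chebyshev import chebval

def cheb(m, t):
    """Chebyshev polynomial T_m evaluated at t."""
    c = np.zeros(m + 1)
    c[m] = 1.0
    return chebval(t, c)

# Assumed sample values; any 0 < lam_min < lam_max < N can be used.
N, lam_min, lam_max, k = 8.0, 0.2, 3.0, 6
kappa = lam_max / lam_min                                  # effective condition number
gamma = (lam_max + lam_min) / (lam_max - lam_min)

def q(x):
    """The polynomial (10): a shifted Chebyshev polynomial times the factor (N - x)."""
    shifted = gamma - 2.0 * x / (lam_max - lam_min)
    return cheb(k - 1, shifted) * (N - x) / (N * cheb(k - 1, gamma))

x = np.linspace(lam_min, lam_max, 2001)
max_q = np.max(np.abs(q(x)))                               # max of |q| on [lam_min, lam_max]
rate = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
bound = 2.0 * (N - lam_min) / N * rate ** (k - 1)          # first bound in (9)
print(q(0.0), q(N), max_q, bound)
```

The factor $(N - x)$ removes the isolated eigenvalue $N$ from the maximum in (8) at the cost of one polynomial degree, which is why the exponent in (9) is $k - 1$ rather than $k$.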
We have shown in Theorem 3 that the convergence of the CG method with the preconditioner $P^{-1}$ can be bounded by quantities that mainly depend on the ratio of $\hat\lambda_{\max}$ and $\hat\lambda_{\min}$, the second largest and smallest eigenvalues of $P^{-1}A$. In the next step, we present estimates for these eigenvalues.
4.1. Second largest eigenvalue estimate
Our plan to estimate $\hat\lambda_{\max}$ is to seek an explicit formula for $\hat Q_i$ and compare the largest eigenvalue of $\sum_{i=1}^{N} \hat Q_i$ with that of the related traditional AS method. We begin with some preparation.
Let $\hat V_i$ be the subspace of $V_i$ spanned by the nodal basis functions associated with nodal points which are in $\mathcal{T}_i$ but are not in $\mathcal{T}_H$. With a slight abuse of notation we can write
\[
\hat V_i = \mathrm{span}\big\{ \psi_j^{(i)}(x), \ \forall j \ \text{s.t.} \ x_j^{(i)} \notin \mathcal{T}_H \big\}.
\]
Clearly, $V_i = V_0 \oplus \hat V_i$. This is a hierarchical decomposition of $V_i$ into the subspace $V_0$ of coarse basis functions and the subspace $\hat V_i$ of fine basis functions. We quote from [7] (see also [22]) the following well-known result on the strengthened Cauchy-Schwarz inequality for hierarchical bases.
Lemma 4. Consider the finite element hierarchical decomposition $V_i = V_0 \oplus \hat V_i$. Then for all $v_0(x) \in V_0$ and all $\hat v_i(x) \in \hat V_i$:
\[
|a(v_0, \hat v_i)| \le \gamma\, \|v_0\|_A\, \|\hat v_i\|_A, \quad i = 1, \ldots, N. \tag{13}
\]
Here the constant $\gamma$, $0 < \gamma < 1$, (the maximum of all the constants associated with the local meshes $\mathcal{T}_i$) depends on the shape regularity of the meshes $\mathcal{T}_H$, $\mathcal{T}_i$, but is otherwise independent of the mesh sizes $h$ and $H$.
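Now let $m_i = n_i - n_0$ and let $\omega_1^{(i)}(x), \ldots, \omega_{m_i}^{(i)}(x)$ be an $a(\cdot, \cdot)$-orthonormal basis of $\hat V_i$. Denote
\[
W_i =
\begin{bmatrix}
\omega_1^{(i)}(x_1) & \cdots & \omega_{m_i}^{(i)}(x_1) \\
\vdots & \cdots & \vdots \\
\omega_1^{(i)}(x_n) & \cdots & \omega_{m_i}^{(i)}(x_n)
\end{bmatrix}.
\]
We note that the columns of $U_0$ and the columns of $W_i$ represent bases of $V_0$ and $\hat V_i$ respectively. Therefore, $\mathrm{range}(P_i) = \mathrm{range}(R_i^T) = \mathrm{range}([U_0 \ W_i])$ since $V_0 \oplus \hat V_i = V_i$.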
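Lemma 5. Let $U^T A W_i = [X_i^T \ Y_i^T]^T$, where $X_i \in \mathbb{R}^{n_0 \times m_i}$, $Y_i \in \mathbb{R}^{(n - n_0) \times m_i}$. Then $\hat Q_i = Y_i (Y_i^T Y_i)^{-1} Y_i^T$, for $i = 1, \ldots, N$.
Proof. Since $Q_i = U^T A P_i U$ and $U$ is non-singular, we have
\begin{align*}
\mathrm{range}(Q_i) &= U^T A\, (\mathrm{range}(P_i)) = U^T A\, (\mathrm{range}([U_0 \ W_i])) = \mathrm{range}(U^T A [U_0 \ W_i]) \\
&= \mathrm{range}\left( \begin{bmatrix} I & X_i \\ 0 & Y_i \end{bmatrix} \right)
= \mathrm{range}\left( \begin{bmatrix} I & 0 \\ 0 & Y_i \end{bmatrix} \right)
= \mathrm{range}(E_i), \tag{14}
\end{align*}
where $E_i = \mathrm{diag}(I, Y_i)$. So $Q_i$ is a projection onto the range of $E_i$. In addition,
\[
n_i = \dim(V_i) = \mathrm{rank}([U_0 \ W_i])
= \mathrm{rank}\left( \begin{bmatrix} I & X_i \\ 0 & Y_i \end{bmatrix} \right)
= \mathrm{rank}\left( \begin{bmatrix} I & 0 \\ 0 & Y_i \end{bmatrix} \right).
\]
Therefore, $\mathrm{rank}(Y_i) = n_i - n_0 = m_i$. In other words, the matrix $Y_i$ has full rank. It follows that the columns of $E_i$ are linearly independent. This together with (14) implies
\[
Q_i = E_i (E_i^T E_i)^{-1} E_i^T =
\begin{bmatrix} I & 0 \\ 0 & Y_i (Y_i^T Y_i)^{-1} Y_i^T \end{bmatrix}.
\]
Then the desired equality follows from the fact that $Q_i = \mathrm{diag}(I_{n_0}, \hat Q_i)$.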
Lemma 6. For $X_i$, $Y_i$ defined in Lemma 5, we have
\[
(1 - \gamma^2)\, I \preceq Y_i^T Y_i, \tag{15}
\]
where $0 < \gamma < 1$ is the constant introduced in Lemma 4. The notation $\preceq$ denotes the positive semi-definite ordering (cf. [25]). In addition,
\[
\sum_{i=1}^{N} Y_i Y_i^T \preceq N_c\, I_{n - n_0}. \tag{16}
\]
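Proof. Using the definitions of $X_i$, $Y_i$ and the fact that $W_i$ has $A$-orthonormal columns, we have
\[
X_i^T X_i + Y_i^T Y_i = [X_i^T \ Y_i^T] \begin{bmatrix} X_i \\ Y_i \end{bmatrix}
= W_i^T A U U^T A W_i = W_i^T A W_i = I_{m_i}. \tag{17}
\]
Therefore, in order to show (15) we will bound $X_i^T X_i$ from above.
For $v_0(x) \in V_0$ and $\hat v_i(x) \in \hat V_i$, their coordinate vectors are of the following forms:
\[
v_0 = U \begin{bmatrix} y \\ 0 \end{bmatrix}, \qquad
\hat v_i = [U_0 \ W_i] \begin{bmatrix} 0 \\ z \end{bmatrix}, \qquad
y \in \mathbb{R}^{n_0}, \ z \in \mathbb{R}^{m_i}.
\]
Now the inequality (13) can be written in matrix form as follows:
\[
\left| [y^T \ 0]\, U^T A\, [U_0 \ W_i] \begin{bmatrix} 0 \\ z \end{bmatrix} \right|
\le \gamma
\left( [y^T \ 0]\, U^T A U \begin{bmatrix} y \\ 0 \end{bmatrix} \right)^{1/2}
\left( [0 \ z^T] \begin{bmatrix} U_0^T \\ W_i^T \end{bmatrix} A\, [U_0 \ W_i] \begin{bmatrix} 0 \\ z \end{bmatrix} \right)^{1/2}.
\]
Equivalently, for any $y \in \mathbb{R}^{n_0}$ and $z \in \mathbb{R}^{m_i}$:
\[
[y^T \ 0] \begin{bmatrix} I & X_i \\ 0 & Y_i \end{bmatrix} \begin{bmatrix} 0 \\ z \end{bmatrix}
= y^T X_i z \le \gamma\, \|y\|_2\, \|z\|_2.
\]
This implies that $\|X_i\|_2 \le \gamma$ and $\|X_i^T X_i\|_2 \le \gamma^2$. In other words, $X_i^T X_i \preceq \gamma^2 I_{m_i}$. Then (15) follows immediately from (17).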
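Next we are going to prove (16). Let $V_i^\dagger = V_h|_{\Omega_i^\dagger}$, $i = 1, \ldots, N$. We note that $V_i^\dagger$ are the local spaces in the related traditional AS method (see [34, p. 59]). Since all elements in $\mathcal{T}_i$ that are outside of $\Omega_i^\dagger$ belong to $\mathcal{T}_H$, $\hat V_i$ is a subspace of $V_i^\dagger$. Consequently, there is an $a(\cdot, \cdot)$-orthonormal basis of $V_i^\dagger$ of the form $\omega_1^{(i)}, \ldots, \omega_{m_i}^{(i)}, \omega_{m_i+1}^{(i)}, \ldots, \omega_{\tilde m_i}^{(i)}$. Let $\tilde W_i \in \mathbb{R}^{n \times \tilde m_i}$ be defined as follows:
\[
\tilde W_i =
\begin{bmatrix}
\omega_1^{(i)}(x_1) & \cdots & \omega_{\tilde m_i}^{(i)}(x_1) \\
\vdots & \cdots & \vdots \\
\omega_1^{(i)}(x_n) & \cdots & \omega_{\tilde m_i}^{(i)}(x_n)
\end{bmatrix}.
\]
Denote $[\tilde X_i^T \ \tilde Y_i^T]^T = U^T A \tilde W_i$, where $\tilde X_i \in \mathbb{R}^{n_0 \times \tilde m_i}$, $\tilde Y_i \in \mathbb{R}^{(n - n_0) \times \tilde m_i}$. Then the first $m_i$ columns of $\tilde Y_i$ form $Y_i$. Write $Y_i = [y_{i1} \cdots y_{i m_i}]$ and $\tilde Y_i = [y_{i1} \cdots y_{i m_i} \ y_{i, m_i+1} \cdots y_{i \tilde m_i}]$. For any $z \in \mathbb{R}^{n - n_0}$ we have
\[
z^T \Big( \sum_{i=1}^{N} Y_i Y_i^T \Big) z
= \sum_{i=1}^{N} \sum_{j=1}^{m_i} (y_{ij}^T z)^2
\le \sum_{i=1}^{N} \sum_{j=1}^{\tilde m_i} (y_{ij}^T z)^2
= z^T \Big( \sum_{i=1}^{N} \tilde Y_i \tilde Y_i^T \Big) z. \tag{18}
\]
Therefore,
\[
\sum_{i=1}^{N} Y_i Y_i^T \preceq \sum_{i=1}^{N} \tilde Y_i \tilde Y_i^T. \tag{19}
\]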
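Now let $\tilde Q_i$ be the Euclidean orthogonal projection corresponding to the Schwarz projection $\tilde P_i$ associated with $\Omega_i^\dagger$ in the traditional AS method (see [34, chapter 2]). Similar to (14), we have $\mathrm{range}(\tilde Q_i) = \mathrm{range}(U^T A \tilde W_i)$. In addition, $F_i = U^T A \tilde W_i = [\tilde X_i^T \ \tilde Y_i^T]^T$ has orthonormal columns. Thus the projection $\tilde Q_i$ can be written as
\[
\tilde Q_i = F_i F_i^T =
\begin{bmatrix}
\tilde X_i \tilde X_i^T & \tilde X_i \tilde Y_i^T \\
\tilde Y_i \tilde X_i^T & \tilde Y_i \tilde Y_i^T
\end{bmatrix}.
\]
Therefore, for any $z \in \mathbb{R}^{n - n_0}$,
\[
z^T \Big( \sum_{i=1}^{N} \tilde Y_i \tilde Y_i^T \Big) z
= [0 \ z^T] \Big( \sum_{i=1}^{N} \tilde Q_i \Big) \begin{bmatrix} 0 \\ z \end{bmatrix}
\le \rho\Big( \sum_{i=1}^{N} \tilde Q_i \Big) z^T z
= \rho\Big( \sum_{i=1}^{N} \tilde P_i \Big) z^T z,
\]
where $\rho$ denotes the spectral radius. On the other hand, according to [21, Theorem 4.1], $\rho\big( \sum_{i=1}^{N} \tilde P_i \big) \le N_c$. Consequently,
\[
\sum_{i=1}^{N} \tilde Y_i \tilde Y_i^T \preceq N_c\, I_{n - n_0}. \tag{20}
\]
The ordering (16) then follows from (19) and (20).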
We now present one of our main results, the estimate for the second largest eigenvalue.
Theorem 7. The second largest eigenvalue of the preconditioned system $P^{-1}A$ is bounded as follows:
\[
\hat\lambda_{\max} \le \frac{N_c}{1 - \gamma^2}. \tag{21}
\]
Proof. From Lemma 5, we have $\hat\lambda_{\max} = \rho\big( \sum_{i=1}^{N} \hat Q_i \big) = \rho\big( \sum_{i=1}^{N} Y_i (Y_i^T Y_i)^{-1} Y_i^T \big)$. On the other hand, it follows from (15) and (16) that
\[
\sum_{i=1}^{N} Y_i (Y_i^T Y_i)^{-1} Y_i^T
\preceq \frac{1}{1 - \gamma^2} \sum_{i=1}^{N} Y_i Y_i^T
\preceq \frac{N_c}{1 - \gamma^2}\, I_{n - n_0}.
\]
Then the inequality (21) follows immediately.
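The first ordering in the proof of Theorem 7 uses the implication $c\, I \preceq Y^T Y \ \Rightarrow \ Y (Y^T Y)^{-1} Y^T \preceq c^{-1}\, Y Y^T$, with $c = 1 - \gamma^2$. The sketch below checks this implication on a random full-rank matrix; the matrix sizes, the random seed and the use of the smallest eigenvalue of $Y^T Y$ as $c$ are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, m = 12, 4
Y = rng.standard_normal((n_rows, m))      # random full-rank stand-in for Y_i

G = Y.T @ Y
c = np.linalg.eigvalsh(G)[0]              # plays the role of 1 - gamma^2, so that c I <= Y^T Y
proj = Y @ np.linalg.solve(G, Y.T)        # Y (Y^T Y)^{-1} Y^T, a Euclidean orthogonal projection
gap = Y @ Y.T / c - proj                  # expected to be positive semi-definite
print(np.linalg.eigvalsh(gap).min())
```

The smallest eigenvalue of `gap` is zero up to round-off, which shows that the ordering is sharp in the direction of the eigenvector of $Y^T Y$ belonging to its smallest eigenvalue.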
4.2. Smallest eigenvalue estimate
Our estimate of $\hat\lambda_{\min}$ follows the standard approach where a stable decomposition is constructed [37, 21, 34]. However, as the local meshes $\mathcal{T}_i$ are meshes of the whole domain and they are very different from one another and from the global fine mesh $\mathcal{T}_h$ outside of their associated subdomains, the stable decomposition in [21, 34] is no longer valid. In order to adapt to the situation, we build our stable decomposition inductively on the colouring defined in section 2. In our construction, the partition of unity is replaced by a set of cut-off functions, and the point-wise interpolation is replaced by a special interpolation inspired by [31].
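Cut-off functions. Denote by $C_k$ the set of indices of the subdomains coloured by colour $c_k$, $1 \le c_k \le N_c$. Then for each subdomain $\Omega_i$, $i \in C_k$, we define the cut-off function $\theta_i^{(c_k)}(x)$ as follows:
\[
\theta_i^{(c_k)}(x) =
\begin{cases}
1 & \text{if } x \in \bar\Omega_i, \\[2pt]
0 & \text{if } x \notin \bar\Omega_i^\dagger, \\[2pt]
\dfrac{\mathrm{dist}(x, \partial\Omega_i^\dagger \setminus \partial\Omega)}{\mathrm{dist}(x, \partial\Omega_i^\dagger \setminus \partial\Omega) + \mathrm{dist}(x, \partial\Omega_i \setminus \partial\Omega)} & \text{if } x \in \Omega_i^\dagger \setminus \Omega_i.
\end{cases} \tag{22}
\]
Clearly, $\theta_i^{(c_k)}$ is well-defined, continuous on $\bar\Omega$ and satisfies
\[
0 \le \theta_i^{(c_k)}(x) \le 1, \quad \text{for all } x \in \bar\Omega. \tag{23}
\]
In addition,
\[
\mathrm{supp}(\theta_i^{(c_k)}) \subset \bar\Omega_i^\dagger, \qquad
\mathrm{supp}(\theta_i^{(c_k)}) \cap \mathrm{supp}(\theta_j^{(c_k)}) = \emptyset, \quad i, j \in C_k, \ i \neq j. \tag{24}
\]
Since the width of $\Omega_i^\dagger \setminus \Omega_i$ is of size $H$, according to [34, Lemma 3.4], there exists a constant $C_\theta$, independent of $i$ and $H$, such that
\[
\|\nabla \theta_i^{(c_k)}\|_\infty \le C_\theta / H. \tag{25}
\]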
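A one-dimensional instance of the cut-off function (22) makes the properties (23) and (25) easy to see. In the sketch below, $\Omega_i$ and $\Omega_i^\dagger$ are intervals lying strictly inside $\Omega = (0, 1)$, so that $\partial\Omega_i \setminus \partial\Omega$ and $\partial\Omega_i^\dagger \setminus \partial\Omega$ are simply the interval end points. The function name `cutoff_1d` and the chosen intervals are hypothetical.

```python
import numpy as np

def cutoff_1d(x, sub, ext):
    """Cut-off function (22) in 1D for Omega_i = sub and Omega_i^dagger = ext (both interior)."""
    x = np.asarray(x, dtype=float)
    d_ext = np.minimum(np.abs(x - ext[0]), np.abs(x - ext[1]))   # distance to the boundary of ext
    d_sub = np.minimum(np.abs(x - sub[0]), np.abs(x - sub[1]))   # distance to the boundary of sub
    ramp = d_ext / (d_ext + d_sub)                               # third case of (22)
    theta = np.where((x >= sub[0]) & (x <= sub[1]), 1.0, ramp)   # equal to 1 on the closed subdomain
    return np.where((x < ext[0]) | (x > ext[1]), 0.0, theta)     # equal to 0 outside the extension

H = 0.125                                  # assumed width of the extension layer
sub = (0.375, 0.625)                       # Omega_i
ext = (sub[0] - H, sub[1] + H)             # Omega_i^dagger
x = np.linspace(0.0, 1.0, 1601)
theta = cutoff_1d(x, sub, ext)
max_slope = np.max(np.abs(np.diff(theta) / np.diff(x)))
print(theta.min(), theta.max(), max_slope * H)
```

In this configuration the ramp is linear with slope $1/H$, so (25) holds with $C_\theta = 1$; in higher dimensions the constant depends on the geometry of the overlap region.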
In the next step, we present the framework to construct the modified Lagrange type interpolation operator introduced by Scott and Zhang in [31]. Some stability properties of this type of interpolation will also be provided for later use.
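Modified Lagrange interpolations. Let $\mathcal{T}_\circ$ be a finite element mesh of $\Omega$ with its set of nodal points $\mathcal{N}_\circ = \{x_j^\circ\}_{j=1}^{n_\circ}$. Denote by $V_\circ$ the finite element space associated with $\mathcal{T}_\circ$ and let $\{\psi_j^\circ\}_{j=1}^{n_\circ}$ be the set of linear nodal basis functions of $V_\circ$ corresponding to $\mathcal{N}_\circ$. For any node $x_j^\circ$, we fix an edge $e_j^\circ$ in $\mathcal{T}_\circ$ that has $x_j^\circ$ as one of its vertices. Let $\{x_{j,k}^\circ\}_{k=1}^{2}$ be the two nodal points in $\mathcal{N}_\circ$ associated with $e_j^\circ$. Without loss of generality, we choose $x_{j,1}^\circ = x_j^\circ$. For the nodal basis $\{\psi_{j,k}^\circ\}_{k=1}^{2}$ associated with $\{x_{j,k}^\circ\}_{k=1}^{2}$, we have an $L^2(e_j^\circ)$-dual basis $\{\eta_{j,k}^\circ\}_{k=1}^{2}$ defined by $\int_{e_j^\circ} \eta_{j,k}^\circ\, \psi_{j,l}^\circ = \delta_{kl}$, $k, l = 1, 2$, where $\delta_{kl}$ is the Kronecker delta. For simplicity, we let $\eta_j^\circ \equiv \eta_{j,1}^\circ$ for $x_j^\circ \in \mathcal{N}_\circ$. Then, we have
\[
\int_{e_j^\circ} \eta_j^\circ\, \psi_k^\circ = \delta_{jk}, \quad k, j = 1, 2, \ldots, n_\circ. \tag{26}
\]
Now we can define the interpolation operator
\[
I_\circ = I_{\mathcal{T}_\circ}^{\{e_j^\circ\}} : H^1(\Omega) \to V_\circ, \qquad
I_\circ u(x) = \sum_{j=1}^{n_\circ} \psi_j^\circ(x) \int_{e_j^\circ} \eta_j^\circ(\xi)\, u(\xi)\, d\xi. \tag{27}
\]
Here, the notation $I_{\mathcal{T}_\circ}^{\{e_j^\circ\}}$ is used to emphasise that the interpolation operator depends on the mesh $\mathcal{T}_\circ$ and the choice of edges $\{e_j^\circ\}_{j=1}^{n_\circ}$. However, for simplicity $I_\circ$ is used in other places.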
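The dual basis used in (26) and (27) can be written down explicitly on a single edge. Parametrising an edge of length $\ell$ by $s \in [0, 1]$, with $\psi_1 = 1 - s$ and $\psi_2 = s$, the dual functions are obtained by inverting the $2 \times 2$ edge mass matrix. The sketch below, with an assumed edge length and an arbitrary linear test function, checks the duality relation and the fact that the functional $u \mapsto \int_e \eta_1 u$ returns the nodal value $u(x_1)$ for functions that are linear on the edge.

```python
import numpy as np

ell = 0.3                                              # assumed edge length
# Mass matrix of the two linear edge functions psi_1 = 1 - s, psi_2 = s.
M = ell * np.array([[1.0 / 3.0, 1.0 / 6.0],
                    [1.0 / 6.0, 1.0 / 3.0]])
C = np.linalg.inv(M)                                   # eta_k = C[k, 0] psi_1 + C[k, 1] psi_2

s, w = np.polynomial.legendre.leggauss(3)              # Gauss-Legendre rule on [-1, 1]
s, w = 0.5 * (s + 1.0), 0.5 * ell * w                  # mapped to the edge, arclength weights
psi = np.vstack([1.0 - s, s])                          # psi_l at the quadrature points
eta = C @ psi                                          # eta_k at the quadrature points

duality = (eta * w) @ psi.T                            # entries int_e eta_k psi_l
u_edge = 2.0 * psi[0] - 5.0 * psi[1]                   # linear function with u(x_1) = 2, u(x_2) = -5
functional = np.sum(w * eta[0] * u_edge)               # int_e eta_1 u, the coefficient used in (27)
print(duality)
print(functional)
```

Since all integrands are polynomials of degree two in $s$, the three-point Gauss rule integrates them exactly. Reproducing nodal values of edge-wise linear functions is the property that a Lagrange-type interpolation needs in order to preserve finite element functions.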
The following lemma is useful when we want to consider $I_\circ u$ on a subset of $\Omega$.
Lemma 8. Let $u$ be a function in $H^1(\Omega)$ and $\Omega_s$ be a subset of $\Omega$. Assume that $\Omega_s$ is also a union of elements in $\mathcal{T}_\circ$. Then the following statement holds:
\[
I_\circ u(x) = \sum_{j,\, x_j^\circ \in \bar\Omega_s} \psi_j^\circ(x) \int_{e_j^\circ} \eta_j^\circ(\xi)\, u(\xi)\, d\xi, \quad \text{for all } x \in \bar\Omega_s.
\]
Proof. The proof is obvious as the basis functions $\psi_j^\circ(x)$ associated with $x_j^\circ \notin \bar\Omega_s$ vanish in $\bar\Omega_s$.
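Let $\{x_j^{(i)}\}_{j=1}^{n_i}$ be the set of nodal points of the finite element mesh $\mathcal{T}_i$, $0 \le i \le N$. For each mesh $\mathcal{T}_i$, $0 \le i \le N$, we will choose a set of edges $\{e_j^{(i)}\}_{j=1}^{n_i}$ in $\mathcal{T}_i$ corresponding to $\{x_j^{(i)}\}_{j=1}^{n_i}$ that satisfies the following conditions:
(i) $e_j^{(i)}$ contains $x_j^{(i)}$
(ii) $e_j^{(i)} \subset \partial\Omega$, if $x_j^{(i)} \in \partial\Omega$
(iii) $e_j^{(i)} \subset \partial\Omega_i \setminus \partial\Omega$, if $x_j^{(i)} \in \partial\Omega_i \setminus \partial\Omega$, $i \neq 0$
(iv) $e_j^{(i)} \subset \partial\Omega_k$, if $x_j^{(i)} \notin \partial\Omega \cup \partial\Omega_i$ is shared by two or more subdomains in the partition $\{\Omega_l\}_{l=1}^{N}$. Here $\Omega_k$ is the subdomain with the smallest colour that contains $x_j^{(i)}$.
For each mesh $\mathcal{T}_i$, we fix a choice of edges $\{e_j^{(i)}\}_{j=1}^{n_i}$ satisfying the four conditions above. Then we let
\begin{align*}
I_i^{h,H} &= I_{\mathcal{T}_i}^{\{e_j^{(i)}\}} : H^1(\Omega) \to V_i, \quad 1 \le i \le N, \\
I^H &= I_{\mathcal{T}_0}^{\{e_j^{(0)}\}} : H^1(\Omega) \to V_0,
\end{align*}
be the modified Lagrange interpolation operators associated with $\mathcal{T}_i$ and $\{e_j^{(i)}\}_{j=1}^{n_i}$, and with $\mathcal{T}_0$ and $\{e_j^{(0)}\}_{j=1}^{n_0}$ respectively.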
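According to [31], there exists a constant $C_I$, depending only on the shape regularity of the associated meshes, such that
\begin{align*}
\|I_i^{h,H} u\|_{H^1(K)} &\le C_I\, |u|_{H^1(\omega_K)}, & K, \omega_K &\in \mathcal{T}_i, \tag{28} \\
\|u - I^H u\|_{L^2(K)} &\le C_I\, H\, |u|_{H^1(\omega_K)}, & K, \omega_K &\in \mathcal{T}_H, \tag{29} \\
\|I^H u\|_{H^1(K)} &\le C_I\, |u|_{H^1(\omega_K)}, & K, \omega_K &\in \mathcal{T}_H, \tag{30}
\end{align*}
where $\omega_K = \mathrm{interior}\big( \bigcup \{ \bar K_j \mid \bar K_j \cap \bar K \neq \emptyset, \ K_j \in \mathcal{T}_\circ \} \big)$.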
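Lemma 9. The interpolation operator $I_i^{h,H}$ preserves fine functions in the regions where the mesh $\mathcal{T}_i$ is fine. In other words,
\[
I_i^{h,H} u|_{\bar\Omega_i} = u|_{\bar\Omega_i}, \quad \text{for any function } u(x) \text{ satisfying } u(x)|_{\bar\Omega_i} \in V_h|_{\bar\Omega_i}.
\]
Proof. Let $x_j^{(i)}$ be a nodal point of $\mathcal{T}_i$, $x_j^{(i)} \in \bar\Omega_i$. Since $\mathcal{T}_i|_{\Omega_i} \equiv \mathcal{T}_h|_{\Omega_i}$, this nodal point is also present in $\mathcal{T}_h$. In addition, the two nodal basis functions associated with $x_j^{(i)}$ in $V_i$ and $V_h$ are identical on $\bar\Omega_i$, namely
\[
\psi_j^{(i)}|_{\Omega_i} = \psi_{j_i}|_{\Omega_i}, \tag{31}
\]
where $j_i$ is understood as the index of the same node in $\mathcal{T}_h$. On the other hand, by (iii) the chosen edge $e_j^{(i)} \in \mathcal{T}_i$ for the nodal point $x_j^{(i)}$ should also be an edge in $\mathcal{T}_h$ if $x_j^{(i)} \in \bar\Omega_i$. Therefore, by (26) we have
\[
\int_{e_j^{(i)}} \eta_j^{(i)}(\xi)\, u(\xi)\, d\xi = u(x_j^{(i)}), \quad \text{for all } x_j^{(i)} \in \bar\Omega_i. \tag{32}
\]
Using (27), Lemma 8, (32) and (31), we have, for $x \in \bar\Omega_i$,
\[
I_i^{h,H} u(x)
= \sum_{j=1}^{n_i} \psi_j^{(i)}(x) \int_{e_j^{(i)}} \eta_j^{(i)}(\xi)\, u(\xi)\, d\xi
= \sum_{j,\, x_j^{(i)} \in \bar\Omega_i} \psi_j^{(i)}(x) \int_{e_j^{(i)}} \eta_j^{(i)}(\xi)\, u(\xi)\, d\xi
= \sum_{j,\, x_j^{(i)} \in \bar\Omega_i} \psi_j^{(i)}(x)\, u(x_j^{(i)})
= u(x).
\]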
We are now in a position to estimate the smallest eigenvalue of the preconditioned system $P^{-1}A$. The idea is to construct local functions colour by colour. The proposed interpolations will ensure that residual functions vanish on all considered subdomains, and stay zero there in later induction steps. The following Lemma lays the foundation for our construction of local functions in a stable decomposition.
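Lemma 10. Assume $u(x) \in V_h$. Let $u^{(0)}(x) := u(x)$. Then our inductive construction of residual functions $u^{(k)}(x)$ is as follows
\[
w^{(k)} = I_H u^{(k-1)}, \quad (w^{(k)} \in V_H) \tag{33}
\]
\[
v^{(k)} = u^{(k-1)} - w^{(k)}, \quad (v^{(k)} \in V_h) \tag{34}
\]
\[
v^{(k)}_i = I^i_{h,H} \theta^{(c_k)}_i v^{(k)}, \quad (v^{(k)}_i \in V_i) \tag{35}
\]
\[
u^{(k)} = v^{(k)} - \sum_{i \in C_k} v^{(k)}_i = v^{(k)} - \sum_{i \in C_k} I^i_{h,H} \theta^{(c_k)}_i v^{(k)}, \quad (u^{(k)} \in V_h) \tag{36}
\]
where $k = 1, 2, \ldots, N_c$. Then the following equalities hold
\[
u^{(k)}|_{\bar{\Omega}_i} \equiv 0, \quad \text{for all } i \in C_{k_i},\ k_i \le k, \tag{37}
\]
\[
u = \sum_{k=1}^{N_c} w^{(k)} + \sum_{k=1}^{N_c} \sum_{i \in C_k} v^{(k)}_i, \tag{38}
\]
\[
\Big\| \sum_{i \in C_k} v^{(k)}_i \Big\|^2_{H^1(\Omega)} = \sum_{i \in C_k} \big\| v^{(k)}_i \big\|^2_{H^1(\Omega)}. \tag{39}
\]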
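Proof. Substituting $k = 1$ into (36) gives $u^{(1)} = v^{(1)} - \sum_{i \in C_1} I^i_{h,H} \theta^{(c_1)}_i v^{(1)}$. For $i, j \in C_1$, $i \neq j$, according to (22), $\theta^{(c_1)}_i = 1$ on $\bar{\Omega}_i$, and $\theta^{(c_1)}_j = 0$ on $\bar{\Omega}_i$. Therefore, $I^i_{h,H} \theta^{(c_1)}_i v^{(1)} = I^i_{h,H} v^{(1)} = v^{(1)}$ on $\bar{\Omega}_i$, $i \in C_1$, as a consequence of Lemma 9. In addition, $I^j_{h,H} \theta^{(c_1)}_j v^{(1)} \equiv I^j_{h,H} 0 = 0$ on $\bar{\Omega}_i$. Combining these together, we have
\[
u^{(1)}|_{\bar{\Omega}_i} \equiv 0, \quad \text{for all } i \in C_1. \tag{40}
\]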
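For any $x \in \bar{\Omega}_i$, $i \in C_1$, from (33) and Lemma 8, it follows that
\[
w^{(2)}(x) = I_H u^{(1)}(x) = \sum_{j,\, x^{(0)}_j \in \bar{\Omega}_i} \psi^{(0)}_j(x) \int_{e^{(0)}_j} \eta^{(0)}_j(\xi)\, u^{(1)}(\xi)\, d\xi. \tag{41}
\]
By condition (iv), $e^{(0)}_j \in \bar{\Omega}_i$ for all $x^{(0)}_j \in \bar{\Omega}_i$, $i \in C_1$. This together with (40) implies
\[
w^{(2)}|_{\bar{\Omega}_i} \equiv 0, \quad \text{for all } i \in C_1. \tag{42}
\]
Then from (34), (40) and (42), it follows that
\[
v^{(2)}|_{\bar{\Omega}_i} \equiv 0, \quad \text{for all } i \in C_1. \tag{43}
\]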
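Substituting $k = 2$ into (36), we obtain $u^{(2)} = v^{(2)} - \sum_{i \in C_2} I^i_{h,H} \theta^{(c_2)}_i v^{(2)}$. Similarly, we have
\[
u^{(2)}|_{\bar{\Omega}_i} \equiv 0, \quad \text{for all } i \in C_2. \tag{44}
\]
Now assume $l \in C_1$. For any $x \in \bar{\Omega}_l$, $i \in C_2$, according to Lemma 8,
\[
I^i_{h,H} \theta^{(c_2)}_i v^{(2)}(x) = \sum_{j,\, x^{(i)}_j \in \bar{\Omega}_l} \psi^{(i)}_j(x) \int_{e^{(i)}_j} \eta^{(i)}_j(\xi)\, (\theta^{(c_2)}_i v^{(2)})(\xi)\, d\xi. \tag{45}
\]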
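On the right hand side of (45), if $x^{(i)}_j \in \bar{\Omega}_l \setminus \partial\Omega_i$ then by condition (iv), $e^{(i)}_j \in \partial\Omega_l \subset \bar{\Omega}_l$. This together with (43) implies $\int_{e^{(i)}_j} \eta^{(i)}_j(\xi)\, (\theta^{(c_2)}_i v^{(2)})(\xi)\, d\xi = 0$. If $x^{(i)}_j \in \bar{\Omega}_l \cap \partial\Omega_i$ then by condition (iii), $e^{(i)}_j \in \partial\Omega_i$. From (26), (22), (43) and the fact that $x^{(i)}_j \in \bar{\Omega}_l$, we have $\int_{e^{(i)}_j} \eta^{(i)}_j(\xi)\, (\theta^{(c_2)}_i v^{(2)})(\xi)\, d\xi = \theta^{(c_2)}_i v^{(2)}(x^{(i)}_j) = v^{(2)}(x^{(i)}_j) = 0$. In summary, $I^i_{h,H} \theta^{(c_2)}_i v^{(2)} = 0$ on $\bar{\Omega}_l$, for all $l \in C_1$, $i \in C_2$. This together with (43) implies $u^{(2)}|_{\bar{\Omega}_l} \equiv 0$, for all $l \in C_1$. From (44), it follows that $u^{(2)}|_{\bar{\Omega}_i} \equiv 0$, for all $i \in C_1 \cup C_2$.
Continuing this process for $k = 3, \ldots, N_c$, we obtain (37).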
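Since $\{\bar{\Omega}_i\}_{i=1}^{N}$ covers $\Omega$, (37) implies $u^{(N_c)}|_{\Omega} \equiv 0$. Tracing backward, we have
\[
\begin{aligned}
0 = u^{(N_c)} &= u^{(N_c-1)} - w^{(N_c)} - \sum_{i \in C_{N_c}} v^{(N_c)}_i \\
&= u^{(N_c-2)} - w^{(N_c-1)} - w^{(N_c)} - \sum_{i \in C_{N_c-1}} v^{(N_c-1)}_i - \sum_{i \in C_{N_c}} v^{(N_c)}_i \\
&= \cdots = u^{(0)} - \sum_{k=1}^{N_c} w^{(k)} - \sum_{k=1}^{N_c} \sum_{i \in C_k} v^{(k)}_i.
\end{aligned}
\]
This implies (38) because $u^{(0)}(x) = u(x)$.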
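As a concrete check of the telescoping argument, consider the smallest non-trivial case of two colours, $N_c = 2$ (a value chosen here purely for illustration). Unrolling (34) and (36) twice and using $u^{(2)} \equiv 0$ gives the following sketch:

```latex
\begin{align*}
u^{(1)} &= u^{(0)} - w^{(1)} - \sum_{i \in C_1} v^{(1)}_i, \\
u^{(2)} &= u^{(1)} - w^{(2)} - \sum_{i \in C_2} v^{(2)}_i
         = u - w^{(1)} - w^{(2)}
           - \sum_{i \in C_1} v^{(1)}_i - \sum_{i \in C_2} v^{(2)}_i, \\
u^{(2)} \equiv 0 \;&\Longrightarrow\;
u = w^{(1)} + w^{(2)}
    + \sum_{i \in C_1} v^{(1)}_i + \sum_{i \in C_2} v^{(2)}_i .
\end{align*}
```

Each colour thus contributes one coarse component $w^{(k)} \in V_H$ and one local component $v^{(k)}_i \in V_i$ per subdomain of that colour.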
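Since $\theta^{(c_k)}_i$ has support on $\bar{\Omega}^{\dagger}_i$, the functions $\theta^{(c_k)}_i v^{(k)}$ and consequently $v^{(k)}_i = I^i_{h,H} \theta^{(c_k)}_i v^{(k)}$ also have support on $\bar{\Omega}^{\dagger}_i$. Therefore, the functions $v^{(k)}_i$, $i \in C_k$, have disjoint supports, and (39) follows immediately.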
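The step from disjoint supports to (39) can be spelled out as follows. This is a sketch under two assumptions not stated explicitly above: the inner product is the full one, $(f, g)_{H^1(\Omega)} = \int_\Omega (\nabla f \cdot \nabla g + f g)\, dx$, and the supports of $v^{(k)}_i$ and $v^{(k)}_j$ for distinct $i, j \in C_k$ intersect in at most a set of measure zero. Dropping the $fg$ term gives the same identity for the seminorm.

```latex
\begin{align*}
\Big\| \sum_{i \in C_k} v^{(k)}_i \Big\|^2_{H^1(\Omega)}
 &= \sum_{i \in C_k} \sum_{j \in C_k}
    \big( v^{(k)}_i, v^{(k)}_j \big)_{H^1(\Omega)} \\
 &= \sum_{i \in C_k} \big\| v^{(k)}_i \big\|^2_{H^1(\Omega)}
  + \sum_{\substack{i, j \in C_k \\ i \neq j}}
    \int_\Omega \big( \nabla v^{(k)}_i \cdot \nabla v^{(k)}_j
                    + v^{(k)}_i v^{(k)}_j \big)\, dx \\
 &= \sum_{i \in C_k} \big\| v^{(k)}_i \big\|^2_{H^1(\Omega)} .
\end{align*}
```

The cross terms vanish because, for $i \neq j$, at almost every point of $\Omega$ at least one of the two functions is zero together with its gradient.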
Now we are ready to state the main result of this subsection.
Theorem 11. For any $u(x) \in V_h$ there exists a decomposition
\[
u = \sum_{i=1}^{N} u_i, \quad u_i(x) \in V_i, \quad 1 \le i \le N,
\]
that satisfies
\[
\sum_{i=1}^{N} a(u_i, u_i) \le C_m\, a(u, u),
\]
where $C_m$ is a constant that is independent of $H$, $h$ and $N$, but depends on $N_c$. In addition, the smallest eigenvalue of the preconditioned system $P^{-1}A$ can be bounded from below as follows
\[
\hat{\lambda}_{\min} \ge C_m^{-1}.
\]
Proof. In this proof, for simplicity, we use $x \lesssim y$ to denote $x \le C y$, where the constant $C$ might depend on the interpolation constant, the constant in bounding the gradients of cut-off functions and the number of colours in the colouring ($C_I$, $C_\theta$ and $N_c$ respectively) but does not depend on the mesh sizes ($h$, $H$) and the number of subdomains in the partition ($N$).
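Based on (38) in Lemma 10, we define
\[
u = \sum_{i=1}^{N} u_i, \quad \text{where} \quad
u_i =
\begin{cases}
w^{(k_i)} + v^{(k_i)}_i, & \text{if } i = \min(C_{k_i}), \\
v^{(k_i)}_i, & \text{otherwise}.
\end{cases} \tag{46}
\]
Here $k_i$ denotes the colour index of subdomain $i$, that is, $i \in C_{k_i}$.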
We will show that this is a stable decomposition.
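To illustrate (46), consider a hypothetical partition into $N = 4$ subdomains coloured with $N_c = 2$ colours, $C_1 = \{1, 3\}$ and $C_2 = \{2, 4\}$ (these index sets are assumptions made for the example only). The coarse component $w^{(k)}$ is attached to the subdomain of smallest index in each colour class:

```latex
\begin{align*}
u_1 &= w^{(1)} + v^{(1)}_1, & u_3 &= v^{(1)}_3, \\
u_2 &= w^{(2)} + v^{(2)}_2, & u_4 &= v^{(2)}_4, \\
\sum_{i=1}^{4} u_i
  &= w^{(1)} + w^{(2)}
   + \sum_{i \in C_1} v^{(1)}_i + \sum_{i \in C_2} v^{(2)}_i = u .
\end{align*}
```

The last equality is the decomposition established in Lemma 10. Each $u_i$ lies in $V_i$ provided the coarse space satisfies $V_H \subset V_i$, which is what allows $w^{(k)}$ to be absorbed into a single local component.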
First, from the definition of $w^{(k)}$ in (33) and the stability properties of $I_H$ in (30), it follows that $|w^{(k)}|_{H^1(K)} \le C_I |u^{(k-1)}|_{H^1(\omega_K)}$, for $K$ and $\omega_K \in \mathcal{T}_0$. Squaring and summing over all $K \in \mathcal{T}_0$, we have
$| w^{(k)} |^2_{H^1(\Omega)} \lesssim | u^{(k-1)} |^2_{H^1(\Omega)}$