HAL Id: inria-00392881
https://hal.inria.fr/inria-00392881
Submitted on 9 Jun 2009
Combinative preconditioning based on Relaxed Nested Factorization and Tangential Filtering preconditioner
Pawan Kumar, Laura Grigori, Frédéric Nataf, Qiang Niu
To cite this version:
Pawan Kumar, Laura Grigori, Frédéric Nataf, Qiang Niu. Combinative preconditioning based on
Relaxed Nested Factorization and Tangential Filtering preconditioner. [Research Report] RR-6955,
INRIA. 2009. ⟨inria-00392881⟩
Thème NUM
Combinative preconditioning based on Relaxed
Nested Factorization and Tangential Filtering
preconditioner
Pawan Kumar — Laura Grigori — Frederic Nataf — Qiang Niu
N° 6955
Juin 2009
Centre de recherche INRIA Saclay – Île-de-France
Pawan Kumar∗, Laura Grigori†, Frederic Nataf‡, Qiang Niu§
Theme NUM: Numerical Systems. Project-Team Grand-large
Research Report No. 6955, June 2009, 24 pages
Abstract: The problem of solving block tridiagonal linear systems arising from the discretization of PDEs is considered. The nested factorization preconditioner introduced by [J. R. Appleyard and I. M. Cheshire, Nested Factorization, SPE 12264, presented at the Seventh SPE Symposium on Reservoir Simulation, San Francisco, 1983] is an effective preconditioner for a certain class of problems, and a similar method is implemented in Schlumberger's Eclipse oil reservoir simulator. In this paper, a relaxed version of the Nested Factorization preconditioner is proposed as a replacement for ILU(0). Indeed, the proposed preconditioner is SPD and leads to a stable splitting if the input matrix is S.P.D. For ILU(0), equivalent properties hold if the input matrix is an M-matrix. Moreover, it has no storage cost. Effective multiplicative/additive preconditioning is achieved in combination with the Tangential filtering preconditioner, with the filter vector chosen as the vector of ones. Numerical tests are carried out with both additive and multiplicative combinations. With this setup the new preconditioner is as robust as the combination of ILU(0) with the tangential filtering preconditioner.

Key-words: preconditioner, linear system, frequency filtering decomposition, GMRES, nested factorization, incomplete LU, eigenvalues
∗ INRIA Saclay - Ile de France, Laboratoire de Recherche en Informatique, Universite Paris-Sud 11, France (Email: pawan.kumar@lri.fr)
† INRIA Saclay - Ile de France, Laboratoire de Recherche en Informatique, Universite Paris-Sud 11, France (Email: laura.grigori@inria.fr)
‡ Laboratoire J. L. Lions, CNRS UMR 7598, Universite Paris 6, France; Email: nataf@ann.jussieu.fr
§ School of Mathematical Sciences, Xiamen University, Xiamen, 361005, P.R. China; the work of this author was performed during his visit to INRIA, funded by China Scholarship Council; Email: kangniu@gmail.com
Résumé: In this paper we present a version of nested factorization with relaxation. This factorization is proposed as a replacement for ILU(0). The proposed preconditioner is SPD if the original matrix is SPD. For ILU(0), equivalent properties hold if the original matrix is an M-matrix. An effective preconditioner is obtained in combination with tangential filtering, with the filter vector chosen as the vector of ones. Numerical tests are presented for additive and multiplicative combinations. They show that the preconditioner is as robust as the combination of ILU(0) with tangential filtering.

Mots-clés: preconditioning, nested factorization, incomplete LU factorization
1 INTRODUCTION
Several applications are modelled by non-linear partial differential equations; examples include oil reservoir simulations and fluid flow through porous media. These equations are usually solved by Newton's method or a fixed point method, where at each step a preconditioned iterative method is generally used for solving a large sparse linear system

\[ Ax = b. \tag{1} \]

The linear systems change at each step, and solving them constitutes the most computationally intensive part of the simulation. Therefore, an efficient preconditioner with fast setup time plays an important role in the overall simulation process. Also, since the systems are very large, the preconditioner should not be too demanding in terms of memory requirements. This problem has already been extensively studied; see [7, 16, 18] for the description of algorithms and their comparisons. In particular, one class of methods which have proved particularly successful are the algebraic multigrid methods; see, for example, [17, 19, 20, 24]. However, for certain problems the algebraic multigrid methods involve relatively high setup cost, which leads us to look into some other alternatives in this work.
With fast setup time and a very modest storage requirement, the Nested Factorization (NF) preconditioner introduced in [1] is a powerful preconditioner; it has been found [1, 15] that for a certain class of problems it performs better than the widely used ILU(0) or Modified ILU [18]. NF takes as input a matrix which has a nested block tridiagonal structure. Matrices arising from the discretization of PDEs can be permuted to this form. The method of NF differs from ILU(0) or Modified ILU(0) in that the preconditioning matrix in NF is not formed strictly from upper and lower triangular factors. Instead, block lower and upper factors are constructed using a procedure which adds one dimension at a time to the preconditioning matrix, having the diagonal matrix on the lowest level.
The NF preconditioner has some important properties. If $B_{NF}$ is the NF preconditioner, then $\mathrm{colsum}(B_{NF} - A) = 0$ (also known as the zero colsum property); as a consequence, the sum of the residuals in successive Krylov iterations remains zero, provided a suitable initial solution is used [1]. This property can provide a very useful check on the correctness of the implementation. Further, quoting [1], "the factorization procedure conserves material exactly for each phase at each linear iteration, and accommodates non-neighbour connections (arising from the treatment of the faults, completing the circle in three-dimensional coning studies, numerical aquifers, dual porosity/permeability systems etc.) in a natural way". Moreover, for fluid flow problems it is proved in [2] that the lower and the upper triangular factors of NF are nonsingular. Due to these desirable qualities, the NF preconditioner is of particular interest in the oil reservoir industry; a method similar to NF is implemented in Schlumberger's Eclipse oil reservoir simulator [12]. However, as we will see later in the tables, the NF preconditioner can fail on some problems.

The main result of this paper is the introduction of a new preconditioner which corresponds to a relaxed version of NF, namely RNF(0,0). RNF(0,0) has the following properties:

- If $A$ is S.P.D., then RNF(0,0) is S.P.D.
- RNF(0,0) leads to a stable splitting
- no setup cost
- no storage requirement

The new preconditioner is particularly useful when multiplicative/additive preconditioning is achieved in combination with the tangential filtering preconditioner [4]. With this setup the new preconditioner is as robust as the combination of ILU(0) with the tangential filtering preconditioner. Notably, both multiplicative and additive combinations are tried. Good results are obtained with the additive combination, which moreover has an advantage over the multiplicative combination on parallel systems: the preconditioner solves involved in the additive case can be done simultaneously on two different processors. In order to make a comprehensive study, we consider a general relaxed version of NF, namely RNF($\alpha$,$\beta$).

This article is organized as follows. Notations play an important role in explaining the method of NF; we carefully introduce them in the next section. In section 3, NF is explained and later in that section RNF is introduced. The method of LFTFD is discussed briefly in section 4. In section 5, we present convergence results for the combination involving RNF(0,0) and LFTFD, which show that the combination is convergent. In section 6 we comment on numerical results. Finally, section 7 concludes the paper.
2 NOTATIONS
Throughout the paper, the matrix $A$ is considered to arise from the finite difference or finite volume discretization of two- or three-dimensional partial differential equations. For example, using the 7-point formula for three-dimensional problems on an $nx \times ny \times nz$ grid, where $nx$ denotes the number of points on a line, $ny$ denotes the number of lines on each plane, and $nz$ denotes the number of planes, we obtain an $(nx \times ny \times nz) \times (nx \times ny \times nz)$ matrix. Let $nxy$ denote $nx \times ny$, the number of unknowns on each plane, $nyz$ denote $ny \times nz$, the total number of lines on all planes, and $nxyz$ denote $nx \times ny \times nz$, the total number of unknowns. Then the resulting matrix has a nested tridiagonal structure as follows:

\[
A = \begin{pmatrix}
\hat D_1 & \hat U_3^{1} & & \\
\hat L_3^{1} & \hat D_2 & \ddots & \\
 & \ddots & \ddots & \hat U_3^{nz-1} \\
 & & \hat L_3^{nz-1} & \hat D_{nz}
\end{pmatrix}. \tag{2}
\]

Here the diagonal blocks $\hat D_i$ are the blocks corresponding to the unknowns of the $i$th plane. The blocks $\hat L_3^{i}$ and $\hat U_3^{i}$ are diagonal matrices of size $nxy$, and they correspond to the connections between the $i$th and $(i+1)$th plane. Further, the diagonal blocks $\hat D_i$ are themselves block tridiagonal, i.e. they have the following structure,

\[
\hat D_i = \begin{pmatrix}
D_{(i-1)ny+1} & U_2^{(i-1)ny+1} & & \\
L_2^{(i-1)ny+1} & D_{(i-1)ny+2} & \ddots & \\
 & \ddots & \ddots & U_2^{i\,ny-1} \\
 & & L_2^{i\,ny-1} & D_{i\,ny}
\end{pmatrix},
\]

where the blocks $L_2^{i}$ and $U_2^{i}$ are diagonal matrices of size $nx$ each and they correspond to the connections between the lines. The diagonal blocks $D_i$ are themselves tridiagonal matrices,

\[
D_i = \begin{pmatrix}
\tilde D_{(i-1)nx+1} & \tilde U_1^{(i-1)nx+1} & & \\
\tilde L_1^{(i-1)nx+1} & \tilde D_{(i-1)nx+2} & \ddots & \\
 & \ddots & \ddots & \tilde U_1^{i\,nx-1} \\
 & & \tilde L_1^{i\,nx-1} & \tilde D_{i\,nx}
\end{pmatrix},
\]
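with the scalars $\tilde L_1^{(i-1)nx+j}$ and $\tilde U_1^{(i-1)nx+j}$ being the corresponding connections between the cells. We observe that this matrix has a nested tridiagonal structure. For any matrix $K$, $\mathrm{Diag}(K)$ refers to the strict diagonal of $K$; for any vector $v$, $\mathrm{Diag}(v)$ refers to the diagonal matrix formed from the vector $v$. For any diagonal matrix $M$ of size $nxyz$, the $i$th entry of $M$ is denoted by $\tilde M_i$, the $i$th line block of $M$ is denoted by $M_i$, and the $i$th plane block of $M$ is denoted by $\hat M_i$. Below we list the structures of the matrices $L_1$, $U_1$, $L_2$, $U_2$, $L_3$, and $U_3$ to be referred to later.

\[
\mathrm{Diag}(A) = \begin{pmatrix} \tilde D_1 & & & \\ & \tilde D_2 & & \\ & & \ddots & \\ & & & \tilde D_{nxyz} \end{pmatrix},
\qquad
M = \begin{pmatrix} \tilde M_1 & & & \\ & \tilde M_2 & & \\ & & \ddots & \\ & & & \tilde M_{nxyz} \end{pmatrix},
\]

\[
L_1 = \begin{pmatrix} 0 & & & \\ \tilde L_1^{1} & 0 & & \\ & \ddots & \ddots & \\ & & \tilde L_1^{nxyz-1} & 0 \end{pmatrix},
\qquad
U_1 = \begin{pmatrix} 0 & \tilde U_1^{1} & & \\ & 0 & \ddots & \\ & & \ddots & \tilde U_1^{nxyz-1} \\ & & & 0 \end{pmatrix},
\]

\[
L_2 = \begin{pmatrix} 0 & & & \\ L_2^{1} & \ddots & & \\ & \ddots & \ddots & \\ & & L_2^{nyz-1} & 0 \end{pmatrix},
\qquad
U_2 = \begin{pmatrix} 0 & U_2^{1} & & \\ & \ddots & \ddots & \\ & & \ddots & U_2^{nyz-1} \\ & & & 0 \end{pmatrix},
\]

\[
L_3 = \begin{pmatrix} 0 & & & \\ \hat L_3^{1} & \ddots & & \\ & \ddots & \ddots & \\ & & \hat L_3^{nz-1} & 0 \end{pmatrix},
\qquad
U_3 = \begin{pmatrix} 0 & \hat U_3^{1} & & \\ & \ddots & \ddots & \\ & & \ddots & \hat U_3^{nz-1} \\ & & & 0 \end{pmatrix}.
\]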
3 RELAXED NESTED FACTORIZATION
In this section, we introduce RNF for the three-dimensional case; the two-dimensional case will be considered a special case of the three-dimensional case with just one plane. We first introduce a special version of the preconditioner for which we prove that it is S.P.D. when the input matrix $A$ is S.P.D. Later, we introduce a more general version which will prove significant for the two-dimensional problems.

3.1 A special case: RNF(0,0)

We define the RNF(0,0) preconditioner as follows:

\[
\begin{aligned}
B_{RNF(0,0)} &= (P + L_3)(I + P^{-1}U_3), \\
P &= (T + L_2)(I + T^{-1}U_2), \\
T &= (M + L_1)(I + M^{-1}U_1), \\
M &= \mathrm{diag}(A).
\end{aligned}
\]
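The definition above can be tried out on a small problem. The following is a minimal sketch, assuming a 5-point Laplacian on a single plane (so that $L_3 = U_3 = 0$ and $B = P$) and dense NumPy matrices; the helper names `laplacian_2d_splitting` and `rnf00_2d` are hypothetical and not taken from the report.

```python
import numpy as np

def laplacian_2d_splitting(nx, ny):
    """Return A and its nested pieces (D, L1, U1, L2, U2) for a 5-point Laplacian."""
    tx = 2 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1)
    ty = 2 * np.eye(ny) - np.eye(ny, k=1) - np.eye(ny, k=-1)
    A = np.kron(np.eye(ny), tx) + np.kron(ty, np.eye(nx))
    D = np.diag(np.diag(A))
    L1 = np.kron(np.eye(ny), np.tril(tx, -1))   # cell-to-cell couplings inside a line
    L2 = np.kron(np.tril(ty, -1), np.eye(nx))   # line-to-line couplings
    return A, D, L1, L1.T, L2, L2.T

def rnf00_2d(D, L1, U1, L2, U2):
    """Dense RNF(0,0) for one plane: M = diag(A), then T, then B = P."""
    n = D.shape[0]
    M = D
    T = (M + L1) @ (np.eye(n) + np.linalg.solve(M, U1))
    B = (T + L2) @ (np.eye(n) + np.linalg.solve(T, U2))
    return M, T, B

A, D, L1, U1, L2, U2 = laplacian_2d_splitting(4, 3)
M, T, B = rnf00_2d(D, L1, U1, L2, U2)

# Expanding the products gives B - A = L1 M^{-1} U1 + L2 T^{-1} U2 for RNF(0,0).
remainder = L1 @ np.linalg.solve(M, U1) + L2 @ np.linalg.solve(T, U2)
matches = np.allclose(B - A, remainder)
min_eig = np.linalg.eigvalsh((B + B.T) / 2).min()
print("B - A equals the expected remainder:", matches)
print("smallest eigenvalue of B:", min_eig)
```

Forming dense matrices is only meant for checking small cases; in practice the factors are never assembled and only $\mathrm{diag}(A)$ is needed, which is why RNF(0,0) has no setup cost.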
3.2 The general case: RNF($\alpha$,$\beta$)

The general version of the relaxed nested factorization preconditioner, denoted by $B_{RNF(\alpha,\beta)}$, is defined as follows:

\[
\begin{aligned}
B_{RNF(\alpha,\beta)} &= (P + L_3)(I + P^{-1}U_3), \\
P &= (T + L_2)(I + T^{-1}U_2), \\
T &= (M + L_1)(I + M^{-1}U_1), \\
M &= \mathrm{diag}(A) - \alpha L_1M^{-1}U_1 - \beta\,\mathrm{colsum}(L_2T^{-1}U_2) - \beta\,\mathrm{colsum}(L_3P^{-1}U_3),
\end{aligned}
\]

where $\mathrm{colsum}(K) = \mathrm{Diag}(\mathbf{1}K)$ for any square matrix $K$, and $\mathbf{1}$ denotes the vector $[1, 1, \dots, 1]$. Note that with $\alpha = 1$ and $\beta = 1$ we get the classical Nested Factorization preconditioner as described in [1].

For RNF(1,1) we observe that

\[
B_{RNF(1,1)} - A = L_2T^{-1}U_2 - \mathrm{colsum}(L_2T^{-1}U_2) + L_3P^{-1}U_3 - \mathrm{colsum}(L_3P^{-1}U_3),
\]

so that the preconditioner satisfies

\[
\mathbf{1}B_{RNF(1,1)} = \mathbf{1}A, \tag{3}
\]

i.e., a column sum constraint as in MILU [18].
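The remainder $B_{RNF(\alpha,\beta)} - A$ for general parameters follows from expanding the three nested products; the short derivation below is a sketch that uses only the splitting $A = \mathrm{Diag}(A) + L_1 + U_1 + L_2 + U_2 + L_3 + U_3$ from Section 2 and the definition of $M$.

```latex
\begin{aligned}
T &= M + L_1 + U_1 + L_1 M^{-1} U_1, \\
P &= T + L_2 + U_2 + L_2 T^{-1} U_2, \\
B_{RNF(\alpha,\beta)} &= P + L_3 + U_3 + L_3 P^{-1} U_3, \\[4pt]
B_{RNF(\alpha,\beta)} - A
  &= (1-\alpha)\, L_1 M^{-1} U_1
   + \bigl[ L_2 T^{-1} U_2 - \beta\,\mathrm{colsum}(L_2 T^{-1} U_2) \bigr]
   + \bigl[ L_3 P^{-1} U_3 - \beta\,\mathrm{colsum}(L_3 P^{-1} U_3) \bigr].
\end{aligned}
```

Since $\mathbf{1}\,\mathrm{colsum}(K) = \mathbf{1}\,\mathrm{Diag}(\mathbf{1}K) = \mathbf{1}K$, each bracket has zero column sums when $\beta = 1$, which gives (3) for $\alpha = \beta = 1$. For $\alpha = \beta = 0$ the remainder reduces to $L_1M^{-1}U_1 + L_2T^{-1}U_2 + L_3P^{-1}U_3$.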
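The matrix $M$ is computed before the iteration begins. Finding $M$ requires writing down the above expression for $M$. We notice that $M$ is a diagonal matrix, since $\mathrm{Diag}(A)$, $L_1M^{-1}U_1$, $\mathrm{colsum}(L_2T^{-1}U_2)$, and $\mathrm{colsum}(L_3P^{-1}U_3)$ are diagonal matrices. The matrices $T$ and $P$ are block diagonal matrices with square diagonal blocks of sizes $nx$ and $nxy$ respectively. We have

\[
L_1M^{-1}U_1 = \begin{pmatrix}
0 & & & \\
 & \tilde L_1^{1}\tilde M_1^{-1}\tilde U_1^{1} & & \\
 & & \ddots & \\
 & & & \tilde L_1^{nxyz-1}\tilde M_{nxyz-1}^{-1}\tilde U_1^{nxyz-1}
\end{pmatrix},
\qquad
T = \begin{pmatrix} T_1 & & & \\ & T_2 & & \\ & & \ddots & \\ & & & T_{nyz} \end{pmatrix},
\]

\[
\mathrm{colsum}(L_2T^{-1}U_2) = \begin{pmatrix}
0 & & & \\
 & \mathrm{colsum}(L_2^{1}T_1^{-1}U_2^{1}) & & \\
 & & \ddots & \\
 & & & \mathrm{colsum}(L_2^{nyz-1}T_{nyz-1}^{-1}U_2^{nyz-1})
\end{pmatrix},
\]

\[
P = \begin{pmatrix} P_1 & & \\ & \ddots & \\ & & P_{nz} \end{pmatrix},
\quad\text{and}\quad
\mathrm{colsum}(L_3P^{-1}U_3) = \begin{pmatrix}
0 & & & \\
 & \mathrm{colsum}(\hat L_3^{1}P_1^{-1}\hat U_3^{1}) & & \\
 & & \ddots & \\
 & & & \mathrm{colsum}(\hat L_3^{nz-1}P_{nz-1}^{-1}\hat U_3^{nz-1})
\end{pmatrix}.
\]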
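Because each block of $M$ depends only on blocks that come earlier in the ordering, the implicit definition of $M$ can be evaluated in one sweep. The sketch below illustrates this for a single plane (the 2D case), assuming dense NumPy line blocks and randomly generated diagonally dominant coefficients; the function names and the test data are illustrative assumptions, not part of the report.

```python
import numpy as np

def rnf_diagonal_2d(d, l1, u1, l2, u2, alpha, beta):
    """One sweep over ny lines of nx cells.
    d: (ny, nx) diagonal of A; l1, u1: (ny, nx-1) couplings inside each line;
    l2, u2: (ny-1, nx) couplings between consecutive lines.
    Returns the diagonal M (as an (ny, nx) array) and the dense line blocks T_j."""
    ny, nx = d.shape
    M = d.astype(float).copy()
    T_blocks = []
    for j in range(ny):
        if j > 0 and beta != 0:
            # colsum(L2 T^{-1} U2) = Diag(U2^T T^{-T} L2^T 1) with diagonal L2, U2 blocks
            x = np.linalg.solve(T_blocks[j - 1].T, l2[j - 1])
            M[j] -= beta * u2[j - 1] * x
        for k in range(1, nx):
            # contribution of the diagonal matrix L1 M^{-1} U1
            M[j, k] -= alpha * l1[j, k - 1] * u1[j, k - 1] / M[j, k - 1]
        lower = np.diag(M[j]) + np.diag(l1[j], -1)            # M_j + L1_j
        upper = np.eye(nx) + np.diag(u1[j] / M[j, :-1], 1)    # I + M_j^{-1} U1_j
        T_blocks.append(lower @ upper)
    return M, T_blocks

def assemble_2d(d, l1, u1, l2, u2, T_blocks):
    """Dense A and B = (T + L2)(I + T^{-1} U2), for checking only."""
    ny, nx = d.shape
    n = nx * ny
    A, T = np.zeros((n, n)), np.zeros((n, n))
    for j in range(ny):
        s = slice(j * nx, (j + 1) * nx)
        A[s, s] = np.diag(d[j]) + np.diag(l1[j], -1) + np.diag(u1[j], 1)
        T[s, s] = T_blocks[j]
    L2 = np.diag(l2.ravel(), -nx)
    U2 = np.diag(u2.ravel(), nx)
    A += L2 + U2
    B = (T + L2) @ (np.eye(n) + np.linalg.solve(T, U2))
    return A, B

rng = np.random.default_rng(0)
ny, nx = 4, 5
d = 5.0 + rng.random((ny, nx))                       # strictly dominant diagonal
l1, u1 = -rng.random((ny, nx - 1)), -rng.random((ny, nx - 1))
l2, u2 = -rng.random((ny - 1, nx)), -rng.random((ny - 1, nx))

M, T_blocks = rnf_diagonal_2d(d, l1, u1, l2, u2, alpha=1.0, beta=1.0)
A, B = assemble_2d(d, l1, u1, l2, u2, T_blocks)
colsum_gap = np.abs(B.sum(axis=0) - A.sum(axis=0)).max()
print("max column-sum mismatch for RNF(1,1):", colsum_gap)
```

With $\alpha = \beta = 1$ the printed mismatch should be at rounding level, which is the zero colsum property (3); with $\alpha = \beta = 0$ the sweep leaves $M = \mathrm{Diag}(A)$ untouched.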
3.3 Construction of the RNF($\alpha$,$\beta$) preconditioner

A detailed pseudocode for the construction of the RNF($\alpha$,$\beta$) preconditioner is provided in Algorithm (1); we briefly outline the construction in this section. The entries of $M$ are calculated in a single sweep through the grid. Reading the entries row by row from the expression for $M$, we can determine $M$.

The first row gives $\tilde M_1 = \tilde D_1$. For the second cell onwards up to the last cell of the first line there are contributions from the term $L_1M^{-1}U_1$, and we have

\[
\tilde M_i = \tilde D_i - \tilde L_1^{i-1}\tilde U_1^{i-1}/\tilde M_{i-1}, \quad i = 2, \dots, nx
\]

(for general $\alpha$ the correction term is multiplied by $\alpha$, as in Algorithm 1). When the update from the first line is finished, each element of $M$ for the second line also depends on the previous line through the term $\mathrm{colsum}(L_2T^{-1}U_2)$, and each element of $M$ for the second plane onwards depends on the previous plane through the term $\mathrm{colsum}(L_3P^{-1}U_3)$. The $(j+1)$th line block of the block diagonal matrix $\mathrm{colsum}(L_2T^{-1}U_2)$ can be computed as follows:

\[
\mathrm{colsum}(L_2^{j}T_j^{-1}U_2^{j}) = \mathrm{Diag}(\mathbf{1}L_2^{j}T_j^{-1}U_2^{j}) = \mathrm{Diag}((U_2^{j})^{T}T_j^{-T}(L_2^{j})^{T}\mathbf{1}).
\]

A similar calculation can be performed for the computation of the plane blocks of $\mathrm{colsum}(L_3P^{-1}U_3)$.

3.4 Solution procedure
In this section we will see how the nested expression for $B$ is used to solve the equation $B_{RNF(\alpha,\beta)}u = v$ required in a preconditioned Krylov iterative method. At the outermost level, we solve

\[
(P + L_3)(I + P^{-1}U_3)u = v
\]

doing the forward sweeps of the form $q_i = P_i^{-1}(v_i - \hat L_3^{i-1}q_{i-1})$, and the backward sweeps of the form $u_i = q_i - P_i^{-1}\hat U_3^{i}u_{i+1}$. These equations are solved for one plane at a time. The solution involves solving equations of the form $P_iw = x$ for each plane. To solve this equation we use the fact that each plane block of the preconditioner has an incomplete block factorization represented by

\[
P = (T + L_2)(I + T^{-1}U_2),
\]

and we solve $P_iw = x$ using the forward sweeps of the form $r_j = T_j^{-1}(x_j - L_2^{j-1}r_{j-1})$, and the backward sweeps of the form $w_j = r_j - T_j^{-1}U_2^{j}w_{j+1}$. We notice that we need to solve equations of the form $T_iy = z$ corresponding to each line block of the preconditioner, and for this we use the following factored form of $T$:

\[
T = (M + L_1)(I + M^{-1}U_1),
\]

and we solve $T_iy = z$ using the forward sweeps of the form $s_k = \tilde M_k^{-1}(z_k - \tilde L_1^{k-1}s_{k-1})$, and the backward sweeps of the form $y_k = s_k - \tilde M_k^{-1}\tilde U_1^{k}y_{k+1}$.
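The nested sweeps can be sketched in a few lines for a single plane, where the outermost level is $P = (T + L_2)(I + T^{-1}U_2)$. This is an illustrative sketch under stated assumptions: RNF(0,0) is used so that $M = \mathrm{Diag}(A)$, the coefficients are random and diagonally dominant, and the names `solve_line` and `solve_plane` are hypothetical. The dense matrix $P$ is formed only to check the result.

```python
import numpy as np

def solve_line(m, l1, u1, z):
    """Solve T y = z with T = (M + L1)(I + M^{-1} U1) for one line of cells."""
    nx = len(m)
    s = np.empty(nx)
    s[0] = z[0] / m[0]
    for k in range(1, nx):                   # forward sweep over cells
        s[k] = (z[k] - l1[k - 1] * s[k - 1]) / m[k]
    y = s.copy()
    for k in range(nx - 2, -1, -1):          # backward sweep over cells
        y[k] = s[k] - u1[k] * y[k + 1] / m[k]
    return y

def solve_plane(M, l1, u1, l2, u2, v):
    """Solve P w = v with P = (T + L2)(I + T^{-1} U2) for one plane of lines."""
    ny, nx = M.shape
    q = np.empty((ny, nx))
    q[0] = solve_line(M[0], l1[0], u1[0], v[0])
    for j in range(1, ny):                   # forward sweep over lines
        q[j] = solve_line(M[j], l1[j], u1[j], v[j] - l2[j - 1] * q[j - 1])
    w = q.copy()
    for j in range(ny - 2, -1, -1):          # backward sweep over lines
        w[j] = q[j] - solve_line(M[j], l1[j], u1[j], u2[j] * w[j + 1])
    return w

rng = np.random.default_rng(1)
ny, nx = 3, 4
n = nx * ny
d = 5.0 + rng.random((ny, nx))
l1, u1 = -rng.random((ny, nx - 1)), -rng.random((ny, nx - 1))
l2, u2 = -rng.random((ny - 1, nx)), -rng.random((ny - 1, nx))
M = d                                        # RNF(0,0): M = Diag(A)

# Dense reference: couplings inside a line vanish at line boundaries.
pad = np.zeros((ny, 1))
L1 = np.diag(np.hstack([l1, pad]).ravel()[:-1], -1)
U1 = np.diag(np.hstack([u1, pad]).ravel()[:-1], 1)
L2 = np.diag(l2.ravel(), -nx)
U2 = np.diag(u2.ravel(), nx)
Md = np.diag(M.ravel())
T = (Md + L1) @ (np.eye(n) + np.linalg.solve(Md, U1))
P = (T + L2) @ (np.eye(n) + np.linalg.solve(T, U2))

v = rng.random((ny, nx))
w = solve_plane(M, l1, u1, l2, u2, v)
residual = np.abs(P @ w.ravel() - v.ravel()).max()
print("max residual of the nested solve:", residual)
```

The sweeps touch only the diagonal $M$ and the off-diagonal entries of $A$, which is why applying the preconditioner needs no storage beyond $M$ itself.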
Algorithm 1: Pseudocode to find $M$ for RNF($\alpha$,$\beta$)

INPUT: $\mathrm{Diag}(A)$, $L_1$, $L_2$, $L_3$, $U_1$, $U_2$, $U_3$
OUTPUT: $M$

Initialize $M = \mathrm{Diag}(A)$
for $i$ = 1 to $nz$ (number of planes) do
    if $i \neq 1$ and $\beta \neq 0$ (not the first plane) then
        Solve $P_{i-1}^{T}\hat x = (\hat L_3^{i-1})^{T}\mathbf{1}$
        $\hat M_i = \hat M_i - \beta\,\mathrm{Diag}((\hat U_3^{i-1})^{T}\hat x)$
    end if
    UPDATE-FROM-LINES($\hat M_i$)
end for

UPDATE-FROM-LINES($\hat M_i$)
for $j$ = 1 to $ny$ (number of lines in the current plane) do
    if $j \neq 1$ and $\beta \neq 0$ (not the first line) then
        Solve $T_{j-1}^{T}x = (L_2^{j-1})^{T}\mathbf{1}$
        $M_{(i-1)ny+j} = M_{(i-1)ny+j} - \beta\,\mathrm{Diag}((U_2^{j-1})^{T}x)$
    end if
    UPDATE-FROM-CELLS($M_{(i-1)ny+j}$)
end for

UPDATE-FROM-CELLS($M_i$)
$\tilde M_{(i-1)nx+1}$ is left unchanged (first cell of the current line: no contribution from $L_1M^{-1}U_1$)
for $k$ = 2 to $nx$ (update from the rest of the cells) do
    $\tilde M_{(i-1)nx+k} = \tilde M_{(i-1)nx+k} - \alpha\,\tilde L_1^{(i-1)nx+k-1}\,\tilde M_{(i-1)nx+k-1}^{-1}\,\tilde U_1^{(i-1)nx+k-1}$
end for

4 LOW FREQUENCY TANGENTIAL FILTERING DECOMPOSITION
In [4], a Low Frequency Tangential Filtering Decomposition (LFTFD) preconditioner was developed. If $t$ is any vector and $B_{LF}$ is the LFTFD preconditioner, then the preconditioner satisfies the right filtering property

\[
At = B_{LF}t. \tag{4}
\]

We briefly describe the decomposition procedure. The preconditioner $B_{LF}$ is defined as follows:

\[
B_{LF} = (Q + L_3)(I + Q^{-1}U_3),
\]

where $Q$ is a block diagonal matrix with diagonal blocks $Q_i$ of size $nxy$ each; the blocks $Q_i$ are defined as follows:

\[
Q_i = \begin{cases}
\hat D_1, & i = 1, \\
\hat D_i - \hat L_3^{i-1}(2\beta_{i-1} - \beta_{i-1}Q_{i-1}\beta_{i-1})\hat U_3^{i-1}, & i = 2 \text{ to } nz,
\end{cases}
\]

where $\beta_i$ is given by

\[
\beta_i = \mathrm{Diag}\bigl((Q_i^{-1}\hat U_3^{i}t_{i+1})\,./\,(\hat U_3^{i}t_{i+1})\bigr). \tag{5}
\]

Here $./$ is a pointwise vector division, $\mathrm{Diag}(v)$ is the diagonal matrix constructed from the vector $v$, and $t = [t_1, \dots, t_{nz}]^{T}$ is the filter vector. Division by zero being undefined, it is assumed that no component of $t$ is zero.

This preconditioner seems to be robust enough when combined with ILU(0) using a multiplicative preconditioner similar to (6), for the wide class of problems on which it was tested. For more on this we refer the reader to [4]. For other preconditioners satisfying the filtering property (4), the reader is referred to [9, 10, 21, 22, 23].
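The recursion for $Q_i$ and the filtering property (4) can be checked on a small block tridiagonal matrix. The sketch below assumes a 5-point Laplacian whose blocks are grid lines rather than planes, the filter vector of all ones, and dense NumPy blocks; `lftfd_blocks` is a hypothetical helper name, not an API from [4].

```python
import numpy as np

def lftfd_blocks(D, L, U, t):
    """Q_1 = D_1;  Q_i = D_i - L_{i-1} (2 b - b Q_{i-1} b) U_{i-1},
    with b = Diag((Q_{i-1}^{-1} U_{i-1} t_i) ./ (U_{i-1} t_i))."""
    Q = [D[0]]
    for i in range(1, len(D)):
        Ut = U[i - 1] @ t[i]
        b = np.diag(np.linalg.solve(Q[i - 1], Ut) / Ut)   # pointwise division
        Q.append(D[i] - L[i - 1] @ (2 * b - b @ Q[i - 1] @ b) @ U[i - 1])
    return Q

nx, nb = 4, 3                                  # block size and number of blocks
n = nx * nb
Dblk = 4 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1)
D = [Dblk] * nb
L = [-np.eye(nx)] * (nb - 1)
U = [-np.eye(nx)] * (nb - 1)
t = [np.ones(nx)] * nb                         # filter vector of all ones

Q = lftfd_blocks(D, L, U, t)

# Assemble dense A and B_LF = (Q + L)(I + Q^{-1} U) to verify A t = B_LF t.
A, Qd = np.zeros((n, n)), np.zeros((n, n))
Ld, Ud = np.zeros((n, n)), np.zeros((n, n))
for i in range(nb):
    s = slice(i * nx, (i + 1) * nx)
    A[s, s], Qd[s, s] = D[i], Q[i]
    if i > 0:
        p = slice((i - 1) * nx, i * nx)
        Ld[s, p], Ud[p, s] = L[i - 1], U[i - 1]
A += Ld + Ud
B_lf = (Qd + Ld) @ (np.eye(n) + np.linalg.solve(Qd, Ud))

t_full = np.concatenate(t)
filter_gap = np.abs(A @ t_full - B_lf @ t_full).max()
print("max |A t - B_LF t|:", filter_gap)
```

The check works because $\beta_{i-1}\hat U t_i = Q_{i-1}^{-1}\hat U t_i$ by construction, so $(2\beta_{i-1} - \beta_{i-1}Q_{i-1}\beta_{i-1})\hat U t_i = Q_{i-1}^{-1}\hat U t_i$ and the correction in $Q_i$ cancels the term $L Q^{-1} U t$ in $B_{LF}t$.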
5 ANALYSIS OF THE MULTIPLICATIVE PRECONDITIONERS INVOLVING RNF(0,0), RNF(1,0), AND LFTFD
For any matrix $K$, let $K \succ 0$ denote that the matrix $K$ is symmetric positive definite; let $K > 0$ denote that each entry of $K$ is positive, and let $\lambda_i(K)$ denote an arbitrary eigenvalue of $K$.

For convenience, we denote the preconditioners RNF(0,0) by $B_{RNF_0}$, RNF(1,0) by $B_{RNF_1}$, and LFTFD by $B_{LF}$. When the results apply to both RNF(0,0) and RNF(1,0), we denote the preconditioner as $B_{RNF}$; further, let $B_c$ denote

\[
B_c^{-1} = B_{RNF}^{-1} + B_{LF}^{-1} - B_{LF}^{-1}AB_{RNF}^{-1}. \tag{6}
\]

The following lemma shows that the multiplicative preconditioner obtained using formula (6) satisfies the right filtering property (4).

Lemma 5.1 If $B_{LF}$ satisfies the right filtering property (4) on vector $t$, then the preconditioner $B_{c_1}$ obtained by combining $B_{LF}$ and $B_{RNF(\alpha,\beta)}$ using formula (6) satisfies the right filtering property (or constraint) on the same vector, i.e. $B_{c_1}t = At$.

Proof: From Eqn. (6) we have

\[
I - B_{c_1}^{-1}A = (I - B_{RNF(\alpha,\beta)}^{-1}A)(I - B_{LF}^{-1}A).
\]

From this identity we have

\[
(I - B_{c_1}^{-1}A)t = 0 \quad (\text{since } (I - B_{LF}^{-1}A)t = 0).
\]

Hence the lemma.
In Lemma 5.2, we prove that the fixed point iteration involving the RNF(0,0) preconditioner is convergent. Lemma 5.3 can be found in [4]. In our last result, Theorem 5.6, we prove that the fixed point iteration involving the $B_{c_1}$ preconditioner is convergent.

Lemma 5.2 If $A \succ 0$, then $B_{RNF_0} \succ 0$, $B_{RNF_1} \succ 0$, $\lambda_i(B_{RNF_0}^{-1}A) \in (0, 1]$, and $\lambda_i(B_{RNF_1}^{-1}A) \in (0, 1]$.

Proof: The RNF(0,0) preconditioner is defined as follows:

\[
\begin{aligned}
B_{RNF_0} &= (P_{RNF_0} + L_3)P_{RNF_0}^{-1}(P_{RNF_0} + L_3^{T}), \\
P_{RNF_0} &= (T_{RNF_0} + L_2)T_{RNF_0}^{-1}(T_{RNF_0} + L_2^{T}), \\
T_{RNF_0} &= (M_{RNF_0} + L_1)M_{RNF_0}^{-1}(M_{RNF_0} + L_1^{T}), \\
M_{RNF_0} &= \mathrm{Diag}(A).
\end{aligned}
\]

It is easy to see that since $A \succ 0$, $M_{RNF_0} = \mathrm{Diag}(A) > 0$, so $T_{RNF_0} \succ 0$, $P_{RNF_0} \succ 0$, and hence $B_{RNF_0} \succ 0$. Since $B_{RNF_0}^{-1}A$ is similar to the symmetric matrix $B_{RNF_0}^{-1/2}AB_{RNF_0}^{-1/2}$, all the eigenvalues of $B_{RNF_0}^{-1}A$ are real. Also, since $B_{RNF_0}$, $T_{RNF_0}$, and $P_{RNF_0}$ are positive definite and

\[
B_{RNF_0} = A + L_1M_{RNF_0}^{-1}L_1^{T} + L_2T_{RNF_0}^{-1}L_2^{T} + L_3P_{RNF_0}^{-1}L_3^{T},
\]

we have

\[
\begin{aligned}
(B_{RNF_0}x, x) &= (Ax, x) + (M_{RNF_0}^{-1}L_1^{T}x, L_1^{T}x) + (T_{RNF_0}^{-1}L_2^{T}x, L_2^{T}x) + (P_{RNF_0}^{-1}L_3^{T}x, L_3^{T}x) \\
&\geq (Ax, x) \\
&> 0, \quad \forall x \neq 0,
\end{aligned}
\]

and from this we have $\lambda_i(B_{RNF_0}^{-1}A) \in (0, 1]$. The $B_{RNF_1}$ preconditioner has a similar hierarchical representation as $B_{RNF_0}$; it is defined as follows:

\[
\begin{aligned}
B_{RNF_1} &= (P_{RNF_1} + L_3)P_{RNF_1}^{-1}(P_{RNF_1} + L_3^{T}), \\
P_{RNF_1} &= (T_{RNF_1} + L_2)T_{RNF_1}^{-1}(T_{RNF_1} + L_2^{T}).
\end{aligned}
\]

Since $A \succ 0$, we have $T_{RNF_1} = \mathrm{Diag}(A) + L_1 + L_1^{T} \succ 0$, $P_{RNF_1} \succ 0$, and hence $B_{RNF_1} \succ 0$. Also

\[
B_{RNF_1} = A + L_2T_{RNF_1}^{-1}L_2^{T} + L_3P_{RNF_1}^{-1}L_3^{T},
\qquad
(B_{RNF_1}x, x) \geq (Ax, x) > 0, \quad \forall x \neq 0,
\]

and from this we have $\lambda_i(B_{RNF_1}^{-1}A) \in (0, 1]$. Hence the proof.

Lemma 5.3 Let $A \succ 0$; then $B_{LF} \succ 0$, $B_{LF} - A = N \succ 0$, and $\lambda_i(B_{LF}^{-1}A) \in (0, 1]$.

Proof: For the proof of the fact that $B_{LF} \succ 0$ and $B_{LF} - A = N \succ 0$, see Lemma 2.2 in [4]. Now, by a similar argument as in the previous lemma, we have

\[
(B_{LF}x, x) = (Ax, x) + (Nx, x) \geq (Ax, x) > 0, \quad \forall x \neq 0,
\]

and from this it follows that $\lambda_i(B_{LF}^{-1}A) \in (0, 1]$.

If $A \succ 0$, then the inner product $(\cdot,\cdot)_A$ defined by $(u, v)_A = u^{T}Av$ is a well defined inner product, and it induces the energy norm $\|\cdot\|_A$ defined by $\|v\|_A = (v, v)_A^{1/2}$ for any vector $v$. A matrix $K$ is called $A$-selfadjoint if

\[
(Ku, v)_A = (u, Kv)_A,
\]

or equivalently if

\[
A^{-1}K^{T}A = K. \tag{7}
\]

Lemma 5.4 If $A$ is symmetric, then the matrices $I - B_{RNF}^{-1}A$, $I - B_{LF}^{-1}A$, and $I - B_c^{-1}A$ are $A$-selfadjoint.

Proof: Using the criterion for self-adjointness defined by (7), it can be easily proved that $I - B_{RNF}^{-1}A$ and $I - B_{LF}^{-1}A$ are $A$-selfadjoint. To prove that $I - B_c^{-1}A$ is self-adjoint, we have

\[
\begin{aligned}
A^{-1}(I - B_c^{-1}A)^{T}A &= A^{-1}(I - B_{LF}^{-1}A)^{T}(I - B_{RNF}^{-1}A)^{T}A \\
&= A^{-1}(I - AB_{RNF}^{-1} - AB_{LF}^{-1} + AB_{LF}^{-1}AB_{RNF}^{-1})A \\
&= I - B_c^{-1}A.
\end{aligned}
\]

Hence the lemma.

For any matrix $K$, let $\rho(K) = \max_i(|\lambda_i(K)|)$ denote the spectral radius of $K$.

Lemma 5.5 [Ashby, Holst, Manteuffel, and Saylor] [5] If $A \succ 0$ and $K$ is $A$-selfadjoint, then $\|K\|_A = \rho(K)$.

Theorem 5.6 If $A \succ 0$, then the fixed point iteration with the corresponding error propagation matrix $(I - B_c^{-1}A)$ is convergent, i.e., $\rho(I - B_c^{-1}A) < 1$.

Proof: Using Lemma (5.4) and Lemma (5.5), we have

\[
\begin{aligned}
\rho(I - B_c^{-1}A) &= \|I - B_c^{-1}A\|_A \\
&= \|(I - B_{RNF}^{-1}A)(I - B_{LF}^{-1}A)\|_A \\
&\leq \|I - B_{RNF}^{-1}A\|_A\,\|I - B_{LF}^{-1}A\|_A \\
&= \rho(I - B_{RNF}^{-1}A)\,\rho(I - B_{LF}^{-1}A) \\
&< 1 \quad (\text{using Lemma 5.2 and Lemma 5.3}).
\end{aligned}
\]

Hence the proof.
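The statement of Theorem 5.6 can be observed numerically on a small S.P.D. example. The sketch below is only an illustration under stated assumptions: a 5-point Laplacian on one plane, dense NumPy matrices, RNF(0,0) built from the definitions of Section 3, LFTFD built with grid lines as blocks and the filter vector of ones, and $B_c^{-1}$ formed as written in formula (6). All variable and function names are hypothetical.

```python
import numpy as np

nx, ny = 5, 4
n = nx * ny
I = np.eye(n)
tx = 2 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1)
ty = 2 * np.eye(ny) - np.eye(ny, k=1) - np.eye(ny, k=-1)
A = np.kron(np.eye(ny), tx) + np.kron(ty, np.eye(nx))   # S.P.D. model matrix
L1 = np.kron(np.eye(ny), np.tril(tx, -1))               # couplings inside a line
L2 = np.kron(np.tril(ty, -1), np.eye(nx))               # couplings between lines

# RNF(0,0) on a single plane: M = Diag(A), T, then B = P.
M = np.diag(np.diag(A))
T = (M + L1) @ (I + np.linalg.solve(M, L1.T))
B_rnf = (T + L2) @ (I + np.linalg.solve(T, L2.T))

# LFTFD with lines as blocks and filter vector t = ones.
Q = np.zeros((n, n))
ones = np.ones(nx)
for i in range(ny):
    s = slice(i * nx, (i + 1) * nx)
    Q[s, s] = A[s, s]
    if i > 0:
        p = slice((i - 1) * nx, i * nx)
        Lb, Ub, Qp = L2[s, p], L2.T[p, s], Q[p, p].copy()
        Ut = Ub @ ones
        b = np.diag(np.linalg.solve(Qp, Ut) / Ut)
        Q[s, s] -= Lb @ (2 * b - b @ Qp @ b) @ Ub
B_lf = (Q + L2) @ (I + np.linalg.solve(Q, L2.T))

def rho(K):
    """Spectral radius of a dense matrix."""
    return np.abs(np.linalg.eigvals(K)).max()

inv_rnf, inv_lf = np.linalg.inv(B_rnf), np.linalg.inv(B_lf)
Bc_inv = inv_rnf + inv_lf - inv_lf @ A @ inv_rnf         # formula (6)

rho_rnf = rho(I - inv_rnf @ A)
rho_lf = rho(I - inv_lf @ A)
rho_c = rho(I - Bc_inv @ A)
print("rho(I - B_RNF^-1 A):", rho_rnf)
print("rho(I - B_LF^-1 A): ", rho_lf)
print("rho(I - B_c^-1 A):  ", rho_c)
```

All three spectral radii should come out below one, and the combined one should not exceed the product of the other two, in line with the chain of inequalities in the proof. The spectral radius of a product does not depend on the order of the two factors, so this check is insensitive to which of the two preconditioners is applied first.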
6 NUMERICAL EXPERIMENTS
In this section we present numerical results for NF, RNF, and its combinations with LFTFD. We shall compare each of these preconditioners, and outline the advantages and disadvantages of each of them. First, we will compare the convergence of RNF($\alpha$,$\beta$) with other classical preconditioners, and later we shall observe the spectrum behavior. Secondly, we shall compare several combinative preconditionings involving RNF($\alpha$,$\beta$). Thirdly, we shall compare the results between additive and multiplicative preconditionings, and then finally we shall discuss the results of combinative preconditionings with the filter vector chosen as a Ritz vector.

6.1 Source of matrices

We consider the boundary value problem as in [4]:

\[
\begin{aligned}
\mathrm{div}(a(x)u) - \mathrm{div}(\kappa(x)\nabla u) &= f \quad \text{in } \Omega, \\
u &= 0 \quad \text{on } \partial\Omega_D, \\
\frac{\partial u}{\partial n} &= 0 \quad \text{on } \partial\Omega_N,
\end{aligned} \tag{8}
\]

where $\Omega = [0, 1]^n$ ($n = 2$, or 3), $\partial\Omega_N = \partial\Omega \setminus \partial\Omega_D$. The vector field $a$ and the tensor $\kappa$ are the given coefficients of the partial differential operator. In the 2D case, we have $\partial\Omega_D = [0, 1] \times \{0, 1\}$, and in the 3D case, we have $\partial\Omega_D = [0, 1] \times \{0, 1\} \times [0, 1]$.

The following five cases are considered:
Case 4.1: Advection-diffusion problem with a rotating velocity in two dimensions:
The tensor $\kappa$ is the identity, and the velocity is $a = (2\pi(x_2 - 0.5), 2\pi(x_1 - 0.5))^{T}$. We test problems for the two-dimensional case.

Case 4.2: Non-homogeneous problems with large jumps in the coefficients in two dimensions:
The coefficient $a$ is zero. The tensor $\kappa$ is isotropic and discontinuous. It jumps from the constant value $10^3$ in the ring $\frac{1}{2\sqrt{2}} \leq |x - c| \leq \frac{1}{2}$, $c = (\frac{1}{2}, \frac{1}{2})^{T}$, to 1 outside. We test problems for the two-dimensional case.

Case 4.3: Skyscraper problems:
The tensor $\kappa$ is isotropic and discontinuous. The domain contains many zones of high permeability which are isolated from each other. Let $[x]$ denote the integer value of $x$. For the two-dimensional case, we define $\kappa(x)$ as follows: […] and for the three-dimensional case $\kappa(x)$ is defined as follows:

\[
\kappa(x) = \begin{cases}
10^3 * ([10 * x_2] + 1), & \text{if } [10 * x_i] = 0 \bmod 2,\ i = 1, 2, 3, \\
1, & \text{otherwise.}
\end{cases}
\]

Case 4.4: Convective skyscraper problems:
The same as the Skyscraper problems except that the velocity field is changed to be $a = (1000, 1000, 1000)^{T}$. Both two- and three-dimensional cases are considered.

Case 4.5: Anisotropic layers:
The domain is made of 10 anisotropic layers with jumps of up to four orders of magnitude, and an anisotropy ratio of up to $10^3$ in each layer. For the three-dimensional case, the cube is divided into 10 layers parallel to $z = 0$, of size 0.1, in which the coefficients are constant. The coefficient $\kappa_x$ in the $i$th layer is given by $v(i)$, the latter being the $i$th component of the vector $v = [\alpha, \beta, \alpha, \beta, \alpha, \beta, \gamma, \alpha, \alpha]$, where $\alpha = 1$, $\beta = 10^2$ and $\gamma = 10^4$. We have $\kappa_y = 10\kappa_x$ and $\kappa_z = 1000\kappa_x$. The velocity field is zero.

We consider problems on a uniform grid with $n \times n$ nodes for the two-dimensional case, where we choose $n = 100, 200, 300$, and $400$; whereas, for the three-dimensional case we consider the uniform grid with $n \times n \times n$ nodes, and we choose $n = 20, 30$, and $40$.

6.2 Comments on the numerical tests
All our test results were obtained with GMRES(20) in double precision arithmetic, with the initial solution vector chosen as the vector of all zeros. The known solution is a random vector. The stopping criterion is the decrease of the relative residual below $10^{-12}$. The maximum number of iterations allowed is 200. The method of NF is implemented in Fortran 90, whereas the method of LFTFD is implemented in MATLAB; for this reason, instead of CPU time, we provide the flop count (see Table 2) for an $nx \times nx \times nx$ uniform grid for all the methods considered.

In Table 1, we introduce short notations for the methods and the test cases considered.

To compare RNF(1,1) and RNF(0,0) with other classical methods, we present results in Table (3). We notice that RNF(1,1) takes a smaller number of steps to converge as compared to the classical preconditioners like ILU and MILU. In most of the test cases, we observe that whenever MILU converged within 200 steps, RNF(1,1) converged as well. For matrices with discontinuous coefficients like the skyscraper and convective skyscraper problems, RNF(1,1), ILU(0), and MILU fail to converge within the maximum number of iterations. The conditioning behavior of RNF(0,0) seems to be similar to ILU(0) in terms of the number of steps required for convergence; for the problems for which ILU(0) converges, RNF(0,0) converges as well, and for those for which ILU(0) does not converge, RNF(0,0) does not converge either. These observations suggest that RNF(0,0) alone can replace ILU(0) for certain problems; the advantage of doing this is that for RNF(0,0), no cost of construction is required, and there are no storage requirements (see Table 2). On the other hand, comparing RNF(0,0) and RNF(1,0) (see Table 3 and Table 4), we find that RNF(1,0) converges faster when compared to RNF(0,0). So, RNF(1,0) can also replace ILU(0) for the test cases considered.

Table 1: Short notations for the methods and the test cases considered.

METHODS                      STANDS FOR
ILU                          Incomplete LU with zero fill-in
MILU                         Modified ILU with colsum constraint
TFILU                        Combinative preconditioning with LFTFD and ILU
RNF($\alpha$,$\beta$)        Relaxed nested factorization with parameters $\alpha$ and $\beta$
TFRNF($\alpha$,$\beta$)      Combinative preconditioning with RNF($\alpha$,$\beta$) and LFTFD
TF(r,l)RNF(1,1)              Combinative preconditioning with RNF(1,1) and LFTFD with filter vector as the largest Ritz vector
TF(r,s)RNF(0,0)              Combinative preconditioning with RNF(0,0) and LFTFD with filter vector as the smallest Ritz vector

MATRICES                     STANDS FOR
2DNHm                        2-dimensional non-homogeneous problem of size m by m
2DADm                        2-dimensional advection diffusion problem of size m by m
2DSKYm                       2-dimensional skyscraper problem of size m by m
2DCONSKYm                    2-dimensional convective skyscraper of size m by m
3DSKYm                       3-dimensional skyscraper problem of size m by m by m
3DCONSKYm                    3-dimensional convective skyscraper of size m by m by m
3DANIm                       3-dimensional anisotropic problem of size m by m by m
Table 2: Cost of construction, cost of applying, and the storage requirements for several methods (nnz(K) = number of non-zeros in matrix K).

Method                   Construction              To apply                  Storage
RNF($\alpha$,$\beta$)    $\approx 720\,nx^7$       $\approx 18\,nx^3$        $nx^3$
RNF(1,0)                 $\approx 4\,nx^3$         $\approx 18\,nx^3$        $nx^3$
RNF(0,0)                 0                         $\approx 18\,nx^3$        Null
LFTFD                    $\approx \frac{2}{3}nx^7$ $\approx \frac{2}{3}nx^7$ nnz($A$) - 2 nnz($L_3$)
ILU(0)                   $\approx 7\,nx^3$         $\approx 14\,nx^3$        nnz($A$)

To study the spectrum of the RNF($\alpha$,$\alpha$) preconditioned matrix for different values of the parameter $\alpha$, we display the spectrum for the $30 \times 30$ skyscraper problem in Figure (1). The spectrum of the TFRNF(0,0) preconditioned matrix is displayed in Figure (2). As we observe in the figure, the parameter has an influence on the spectrum; with $\alpha = 0$, the eigenvalues of RNF(0,0) lie between zero and one, which is favorable for the multiplicative preconditioning as seen in section 5.

For a fair comparison, in Tables 3 and 4, the numbers of iterations corresponding to the multiplicative combination are doubled. On the other hand, the numbers of iterations for the additive case are the actual numbers of iterations, keeping in mind that each of the preconditioner solves can be performed on two different processors on parallel architectures.
Figure 1: Spectrum plot for the RNF($\alpha$,$\alpha$) preconditioned 2D 30 × 30 skyscraper problem: (top to bottom) $\alpha$ = 0, 0.2, 0.4, 0.6, 0.8, and 1.

Comparing the combinative preconditionings TFILU and TFRNF(1,$\alpha$), we find that the combination TFRNF(1,$\alpha$) performs much better compared to the combination TFILU or other classical methods (see Tables 3 and 4) when the parameter is chosen close to 1. We notice that a significant number of steps can be saved for two-dimensional problems, but for three-dimensional problems the dependence on the parameter is almost negligible. This dependence of the iteration number on the parameter for each type of matrices is plotted in Figure (4) for the two-dimensional case and in Figure (5) for the three-dimensional case for RNF(1,$\alpha$), RNF($\alpha$,$\alpha$), for both additive and multiplicative combinations of TFRNF(1,$\alpha$), and for additive and multiplicative combinations of TFRNF($\alpha$,$\alpha$). In these plots, the actual numbers of iterations are presented. From these figures we observe that for three-dimensional problems, the combinations involving RNF(0,0) and RNF(1,0) remain good parameter-independent choices, as not much is gained with any value of the parameters. Between the combinations involving RNF(0,0) and RNF(1,0), we observe that TFRNF(1,0) performs slightly better than TFRNF(0,0). On the other hand, in TFRNF(0,0), RNF(0,0) has no cost of construction and has no storage requirement.

Figure 2: Spectrum plot for the 2D 60 × 60 skyscraper problem: (top to bottom) spectrum of A followed by the spectrum of the RNF(0,0), LFTFD, and RNF mult. LFTFD preconditioned matrix.

As mentioned before, there are two choices of combinative preconditioning: the first is a multiplicative approach, and the second is an additive approach. In terms of the total number of iterations, the multiplicative approach performs better compared to the additive approach. On a parallel architecture, each preconditioner solve for the additive case can be done on two different processors, which is an advantage over the multiplicative case.
In Figure (3), we present convergence curves for RNF(1,1), TFILU, TFRNF(0,0), ILU(0), and MILU for one matrix each from the test cases. From these plots, we observe that the convergence behavior of the combination TFRNF(0,0) is similar to that of TFILU, and the corresponding curves nearly coincide with each other; the reason for such similarity may be due to the fact that RNF(0,0) in some sense behaves as ILU(0) for all the problems considered (see Table 3).

Another meaningful attempt involves combining RNF(1,1) with LFTFD where the filter vector chosen is the largest Ritz vector. In our test cases we observe that the reason for the failure of RNF(1,1) is the presence of some very large eigenvalues in the spectrum of the preconditioned matrix, see Figure (1). To deflate one of these largest eigenvalues, we chose the filter vector to be the largest Ritz vector obtained after 20 steps of Arnoldi iteration [6] with the RNF(1,1) preconditioned matrix. We observe in Table (4), where the total cost is shown, that this combination is not robust enough, the reason being the presence of many other large eigenvalues in the RNF(1,1) preconditioned matrix which cause difficulties in the preconditioned GMRES method. On the other hand, the RNF(0,0) preconditioned matrix has few small eigenvalues, see Figure (2). For this reason we choose the filter vector corresponding to the smallest Ritz value in magnitude. This combination, i.e., TF(r,s)RNF(0,0), is not robust either, the reason being the presence of other small eigenvalues in the spectrum of the preconditioned matrix.

However, in our tests we found that for the combinative preconditioning, the choice of the filter vector as the vector of all ones is quite robust. Moreover, it eliminates the cost of forming the filter vector.
[Figure: convergence curves (norm of relative residual versus number of iterations) for RNF(1,1), ILU(0)+LFTFD, RNF(0,0)+LFTFD, ILU(0), and MILU. Panels: (a) 2D 200 × 200 non-homogeneous problem; (b) 2D 200 × 200 advection diffusion problem; (c) 2D 200 × 200 convective skyscraper problem; (d) 3D 30 × 30 × 30 skyscraper problem; (e) 3D 30 × 30 × 30 convective skyscraper problem; (f) 3D 30 × 30 × 30 anisotropic problem.]
[Figure: Number of iterations versus alpha (0 to 1) for the 2D problems, comparing RNF alone with its multiplicative and additive combinations with LFTFD. Panels (a), (c), (e) use RNF(1,alpha); panels (b), (d), (f) use RNF(alpha,alpha). Panels: (a), (b) 2D 200 × 200 non-homogeneous problem (2dNH200200); (c), (d) 2D 200 × 200 advection-diffusion problem (2dAD200200); (e), (f) 2D 200 × 200 skyscraper problem (2dSKY200200).]
[Figure: Number of iterations versus alpha (0 to 1) for the 3D problems, comparing RNF alone with its multiplicative and additive combinations with LFTFD. Panels (a), (c), (e) use RNF(1,alpha); panels (b), (d), (f) use RNF(alpha,alpha). Panels: (a), (b) 3D 20 × 20 × 20 skyscraper problem (3dSKY202020); (c), (d) 3D 20 × 20 × 20 anisotropic problem (3dANI202020); (e), (f) 3D 20 × 20 × 20 convective skyscraper problem (3dCONSKY202020).]
Meth./Mat. ILU        MILU       TFILU                 RNF(0,0)   TFRNF(0,0)            RNF(1,1)
                                 Mult.      Add.                  Mult.      Add.
2dNH100    -          72(e-12)   56(e-11)   39(e-11)   -          52(e-11)   38(e-11)   37(e-11)
2dNH200    -          109(e-11)  78(e-10)   53(e-10)   -          72(e-11)   53(e-11)   54(e-11)
2dNH300    -          141(e-11)  96(e-10)   66(e-10)   -          88(e-10)   65(e-10)   67(e-10)
2dNH400    -          169(e-11)  110(e-10)  75(e-10)   -          104(e-10)  74(e-10)   78(e-10)
2dAD100    -          65(e-11)   56(e-11)   39(e-11)   -          52(e-11)   38(e-11)   36(e-11)
2dAD200    -          95(e-11)   80(e-10)   54(e-10)   -          74(e-10)   53(e-10)   53(e-11)
2dAD300    -          121(e-11)  96(e-10)   67(e-10)   -          88(e-10)   65(e-10)   67(e-11)
2dAD400    -          146(e-11)  110(e-10)  75(e-10)   -          104(e-10)  74(e-10)   79(e-11)
2dS100     -          -          56(e-08)   36(e-08)   -          50(e-07)   37(e-08)   -
2dS200     -          -          82(e-07)   54(e-07)   -          74(e-07)   55(e-07)   -
2dS300     -          -          100(e-06)  68(e-07)   -          96(e-06)   69(e-06)   110(e-09)
2dS400     -          -          124(e-06)  82(e-06)   -          116(e-06)  86(e-06)   -
2dCS100    -          -          50(e-09)   39(e-09)   -          50(e-09)   41(e-09)   -
2dCS200    -          -          66(e-08)   59(e-09)   -          76(e-08)   67(e-08)   -
2dCS300    -          92(e-10)   78(e-08)   73(e-08)   -          98(e-08)   80(e-08)   63(e-08)
2dCS400    -          -          120(e-08)  101(e-07)  -          138(e-08)  120(e-08)  -
3dCS15     113(e-11)  99(e-12)   22(e-11)   28(e-11)   92(e-10)   34(e-11)   30(e-10)   47(e-11)
3dCS20     88(e-10)   77(e-11)   20(e-10)   23(e-10)   122(e-10)  32(e-10)   29(e-10)   15(e-11)
3dCS30     169(e-09)  -          32(e-10)   27(e-10)   162(e-10)  32(e-10)   29(e-10)   138(e-10)
3dCS40     169(e-09)  -          28(e-10)   25(e-10)   191(e-09)  34(e-10)   30(e-10)   24(e-10)
3dS20      196(e-07)  -          24(e-09)   20(e-09)   -          24(e-09)   20(e-09)   16(e-09)
3dS30      -          -          30(e-08)   22(e-08)   -          26(e-08)   22(e-08)   -
3dS40      -          -          32(e-08)   23(e-08)   -          28(e-08)   23(e-08)   26(e-08)
3dANI20    29(e-08)   43(e-08)   22(e-09)   15(e-08)   72(e-08)   26(e-08)   21(e-08)   20(e-08)
3dANI30    51(e-07)   58(e-08)   24(e-08)   17(e-08)   104(e-07)  28(e-08)   23(e-08)   23(e-08)
3dANI40    55(e-07)   72(e-08)   26(e-08)   18(e-07)   132(e-07)  30(e-08)   23(e-08)   23(e-08)
Meth./Mat. RNF(1,0)   TFRNF(1,0)            TFRNF(.9,.9)          TFRNF(1,0.9)          TF(r,l)RNF(1,1)  TF(r,s)RNF(0,0)
                      Mult.      Add.       Mult.      Add.       Mult.      Add.       Mult.            Mult.
2dNH100    168(e-10)  48(e-11)   35(e-11)   36(e-11)   24(e-11)   34(e-11)   21(e-10)   52(e-11)         52(e-11)
2dNH200    -          70(e-11)   48(e-10)   50(e-10)   34(e-11)   48(e-10)   29(e-10)   78(e-11)         -
2dNH300    -          84(e-10)   59(e-10)   62(e-10)   40(e-10)   58(e-10)   36(e-10)   104(e-10)        -
2dNH400    -          96(e-10)   69(e-10)   72(e-10)   46(e-10)   70(e-10)   42(e-10)   122(e-10)        -
2dAD100    168(e-10)  48(e-11)   35(e-11)   36(e-11)   24(e-11)   32(e-11)   21(e-10)   52(e-11)         52(e-11)
2dAD200    -          70(e-11)   49(e-10)   50(e-10)   33(e-10)   48(e-10)   30(e-10)   78(e-11)         394(e-09)
2dAD300    -          84(e-10)   59(e-10)   64(e-10)   40(e-10)   60(e-10)   37(e-10)   100(e-10)        -
2dAD400    -          96(e-10)   67(e-10)   70(e-10)   45(e-10)   70(e-10)   42(e-10)   128(e-10)        -
2dS100     -          44(e-07)   33(e-08)   42(e-07)   22(e-07)   42(e-07)   23(e-07)   -                -
2dS200     -          68(e-07)   51(e-07)   64(e-07)   32(e-07)   60(e-07)   30(e-07)   -                -
2dS300     -          84(e-06)   60(e-07)   58(e-06)   38(e-06)   56(e-07)   34(e-06)   124(e-09)        -
2dS400     -          102(e-06)  72(e-06)   94(e-06)   47(e-06)   92(e-06)   46(e-06)   -                -
2dCS100    -          46(e-09)   39(e-09)   38(e-09)   31(e-09)   38(e-09)   32(e-09)   -                390(e-09)
2dCS200    -          70(e-08)   57(e-08)   52(e-09)   39(e-08)   52(e-08)   39(e-09)   -                -
2dCS300    -          80(e-08)   73(e-08)   56(e-08)   43(e-08)   48(e-08)   39(e-08)   108(e-09)        -
2dCS400    -          124(e-07)  109(e-07)  74(e-08)   59(e-08)   72(e-08)   58(e-08)   -                -
3dCS15     92(e-10)   30(e-11)   28(e-10)   44(e-10)   36(e-11)   48(e-11)   37(e-11)   76(e-11)         154(e-10)
3dCS20     97(e-10)   30(e-11)   27(e-10)   28(e-10)   27(e-11)   28(e-10)   26(e-11)   30(e-11)         302(e-10)
3dCS30     168(e-09)  28(e-10)   26(e-10)   44(e-10)   29(e-10)   44(e-10)   29(e-10)   -                120(e-09)
3dCS40     -          28(e-10)   27(e-10)   26(e-10)   24(e-09)   24(e-09)   24(e-10)   78(e-10)         170(e-09)
3dS20      -          24(e-09)   20(e-09)   26(e-09)   20(e-09)   26(e-09)   20(e-09)   28(e-09)         278(e-08)
3dS30      -          26(e-08)   22(e-08)   50(e-09)   25(e-09)   52(e-09)   26(e-08)   -                188(e-08)
3dS40      -          28(e-08)   22(e-08)   30(e-08)   21(e-08)   32(e-09)   21(e-08)   46(e-09)         -
3dANI20    57(e-08)   26(e-09)   19(e-08)   18(e-08)   14(e-08)   18(e-08)   13(e-08)   28(e-09)         84(e-09)
3dANI30    75(e-07)   28(e-08)   21(e-08)   20(e-08)   15(e-08)   20(e-08)   14(e-08)   30(e-08)         72(e-06)
3dANI40    106(e-07)  28(e-08)   22(e-08)   20(e-07)   15(e-07)   20(e-08)   14(e-08)   32(e-08)         84(e-08)
7 CONCLUSION AND FUTURE DIRECTIONS
We have introduced a relaxed version of NF, and an effective combinative preconditioning is achieved in combination with tangential filtering preconditioners [4]. Compared to ILU(0), it has several important properties. If $A$ is S.P.D., then RNF(0,0) is S.P.D. and leads to a stable splitting. There is no setup cost nor storage requirement. The new preconditioner is particularly useful when multiplicative/additive preconditioning is achieved in combination with the tangential filtering preconditioner [4]. With this setup the new preconditioner is as robust as the combination of ILU(0) with the tangential filtering preconditioner.
Future work consists in studying other modified versions of NF, similar to the one suggested by Gustafsson [13], [14] for MILU, where the diagonal is perturbed by an additive term of order $O(h^2)$, which seems to improve the conditioning of modified incomplete factorization methods for certain classes of problems.
References
[1] J. R. Appleyard and I. M. Cheshire, Nested Factorization, SPE 12264, presented at the Seventh SPE Symposium on Reservoir Simulation, San Francisco, 1983.
[2] J. R. Appleyard, Proof that simple Colsum modified ILU factorization of matrices arising in Fluid flow problems are not singular, by John Appleyard, Polyhedron Software Ltd, May 2004. Available online: http://www.polyhedron.com/colsum-proof.
[3] J. R. Appleyard and I. M. Cheshire, Strictly Triangular L.U formulation of Nested Factorization, by John Appleyard, Polyhedron Software Ltd, May 2004. Available online: http://www.polyhedron.com/strictlu.
[4] Y. Achdou and F. Nataf, Low frequency tangential filtering decomposition, Numer. Linear Algebra Appl., 14, (2007), pp. 129-147.
[5] S. F. Ashby, M. J. Holst, T. A. Manteuffel, and P. E. Saylor, The role of inner product in stopping criteria for conjugate gradient iterations, BIT, 41, (2001), pp. 26-52.
[6] Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst, Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, SIAM, Philadelphia, 2000.
[7] M. Benzi, A. M. Tuma, A comparative study of sparse approximate inverse preconditioners, Appl. Numer. Math., 30, (1999), pp. 305-340.
[8] D. Braess, Towards algebraic multigrid for elliptic problems of second order, Computing, 55(4), (1995), pp. 379-393.
[9] A. Buzdin, Tangential decomposition, Computing, 61, (1998), pp. 257-276.
[10] A. Buzdin and G. Wittum, Two-frequency decomposition, Numer. Math.,
[11] C. Wagner and G. Wittum, Adaptive filtering, Numer. Math., 78, (2004), pp. 305-328.
[12] P. A. Crumpton, P. A. Fjerstad, and J. Berge, Parallel computing Using ECLIPSE Parallel, Schlumberger, Eclipse, Technical Description 2003A, Chapter 38: Parallel Option. 2003.
[13] I. Gustafsson, A class of first order factorization methods, BIT 18, pp. 142-56, 1978.
[14] I. Gustafsson, Modified incomplete Cholesky (MIC) methods, in D. Evans, editor, Preconditioning Methods - Theory and Applications, pp. 265-293. Gordon and Breach, New York, 1983.
[15] N. N. Kuznetsova, O. V. Diyankov, S. S. Kotegov, S. V. Koshelev, I. V. Krasnogorov, V. Y. Pravilnikov, and S. Y. Maliassov, The family of nested factorizations, Russian Journal of Numerical Analysis and Mathematical Modelling, 22(4) (2007), pp. 393-412.
[16] G. Meurant, Computer Solution of Large Linear Systems, North-Holland Publishing Co., Amsterdam, 1999.
[17] J. W. Ruge, K. Stüben, Algebraic multigrid. Multigrid Methods, Frontiers of Applied Mathematics, vol. 3. SIAM, Philadelphia, PA, 1987; 73-130.
[18] Y. Saad, Iterative Methods for Sparse Linear Systems, PWS publishing company, Boston, MA, 1996.
[19] K. Stüben, A review of algebraic multigrid, J. Comput. Appl. Math., 128(1-2), (2001), pp. 281-309, Numerical analysis 2000, vol. VII, Partial differential equations.
[20] U. Trottenberg, C. W. Oosterlee, and A. Schüller, Multigrid, Academic Press Inc., San Diego, CA, (2001). With contributions by A. Brandt, P. Oswald, and K. Stüben.
[21] C. Wagner, Tangential frequency filtering decompositions for symmetric matrices, Numer. Math., 78, (1997), pp. 119-142.
[22] C. Wagner, Tangential frequency filtering decompositions for unsymmetric matrices, Numer. Math., 78, (1997), pp. 143-163.
[23] C. Wagner and G. Wittum, Adaptive filtering, Numer. Math., 78, (1997), pp. 305-382.
[24] P. Vaněk, M. Brezina, and J. Mandel, Convergence of algebraic multigrid based on smoothed aggregation, Numerische Mathematik, 88(3), (2001), pp. 559-579.
[25] L. Grigori, F. Nataf, and Q. Niu, Two sides tangential filtering decomposition, INRIA TR 6554.