
(1)

HAL Id: hal-01400663

https://hal.inria.fr/hal-01400663

Submitted on 22 Nov 2016

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Comparison of solvers performance when solving the 3D Helmholtz elastic wave equations over the Hybridizable Discontinuous Galerkin method

Marie Bonnasse-Gahot, Henri Calandra, Julien Diaz, Stephane Lanteri

To cite this version:

Marie Bonnasse-Gahot, Henri Calandra, Julien Diaz, Stephane Lanteri. Comparison of solvers performance when solving the 3D Helmholtz elastic wave equations over the Hybridizable Discontinuous Galerkin method. MATHIAS – TOTAL Symposium on Mathematics, Oct 2016, Paris, France. ⟨hal-01400663⟩

(2)

Comparison of solvers performance when solving the 3D Helmholtz elastic wave equations over the Hybridizable Discontinuous Galerkin method

M. Bonnasse-Gahot¹,², H. Calandra³, J. Diaz¹ and S. Lanteri²

¹ INRIA Bordeaux-Sud-Ouest, team-project Magique 3D
² INRIA Sophia-Antipolis-Méditerranée, team-project Nachos
³ TOTAL Exploration-Production

(3)

Motivations

Imaging methods

• Full Wave Inversion (FWI): an inversion process that requires solving many forward problems

Seismic imaging: time domain or harmonic domain?

• Time domain: complicated imaging condition, but quite low computational cost
• Harmonic domain: simple imaging condition, but huge computational cost

→ Memory usage


(6)

Motivations

Resolution of the forward problem of the inversion process

• Elastic wave propagation in the frequency domain: the Helmholtz equation

First-order formulation of the Helmholtz elastic wave equations, for x = (x, y, z) ∈ Ω ⊂ R³:

\[
\begin{cases}
  i\omega\,\rho(x)\,v(x) = \nabla\cdot\sigma(x) + f_s(x) \\
  i\omega\,\sigma(x) = C(x)\,\varepsilon(v(x))
\end{cases}
\]

• v: velocity vector
• σ: stress tensor
• ε: strain tensor
• ρ: density, C: elasticity tensor, f_s: source term
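For reference, ε(v) in the second equation is the standard symmetric gradient of the velocity field (this definition is not written out on the slide, but it is the usual one in linear elasticity):

\[
\varepsilon(v) = \tfrac{1}{2}\left(\nabla v + (\nabla v)^{T}\right),
\qquad
\varepsilon_{ij}(v) = \tfrac{1}{2}\left(\frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i}\right).
\]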


(8)

Approximation methods

Discontinuous Galerkin Methods

✓ unstructured tetrahedral meshes
✓ combination of FEM and the finite volume method (FVM)
✓ hp-adaptivity
✓ easily parallelizable method
✗ large number of DOF compared to classical FEM

[Figure: distribution of the degrees of freedom of a classical DG discretization]


(11)

Approximation methods

Hybridizable Discontinuous Galerkin Methods

✓ same advantages as DG methods: unstructured tetrahedral meshes, hp-adaptivity, easily parallelizable method, discontinuous basis functions
✓ introduction of a new variable defined only on the interfaces
✓ lower number of coupled DOF than classical DG methods (a rough count is sketched below)

[Figure: distribution of the degrees of freedom of the HDG discretization, with the globally coupled unknowns on the element faces only]
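To make the last point concrete, here is an illustrative count (not taken from the slides; it uses the standard dimensions of polynomial spaces on tetrahedra and triangles and assumes a 3-component trace unknown λ per face, as in the elastic HDG formulation):

\[
\dim P_k(\text{tetrahedron}) = \frac{(k+1)(k+2)(k+3)}{6},
\qquad
\dim P_k(\text{triangle}) = \frac{(k+1)(k+2)}{2}.
\]

For k = 3, a classical DG method couples all 9 × 20 = 180 unknowns of each tetrahedron (3 velocity + 6 stress components, 20 volume basis functions) with those of its neighbours, whereas HDG couples only the trace unknowns, i.e. 3 × 10 = 30 unknowns per face, each face being shared by two elements.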


(13)

Principles of the HDG method

1. Introduction of a Lagrange multiplier λ, defined only on the faces
2. Expression of the initial unknowns $W^K = (v^K, \sigma^K)^T$ of each element K as a function of $\lambda = (\lambda^{F_1}, \lambda^{F_2}, \ldots, \lambda^{F_{n_f}})^T$:  $W^K = \mathbb{K}^K \lambda$
3. Transmission condition: $\sum_K \mathbb{B}^K W^K + \mathbb{L}^K \lambda = 0$
4. Global linear system: $\mathbb{M}\lambda = S$
5. Computation of the solution of the initial problem, element by element (a schematic sketch of this workflow follows the list)
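The following NumPy sketch illustrates the algebraic structure of steps 2–5 on a toy problem. All matrices and sizes are illustrative placeholders (random and dense), not the actual HDG operators; the point is only the pattern: local solvers, a reduced global system on λ, then an element-by-element reconstruction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: n_lambda globally coupled trace unknowns, n_w local unknowns per element.
n_elements, n_w, n_lambda = 4, 6, 8

# Step 2: local solvers give W^K = K^K * lambda (K^K are placeholder matrices here).
K_K = [rng.standard_normal((n_w, n_lambda)) for _ in range(n_elements)]

# Step 3: the transmission condition sum_K (B^K W^K + L^K lambda) = 0 ...
B_K = [rng.standard_normal((n_lambda, n_w)) for _ in range(n_elements)]
L_K = [rng.standard_normal((n_lambda, n_lambda)) for _ in range(n_elements)]

# ... assembles the global matrix of step 4: M lambda = S.
M = sum(B @ K + L for B, K, L in zip(B_K, K_K, L_K))
S = rng.standard_normal(n_lambda)

# Step 4: solve the (much smaller) global system for the trace unknown lambda.
lam = np.linalg.solve(M, S)

# Step 5: recover the initial unknowns element by element (embarrassingly parallel).
W = [K @ lam for K in K_K]
print(len(W), W[0].shape)  # 4 (6,)
```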


(18)

Contents

2D numerical results: performance comparison of the HDG method

(19)

2D numerical results: performance comparison of the HDG method

Anisotropic test case

• Three meshes:
  • M1: 600 elements
  • M2: 3 000 elements
  • M3: 28 000 elements

(20)

2D numerical results: performance comparison of the HDG method

Anisotropic case: memory consumption

[Figure: memory (MB) on meshes M1, M2, M3 for a P3 interpolation order, comparing HDGm and IPDGm]

(21)

2D numerical results: performance comparison of the HDG method

Anisotropic case: CPU time (s)

[Figure: CPU time (s) on meshes M1, M2, M3 for a P3 interpolation order; construction and resolution times of HDGm vs. IPDGm]

IPDG CPU time ≈ 9 × HDG CPU time

(22)

3D numerical results

Contents

2D numerical results: performance comparison of the HDG method

3D numerical results: focus on the linear solver
  • 3D plane wave in a homogeneous medium
  • 3D geophysical test case: Epati test case

(23)

3D numerical results

Resolution of the linear system Mλ = S

• direct solver: MUMPS (MUltifrontal Massively Parallel sparse direct Solver)
  • direct factorization A = LU or A = LDL^T
  • handles multiple right-hand sides (RHS)
• hybrid solver: MaPhys (Massively Parallel Hybrid Solver)
  • combination of direct and iterative methods
  • non-overlapping algebraic domain decomposition method (Schur complement method); a toy illustration of this idea follows
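As a reminder of what a non-overlapping Schur complement method does algebraically, here is a dense NumPy toy with two subdomains. It is not MaPhys itself or its API; block names and sizes are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy non-overlapping splitting: interior unknowns of two subdomains (sizes n1, n2)
# coupled only through ng interface unknowns.
n1, n2, ng = 5, 4, 3

A11 = rng.standard_normal((n1, n1)); A11 = A11 @ A11.T + n1 * np.eye(n1)
A22 = rng.standard_normal((n2, n2)); A22 = A22 @ A22.T + n2 * np.eye(n2)
AGG = rng.standard_normal((ng, ng)); AGG = AGG @ AGG.T + 20 * np.eye(ng)
A1G = rng.standard_normal((n1, ng))
A2G = rng.standard_normal((n2, ng))

# Assemble the global matrix: no direct coupling between the two interiors.
A = np.block([
    [A11,                np.zeros((n1, n2)), A1G],
    [np.zeros((n2, n1)), A22,                A2G],
    [A1G.T,              A2G.T,              AGG],
])
b = rng.standard_normal(n1 + n2 + ng)
b1, b2, bG = b[:n1], b[n1:n1 + n2], b[n1 + n2:]

# Condense the interiors onto the interface (each term is computable per subdomain).
S = AGG - A1G.T @ np.linalg.solve(A11, A1G) - A2G.T @ np.linalg.solve(A22, A2G)
g = bG - A1G.T @ np.linalg.solve(A11, b1) - A2G.T @ np.linalg.solve(A22, b2)

# Interface system (solved iteratively in a real hybrid solver), then local back-substitutions.
xG = np.linalg.solve(S, g)
x1 = np.linalg.solve(A11, b1 - A1G @ xG)
x2 = np.linalg.solve(A22, b2 - A2G @ xG)

x = np.concatenate([x1, x2, xG])
print(np.allclose(A @ x, b))  # True: the recombined solution solves the full system
```

In a real hybrid solver the interior blocks are factorized by a sparse direct method on each subdomain and the interface system on S is solved iteratively, which is what makes the approach a compromise between direct and iterative solvers.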

(24)

3D numerical results

Cluster configuration

Features of the nodes:

• 2 dodeca-core Haswell Intel Xeon E5-2680 processors
• frequency: 2.5 GHz
• RAM: 128 GB
• storage: 500 GB
• InfiniBand QDR TrueScale interconnect: 40 Gb/s

(25)

3D numerical results / 3D plane wave in a homogeneous medium

3D plane wave in a homogeneous medium

[Figure: configuration of the computational domain Ω, a 1000 m × 1000 m × 1000 m cube]

• Physical parameters:
  • ρ = 1 kg·m⁻³
  • λ = 16 GPa
  • µ = 8 GPa
• Plane wave: u = ∇ e^{i(k_x x + k_y y + k_z z)}
• ω = 2πf, f = 8 Hz
• Mesh: 21 000 elements
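Written out, the gradient above gives the exact plane-wave solution used for this test (a one-line derivation, not shown on the slide):

\[
u = \nabla e^{\,i(k_x x + k_y y + k_z z)} = i\,(k_x, k_y, k_z)^T\, e^{\,i(k_x x + k_y y + k_z z)},
\]

i.e. a displacement aligned with the wave vector k = (k_x, k_y, k_z), which corresponds to a compressional (P) plane wave.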

(26)

3D numerical results / 3D plane wave in a homogeneous medium

3D plane wave: memory consumption

[Figure: average memory per node (GB, log scale) vs. number of nodes (2, 4, 8, 16), with 8 MPI processes per node and 3 threads per MPI process; MaPhys: slope = 2.3, MUMPS: slope = 1.8]
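The "Slope" values quoted on these plots are presumably the slopes of straight-line fits in log-log scale; a minimal NumPy sketch of such a fit is shown below, with purely hypothetical placeholder measurements rather than the values behind the figures.

```python
import numpy as np

# Hypothetical placeholder measurements: memory per node (GB) for 2, 4, 8, 16 nodes.
nodes  = np.array([2.0, 4.0, 8.0, 16.0])
memory = np.array([80.0, 45.0, 26.0, 15.0])

# Straight-line fit in log-log scale: log(memory) ~= slope * log(nodes) + intercept.
slope, intercept = np.polyfit(np.log(nodes), np.log(memory), 1)
print(f"fitted slope: {slope:.2f}")  # the kind of number reported as 'Slope = ...'
```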

(27)

3D numerical results / 3D plane wave in a homogeneous medium

3D plane wave: execution time

[Figure: execution time (s) for the resolution of the HDG-P3 system vs. number of cores (48, 96, 192, 384); MaPhys with 8 MPI processes and 3 threads per MPI process; slope = 0.8]

(matrix order = 1 300 000, # nonzeros = 300 000 000)

(28)

3D numerical results / 3D plane wave in a homogeneous medium

3D plane wave: execution time

[Figure: execution time (s) for the resolution of the HDG-P3 system vs. number of cores (48, 96, 192, 384); MUMPS with 8 MPI processes and 3 threads per MPI process]

(29)

3D numerical results / 3D geophysical test case: Epati test case

Epati test case

[Figure: Vp velocity model (m·s⁻¹), vertical section at y = 700 m]

Mesh composed of 25 000 tetrahedra

(30)

3D numerical results / 3D geophysical test case: Epati test case

Epati test case: memory consumption

[Figure: average memory per node (GB, log scale) vs. number of nodes (4, 8, 16), with 8 MPI processes per node and 3 threads per MPI process; MaPhys: slope = 2.3, MUMPS: slope = 1.8]

(31)

3D numerical results / 3D geophysical test case: Epati test case

Epati test case: execution time

[Figure: execution time (s, log scale) for the resolution of the HDG-P3 system vs. number of cores (96, 192, 384); MaPhys (8 MPI processes, 3 threads each): slope = 1.3; MUMPS (8 MPI processes, 3 threads each): slope = 0.5]

(matrix order = 1 300 000, # nonzeros = 365 000 000)

(32)

Conclusions-Perspectives

• more detailed analysis of the comparison between the two solvers
• larger meshes
• more powerful clusters
• memory crash test

