A Bayesian inversion approach to recovering material parameters in hyperelastic solids using dolfin-adjoint

(1)

A Bayesian inversion approach to

recovering material parameters in

hyperelastic solids using dolfin-

adjoint

Jack S. Hale, Patrick E. Farrell, Stéphane P. A. Bordas

Hmsqnctbshnm Sgdaq`mc Sxonfq`ogx

Bnkntq

Ok`bhmfsgdaq`mc Rs`shnmdqx NsgdqhcdmshÜdqr Dwghahshnmr. Rhfm`fd OnvdqOnhms Sq`cdl`qj.Vda Bdqdlnmh`k Dw`lokdr Sdbgmhb`k hmenql`shnm

‹

aq`mcbnkntqr`mcbnkntqo`kdssd

›Sgdaq`mcbnkntqÎNwenqcaktd

The University is possibly unique in having a colour with which it is associated throughout the world:Nwenqcaktd.

Nwenqcaktd(or PANTONE © 282) is the main brand colour. Please refer to page 21 for more detailed and technical information on the use of Oxford blue.

›Sgdaq`mcl`qjrÎak`bjnmkx

In print media where only black is available (such as press

advertisement or black and white laser printing) it is acceptable to use the brand in black. However, it is important that the black artwork versions are used, as use of the Oxford blue artwork versions could result in a half-tone grey being produced.

Okd`rdmnsd9 do not print letterheads in colour on a laser printer or inkjet as the colour of the brand will not be right. Please print in black using the black artwork versions.

›Bnkntqo`kdssd

There is also a palette of preferred colours that have been selected to compliment the Nwenqcaktd. These colours are for use in

graphic elements within designs such as backdrops, graphic shapes

For more technical information on Oxford blue please refer to page 20

For more technical information on the colour palette please refer to page 21

Use the black artwork versions only when no colours are available

The main brand colour is

Oxford Blue

(or Pantone^©282)

1

(2)

Overview

• Why?

• Bayesian approach to inversion.

• Relate classical optimisation techniques to the Bayesian inversion approach.

• Example problem: sparse surface observations of a solid block.

• Use dolfin-adjoint and petsc4py to solve the problem.

(3)

Why?

(4)

(5)

(6)

(7)

(8)

Q: What can we infer about the parameters inside the domain, just from displacement observations on the

outside?

Q: Which parameters am I most uncertain about?

(9)

(10)

Bayesian Approach

• Deterministic event - totally predictable.

• Random event - unpredictable.

• Bayesian approach to inverse problems:

• The world is unpredictable.

• Consider everything as a random variable.

(11)

Terminology

• Observation. Displacements.

• Parameter. Material property.

• Parameter-to-observable map. Finite deformation hyperelasticity.

y

x

f (x )

(12)

Bayes Theorem

posterior(x | y) ^likelihood(y | x) ^prior(x)

Goal: Given the observations, find the posterior distribution of the unknown parameters.

(13)

Three step plan

1. Construct the prior.!

2. Construct the likelihood.

3. Calculate/explore the posterior.

(14)

Constructing a prior

(with DOLFIN)

Must reflect our subjective belief about the unknown parameter.

Difficulty:!

How to transfer qualitative information to quantitative.

(15)

!

Simple example involving a PDE solve: Smoothing Prior

https://bitbucket.org/snippets/jackhale/rk6xA

!

(16)

Reminder…

Let x₀ Rⁿ and R^{n n} be a symmetric positive definite matrix. A multivariate Gaussian random variable X with mean x₀ and covariance is a random variable with the probability density:

(x ) = 1

2 | | exp 1

2(x x₀)^T ¹(x x₀) . When X follows a multivariate Gaussian, we use the following notation:

X N (x0, ).

(17)

Qualitative: I think my parameter is smooth and is probably around zero at the boundary.

(18)

Imagine a parameter related to a physical quantity in 1-dimensional space. Often, the value of parameter at a point is related to the value of the parameters next to it.

X _i = 1

2 (X _i ₋₁ + X _i ₊₁ ) + W _j

With:

W = N (0, γ ² I)

AX = W

(19)

mesh = UnitIntervalMesh(160)

V = FunctionSpace(mesh, “CG”, 1) u = TrialFunction(V)

v = TestFunction(V)

...a = (1.0/2.0)*h*inner(grad(u), grad(v))*dx class W(Expression):

def eval(self, value, x):

value[0] = np.random.normal() ...

W = interpolate(W(), V) A = assemble(A)

...

(20)

Boundary conditions

1. Dirichlet: set to zero.

2. Extend definition of Laplacian outside domain.

(21)

values = np.array([1.0, -0.5], dtype=np.float_) rows = np.array([0], dtype=np.uintp)

cols = np.array([0, 1], \

dtype=np.uintp) A.setrow(0, cols, values) cols = np.array([V.dim() - 1, V.dim() - 2], \

dtype=np.uintp)

A.setrow(V.dim() - 1, cols, values) A.apply("insert")

(22)

!

(23)

! 2 −1 −T

(24)

(25)

Exploring the posterior

(26)

x _MAP = arg max

x ∈R

ⁿ

π _posterior (x | y)

x _MAP

x

cov(x | y)

(27)

x _CM =

!

R

ⁿ

x π _posterior (x | y) dx

x _MAP

x _CM

cov(x | y)

(28)

cov(x | y) =

!

Rⁿ(x − x^cm)(x − x^cm)^T π_posterior(x | y) dx ∈ Rⁿ^×n

x _MAP

x

cov(x | y)

(29)

OK, but how can we use

dolfin-adjoint to do this?

Aim: Connect Bayesian approach to classical optimisation techniques.

(30)

x _MAP = arg max

x ∈R

ⁿ

π _posterior (x | y)

x

_MAP

cov(x | y)

(31)

Assumptions

1. I think my parameter is Gaussian (prior).

2. My parameter to observable map is linear and my noise model is Gaussian.

X ∼ N (x 0 , Γ _prior ), X ∈ R ⁿ

Y = AX + E, Y ∈ R ^m , A ∈ R ^m ^×n

E ∼ N (0, Γ ), Y ∈ R ^m

(32)

Plug it in…

π_posterior(x|y) ∝ exp

!

−1

2(y − Ax)^T Γ⁻¹_noise(y − Ax)

"

×exp

!

−1

2(x − x0)^TΓ⁻¹_prior(x − x0)

"

(33)

x _MAP = arg max

x ∈R

ⁿ

π _posterior (x | y)

ln _posterior(x | y) = 1

2(y Ax)^T _noise¹ (y Ax) + 1

2(x x₀)^T _prior¹ (x x₀)

= 1

2 y Ax ² ₁

noise

+ 1

2 x x₀ ² ₁

prior

x

_MAP

= arg min

x∈Rⁿ

!

− ln π

posterior

(x | y)

^"

π_posterior(x|y) ∝ exp

!

−1

2(y − Ax)^TΓ⁻¹_noise(y − Ax)

"

×exp

!

−1

2(x − x⁰)^TΓ⁻¹_prior(x − x⁰)

"

(34)

Optimise

g(x_MAP) := _x 1

2 y Ax ² ₁

noise

+ 1

2 x x₀ ² ₁

prior x=x_map

= A^T _noise¹ (y Ax_map) + _prior¹ (x_map x₀)

= 0

x_MAP = ^!Γ⁻¹_prior − A^T Γ⁻¹_noiseA^"⁻¹ (A^T Γ_noisey + Γ_priorx₀)

(35)

x_MAP = ^!Γ⁻¹_prior − A^T Γ⁻¹_noiseA^"⁻¹ (A^T Γ_noisey + Γ_priorx₀)

H := ∇ ^x g = Γ ⁻¹ _prior − A ^T Γ ⁻¹ _noise A

(36)

8.1 Basics 8 GAUSSIANS

then

p(x_a|x^b) = N^xa(ˆµ_a, ˆ⌃_a) n ˆµ_a = µ_a + ⌃_c⌃_b ¹(x_b µ_b)

⌃ˆ _a = ⌃_a ⌃_c⌃_b ¹⌃^T_c (353) p(x_b|x^a) = N^xb(ˆµ_b, ˆ⌃_b) n ˆµ_b = µ_b + ⌃^T_c ⌃_a ¹(x_a µ_a)

⌃ˆ _b = ⌃_b ⌃^T_c ⌃_a ¹⌃_c (354) Note, that the covariance matrices are the Schur complement of the block matrix, see 9.1.5 for details.

8.1.4 Linear combination

Assume x ⇠ N (m^x, ⌃_x) and y ⇠ N (m^y, ⌃_y) then

Ax + By + c ⇠ N (Am^x + Bm_y + c, A⌃_xA^T + B⌃_yB^T ) (355) 8.1.5 Rearranging Means

N^Ax[m, ⌃] =

pdet(2⇡(A^T ⌃ ¹A) ¹)

pdet(2⇡⌃) N^x[A ¹m, (A^T ⌃ ¹A) ¹] (356) If A is square and invertible, it simplifies to

N^Ax[m, ⌃] = 1

| det(A)|N^x[A ¹m, (A^T ⌃ ¹A) ¹] (357) 8.1.6 Rearranging into squared form

If A is symmetric, then 1

2x^T Ax + b^T x = 1

2(x A ¹b)^T A(x A ¹b) + 1

2b^T A ¹b 1

2Tr(X^T AX) + Tr(B^T X) = 1

2Tr[(X A ¹B)^T A(X A ¹B)] + 1

2Tr(B^T A ¹B) 8.1.7 Sum of two squared forms

In vector formulation (assuming ⌃₁, ⌃₂ are symmetric) 1

2(x m₁)^T ⌃₁ ¹(x m₁) (358) 1

2(x m₂)^T ⌃₂ ¹(x m₂) (359)

= 1

2(x m_c)^T ⌃_c ¹(x m_c) + C (360)

⌃_c ¹ = ⌃₁ ¹ + ⌃₂ ¹ (361)

m_c = (⌃₁ ¹ + ⌃₂ ¹) ¹(⌃₁ ¹m₁ + ⌃₂ ¹m₂) (362)

C = 1

2(m^T₁ ⌃₁ ¹ + m^T₂ ⌃₂ ¹)(⌃₁ ¹ + ⌃₂ ¹) ¹(⌃₁ ¹m₁ + ⌃₂ ¹m₂)(363) 1⇣

m^T ⌃ ¹m + m^T ⌃ ¹m ⌘

(364) ln _posterior(x | y) = 1

2(y Ax)^T _noise¹ (y Ax) + 1

2(x x₀)^T _prior¹ (x x₀)

= 1

2 y Ax ² ₁

noise

+ 1

2 x x₀ ² ₁

prior

(37)

ln _posterior(x | y) = 1

2 y Ax ² ₁

noise

+ 1

2 x x₀ ² ₁

prior

= 1

2 x x_{MAP H}

π _posterior ∼ N (x _MAP , H ⁻¹ )

(38)

Back to the problem…

(39)

Find x_MAP that satisfies:

minx

1

2 y f (x ) ² ₁

noise

+ 1

2 x x₀ ² ₁

prior

,

where the parameter-to-observable map f : Rⁿ R^m is is defined such that:

F(f (x ), x ) = D_v (f (x ), x ) dx = 0 v H_D¹ ( ), x L²( ), where

(u, x ) = (u, x ) dx t · u ds, (u, x ) = x

2(I_c d) x ln(J) +

2 ln(J)², F = X = I + u,

C = F^T F, I_C = tr(C),

J = detF.

(40)

from dolfin import *

mesh = UnitSquareMesh(32, 32)

!U = VectorFunctionSpace(mesh, "CG", 1) V = FunctionSpace(mesh, "CG", 1)

# solution

u = Function(U)

# test functions v = TestFunction(U)

# incremental solution du = TrialFunction(U)

mu = interpolate(Constant(1.0), V)

lmbda = interpolate(Constant(100.0), V)

!dims = mesh.type().dim() I = Identity(dims)

F = I + grad(u) C = F.T*F

J = det(F) Ic = tr(C)

!dims = mesh.type().dim() I = Identity(dims)

F = I + grad(u) C = F.T*F

J = det(F) Ic = tr(C)

phi = (mu/2.0)*(Ic - dims) - mu*ln(J) + (lmbda/

2.0)*(ln(J))**2 Pi = phi*dx

# gateux derivative with respect to u in direction v F = derivative(Pi, u, v)

# and with respect to u in direction du J = derivative(F, u, du)

!u_h = Function(U)

F_h = replace(F, {u: u_h}) J_h = replace(J, {u: u_h})

solve(F_h == 0, u_h, bcs, J=J_h)

(41)

u_obs << File(“observations.xdmf”)

J = Functional(inner(u - u_obs, u - u_obs)*dx + \ inner(mu, mu))

m = Control(mu)

J_hat = ReducedFunctional(J, m) ...

dJdm = Jhat.derivative()[0]

H = Jhat.hessian(dm)[0]

(42)

Wait a second…

x _MAP

x

cov(x | y)

(43)

x _MAP

x _CM

cov(x | y)

H ⁻¹ (x _MAP )

(44)

π _posterior ∼ N (x MAP , H ⁻¹ )

π _posterior ^approx ∼ N (x MAP , H ⁻¹ (x _MAP ))

x_MAP

x

cov(x | y)

H⁻¹(x_MAP)

(45)

Solving

Strategy: Use hooks in dolfin-adjoint to solve with petsc4py-based contexts.

(46)

• Parameter-to-observable map. Newton-Krylov method.

• Inner solve with GAMG preconditioned GMRES (KSP) with near-null-space set.

• Newton with second-order backtracking line search (SNES).

(47)

• 20% extension test, 16 Core Xeon, 1.12 million cells, ~29 secs to residual of 1E-10.

Colin27 brain atlas

(48)

• MAP estimator.

• Bound constrained Quasi-Newton BLMVM with More-Thuente line search (TAO).

• No Riesz map.

(49)

• Principal Component Analysis.

• Trailing: BLOPEX Locally Optimal Block Preconditioned Gradient method (SLEPc/

BLOPEX).

• Leading: Krylov Schur (SLEPc).

(50)

(51)

(52)

(53)

(54)

(55)

Back to uncertainty

quantification…

(56)

(57)

(58)

Q: What can we infer about the parameters inside the domain, just from displacement observations on the

outside?

Q: Which parameters am I most uncertain about?

(59)

π _posterior ∼ N (x MAP , H ⁻¹ )

π _posterior ^approx ∼ N (x MAP , H ⁻¹ (x _MAP ))

x_MAP

x_CM

cov(x | y)

H⁻¹(x_MAP)

(60)

H ∈ R ⁿ ^×n

Big Dense

Expensive to calculate Only have action

(61)

H = QΛQ ^T

Γ _post = H ⁻¹ = Q ^T Λ ⁻¹ Q

(62)

(63)

Trailing Eigenvector

Direction in parameter space least constrained by the observations

(64)

x _map

(65)

(66)

(67)

Leading Eigenvectors

Direction in parameter space most constrained by the observations

(68)

(69)

(70)

(71)

(72)

(73)

Matches trends from Flath et al. p424 for linear parameter to observable maps.

Γ _prior

Γ_post

(74)

Full Hessian.

4000+ actions.

2000 to calculate H. 2000 to extract.

Partial Hessian.

501 actions for 292 for leading.

209 to calculate H. 292 to extract.

Huge savings in computational cost.

Scales with model dimension.

Implement low-rank update.

(75)

Summary

• We are developing methods to access uncertainty in the recovered parameters in hyperelastic materials.

• This is done within the framework of Bayesian inversion.

• dolfin-adjoint makes assembling the equations relatively easy, solving them is tougher.

• Next steps: efficient low-rank updates, Hamiltonian MCMC.

A Bayesian inversion approach to recovering material parameters in hyperelastic solids using dolfin-adjoint