Collaborators: Patrick E. Farrell, Stéphane P. A. Bordas

(1)

Using Bayes’ theorem to infer the material parameters of human tissue

Jack S. Hale



University of Ghent, Bayesian Afternoon Seminar

(2)

Overview

• Motivation.

• Bayesian approach to inversion.

• Relate classical optimisation techniques to the Bayesian inversion approach.

• Using a domain specific language for variational forms to solve the equations.

• Low-rank updates to deal with high-dimensional posterior covariance.

(3)

Motivation.

(4)

(5)

(6)

(7)

Q: What can we infer about the parameters inside the domain, just from displacement observations on the outside?

Q: Which parameters am I most uncertain about?

(8)

Framework

(9)

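$$\pi_{\text{posterior}}(x \mid y) \propto \pi_{\text{likelihood}}(y \mid x)\, \pi_{\text{prior}}(x)$$

Parameter map: $x \mapsto G(x)$, with additive noise $y = G(x) + E$, compared against the observations $y_{\text{obs}}$.

$$E \sim N(0, \Gamma_{\text{noise}}), \qquad X \sim N(\bar{x}, \Gamma_{\text{prior}})$$

$$\pi_{\text{posterior}}(x \mid y) \propto \exp\left( -\frac{1}{2} \| y - G(x) \|^2_{\Gamma_{\text{noise}}^{-1}} - \frac{1}{2} \| x - \bar{x} \|^2_{\Gamma_{\text{prior}}^{-1}} \right)$$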

Inference
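A minimal sketch of the exponent above, assuming diagonal covariances and a cheap toy forward map. The names `neg_log_posterior` and `toy_forward` and all numerical values are illustrative assumptions, not taken from the talk.

```python
import math


def neg_log_posterior(x, y_obs, forward, noise_var, x_prior, prior_var):
    """Negative log-posterior up to an additive constant (diagonal covariances)."""
    y = forward(x)
    misfit = sum((yo - yi) ** 2 / noise_var for yo, yi in zip(y_obs, y))
    penalty = sum((xi - xb) ** 2 / prior_var for xi, xb in zip(x, x_prior))
    return 0.5 * misfit + 0.5 * penalty


def toy_forward(x):
    # Hypothetical stand-in for G: two parameters mapped to one observable.
    return [x[0] * math.exp(-x[1])]


x_true = [2.0, 0.5]
y_obs = toy_forward(x_true)  # noise-free synthetic observation

# At the true parameters only the prior term contributes.
phi_true = neg_log_posterior(x_true, y_obs, toy_forward, 0.01, [1.0, 1.0], 1.0)
# Far from the data both the misfit and the prior penalty grow.
phi_far = neg_log_posterior([5.0, 3.0], y_obs, toy_forward, 0.01, [1.0, 1.0], 1.0)
print(phi_true, phi_far)
```

Lower values correspond to higher posterior density, so a sampler or optimiser only ever needs this one scalar function plus, ideally, its derivatives.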

(10)

G : X → Y

Inverse Problems: A Bayesian perspective.

Stuart, Acta Numerica (2010).

Contribution: Bayesian inverse problems in an infinite-dimensional setting. When is a Bayesian inverse problem well-posed?

(11)

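The displacements y for a given material parameter x are defined by the minimum point of the following Lagrangian:

$$L(y, x) = \int_{B_0} \psi(y, x) \, dx - \int_{\partial B_0} t \cdot y \, ds$$

where the energy density functional is defined through the following equations:

$$\psi(y, x) = \frac{x}{2} (I_C - d) - x \ln(J) + \frac{\lambda}{2} \ln(J)^2,$$

$$F = \frac{\partial \varphi}{\partial X} = I + \nabla y, \quad C = F^T F, \quad I_C = \mathrm{tr}(C), \quad J = \det F.$$

Deformation map: $\varphi : B_0 \to B$.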

(12)

Even once discretised (Finite Element Method)

Colin27 brain atlas

20% extension test, 16 Core Xeon, 1.12 million cells, ~29 secs.

$G_h : \mathbb{R}^n \to \mathbb{R}^m$

$n = 1{,}112{,}000$

(13)

Problems

• Evaluating parameter-to-observable map is very expensive.

• Discretised parameter space can be very large.

• Outcome: Exploring posterior with ‘traditional sampling’ is not going to work.
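A back-of-the-envelope estimate of the sampling cost, assuming one forward solve per sample at the 29 s quoted for the brain model. The chain length of 100,000 samples is an assumed figure for illustration only.

```python
seconds_per_solve = 29.0   # forward solve time quoted for the brain model
n_samples = 100_000        # assumed chain length, for illustration only

# Sequential cost of one forward solve per sample, converted to days.
total_days = seconds_per_solve * n_samples / 86_400.0
print(f"{total_days:.1f} days of sequential forward solves")
```

Under these assumptions the chain costs roughly a month of compute, before accounting for rejected proposals or slow mixing in a parameter space with around a million dimensions.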

(14)

Solutions

1. Connect Bayesian approach to ideas from classical optimisation. Using derivatives of posterior in parameter-space (Girolami).

2. Exploiting low-rank structure of prior to posterior covariance updates (Flath 2012, Spantini 2015).

(15)

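$$\pi_{\text{posterior}}(x \mid y) \propto \pi_{\text{likelihood}}(y \mid x)\, \pi_{\text{prior}}(x)$$

$$x_{\text{MAP}} = \arg\max_{x \in \mathbb{R}^n} \pi_{\text{posterior}}(x \mid y)$$

[Figure labels: $x_{\text{MAP}}$ (maximum a posteriori point), $x_{\text{CM}}$ (conditional mean), $\mathrm{cov}(x \mid y)$]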

(16)

Linearise

$$y = Ax$$

$$-\ln \pi_{\text{posterior}}(x \mid y) = \frac{1}{2} \| y - Ax \|^2_{\Gamma_{\text{noise}}^{-1}} + \frac{1}{2} \| x - x_0 \|^2_{\Gamma_{\text{prior}}^{-1}}$$

(17)

Take the derivative

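$$g(x_{\text{MAP}}) := \nabla_x \left[ \frac{1}{2} \| y - Ax \|^2_{\Gamma_{\text{noise}}^{-1}} + \frac{1}{2} \| x - x_0 \|^2_{\Gamma_{\text{prior}}^{-1}} \right]_{x = x_{\text{MAP}}}$$

$$= -A^T \Gamma_{\text{noise}}^{-1} (y - A x_{\text{MAP}}) + \Gamma_{\text{prior}}^{-1} (x_{\text{MAP}} - x_0) = 0$$

$$x_{\text{MAP}} = \left( \Gamma_{\text{prior}}^{-1} + A^T \Gamma_{\text{noise}}^{-1} A \right)^{-1} \left( A^T \Gamma_{\text{noise}}^{-1} y + \Gamma_{\text{prior}}^{-1} x_0 \right)$$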
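A small numerical check of the linear MAP formula, assuming the standard sign convention in which the system matrix is $\Gamma_{\text{prior}}^{-1} + A^T \Gamma_{\text{noise}}^{-1} A$ and the right-hand side uses the inverse covariances. The matrix `A`, the dimensions and the variances are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3                                  # more parameters than observations
A = rng.standard_normal((m, n))              # hypothetical linear forward map
gamma_noise_inv = np.eye(m) / 0.1 ** 2       # inverse noise covariance
gamma_prior_inv = np.eye(n) / 1.0 ** 2       # inverse prior covariance
x0 = np.zeros(n)                             # prior mean
y = A @ rng.standard_normal(n)               # synthetic, noise-free observations

# Normal equations for the MAP point: H x_map = rhs.
H = gamma_prior_inv + A.T @ gamma_noise_inv @ A
rhs = A.T @ gamma_noise_inv @ y + gamma_prior_inv @ x0
x_map = np.linalg.solve(H, rhs)

# The gradient of the negative log-posterior should vanish at x_map.
grad = -A.T @ gamma_noise_inv @ (y - A @ x_map) + gamma_prior_inv @ (x_map - x0)
print(np.linalg.norm(grad))
```

Solving the linear system rather than forming the inverse explicitly is the usual choice; at the scale of the brain model neither is feasible with dense matrices, which is what motivates the matrix-free and low-rank ideas later in the talk.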

(18)

MAP estimate. Bound-constrained Quasi-Newton BLMVM

(19)

and the second derivative…

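$$H := \nabla_x g = \Gamma_{\text{prior}}^{-1} + A^T \Gamma_{\text{noise}}^{-1} A$$

$$x_{\text{MAP}} = \left( \Gamma_{\text{prior}}^{-1} + A^T \Gamma_{\text{noise}}^{-1} A \right)^{-1} \left( A^T \Gamma_{\text{noise}}^{-1} y + \Gamma_{\text{prior}}^{-1} x_0 \right)$$

$$\pi_{\text{posterior}} \sim N(x_{\text{MAP}}, H^{-1})$$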

(After a fair bit of manipulation…)
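One way to fill in the manipulation, as a sketch assuming the linear model $y = Ax$ and the Hessian $H = \Gamma_{\text{prior}}^{-1} + A^T \Gamma_{\text{noise}}^{-1} A$. The shorthand $b$ and the constants $c$, $c'$ are introduced here only for the derivation.

```latex
\begin{aligned}
-\ln \pi_{\text{posterior}}(x \mid y)
  &= \tfrac{1}{2} (y - Ax)^T \Gamma_{\text{noise}}^{-1} (y - Ax)
   + \tfrac{1}{2} (x - x_0)^T \Gamma_{\text{prior}}^{-1} (x - x_0) \\
  &= \tfrac{1}{2} x^T H x - x^T b + c,
  \qquad b := A^T \Gamma_{\text{noise}}^{-1} y + \Gamma_{\text{prior}}^{-1} x_0 \\
  &= \tfrac{1}{2} (x - x_{\text{MAP}})^T H (x - x_{\text{MAP}}) + c',
  \qquad \text{using } H x_{\text{MAP}} = b .
\end{aligned}
```

The last line is the exponent of a Gaussian with mean $x_{\text{MAP}}$ and covariance $H^{-1}$, where $c$ and $c'$ collect terms independent of $x$.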

(20)

MAP estimate

[Figure labels: $x_{\text{MAP}}$, $x$, $\mathrm{cov}(x \mid y)$]

(21)

[Figure labels: $x_{\text{MAP}}$, $x_{\text{CM}}$, $\mathrm{cov}(x \mid y)$, $H^{-1}(x_{\text{MAP}})$]

(22)

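$$\pi_{\text{posterior}} \sim N(x_{\text{MAP}}, H^{-1})$$

$$\pi_{\text{posterior}}^{\text{approx}} \sim N(x_{\text{MAP}}, H^{-1}(x_{\text{MAP}}))$$

[Figure labels: $x_{\text{MAP}}$, $x$, $\mathrm{cov}(x \mid y)$, $H^{-1}(x_{\text{MAP}})$]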
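A scalar sketch of this Gaussian approximation for a nonlinear map: find the MAP point, then use the inverse Hessian there as the approximate posterior variance. The forward map `exp(x)`, the observation and the variances are hypothetical. The Gauss-Newton iteration is a swapped-in, simple optimiser and not the BLMVM method used in the talk.

```python
import math

y_obs, noise_var = 1.5, 0.01     # hypothetical observation and noise variance
x_prior, prior_var = 0.0, 1.0    # hypothetical Gaussian prior


def forward(x):
    return math.exp(x)           # toy nonlinear parameter-to-observable map


def gradient(x):
    g = forward(x)
    return -g * (y_obs - g) / noise_var + (x - x_prior) / prior_var


def hessian(x):
    g = forward(x)               # exact second derivative of the negative log-posterior
    return (2.0 * g * g - y_obs * g) / noise_var + 1.0 / prior_var


def gauss_newton_hessian(x):
    g = forward(x)               # drops the residual term, so it stays positive
    return g * g / noise_var + 1.0 / prior_var


x_map = x_prior
for _ in range(100):
    step = gradient(x_map) / gauss_newton_hessian(x_map)
    x_map -= step
    if abs(step) < 1e-12:
        break

posterior_var = 1.0 / hessian(x_map)   # Laplace approximation at the MAP point
print(x_map, posterior_var)
```

For a linear map the approximation is exact. For a nonlinear map it captures only the local curvature at $x_{\text{MAP}}$, which is the distinction the two formulas above draw.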

(23)

Tools

• The FEniCS Project is a collection of free software for the automated, efficient solution of differential equations using the finite element method.

• dolfin-adjoint automatically derives the discrete adjoint, tangent linear and higher-order adjoint models from a high-level description of the forward model.

http://fenicsproject.org

http://www.dolfin-adjoint.org

Wells, Logg, Rognes, Kirby and many, many others…

Farrell, Funke, Ham and Rognes.

2015 Wilkinson Prize for Numerical Software.

(24)

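The displacements y for a given material parameter x are defined by the minimum point of the following Lagrangian:

$$L(y, x) = \int_{B_0} \psi(y, x) \, dx - \int_{\partial B_0} t \cdot y \, ds$$

where the energy density functional is defined through the following equations:

$$\psi(y, x) = \frac{x}{2} (I_C - d) - x \ln(J) + \frac{\lambda}{2} \ln(J)^2,$$

$$F = \frac{\partial \varphi}{\partial X} = I + \nabla y, \quad C = F^T F, \quad I_C = \mathrm{tr}(C), \quad J = \det F.$$

Deformation map: $\varphi : B_0 \to B$.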

(25)

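from dolfin import *

mesh = UnitSquareMesh(32, 32)
U = VectorFunctionSpace(mesh, "CG", 1)
V = FunctionSpace(mesh, "CG", 1)

# solution
u = Function(U)
# test function
v = TestFunction(U)
# incremental solution
du = TrialFunction(U)

x = interpolate(Constant(1.0), V)
lmbda = interpolate(Constant(100.0), V)

dims = mesh.type().dim()
I = Identity(dims)
F = I + grad(u)
C = F.T*F
# first invariant of C; used in phi but not defined on the slide
Ic = tr(C)
J = det(F)

# energy density and total potential energy (no traction term here)
phi = (x/2.0)*(Ic - dims) - x*ln(J) + (lmbda/2.0)*(ln(J))**2
Pi = phi*dx

# Gateaux derivative with respect to u in direction v (F is reused for the residual)
F = derivative(Pi, u, v)
# and with respect to u in direction du (J is reused for the Jacobian)
J = derivative(F, u, du)

u_h = Function(U)
F_h = replace(F, {u: u_h})
J_h = replace(J, {u: u_h})

# bcs is not defined on the slide; clamping the left edge is an assumed placeholder
left = CompiledSubDomain("near(x[0], 0.0) && on_boundary")
bcs = DirichletBC(U, Constant((0.0, 0.0)), left)
solve(F_h == 0, u_h, bcs, J=J_h)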


(26)

(27)

(28)

Q: What can we infer about the parameters inside the domain, just from displacement observations on the outside?

Q: Which parameters am I most uncertain about?

(29)

[Figure: $x_{\text{MAP}}$]

(30)

Optimal low-rank updates

$\Gamma_{\text{prior}} \xrightarrow{\;y\;} \Gamma_{\text{posterior}}$

Critical idea: Observations are only informative on a low-dimensional subspace of the parameter space, relative to the prior. Spantini et al. (arXiv)

Förstner metric:

$$d_F^2(A, B) = \sum_i \ln^2(\sigma_i), \qquad A w_i = \sigma_i B w_i$$
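A small dense sketch of the idea for a linear map, assuming NumPy is available. The update keeps the leading generalized eigenpairs $(\delta_i^2, w_i)$ of the data-misfit Hessian relative to $\Gamma_{\text{prior}}^{-1}$ and subtracts $\sum_i \frac{\delta_i^2}{1 + \delta_i^2} w_i w_i^T$ from the prior covariance. The forward map `G`, the dimensions, the variances and the helper names are hypothetical.

```python
import numpy as np


def generalized_eigh(A, B):
    """Solve A w = s B w for symmetric A and SPD B via a Cholesky factor of B."""
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    s, v = np.linalg.eigh(Linv @ A @ Linv.T)
    return s, Linv.T @ v          # columns w_i are normalised so w_i^T B w_i = 1


def forstner_distance_sq(A, B):
    s, _ = generalized_eigh(A, B)
    return float(np.sum(np.log(s) ** 2))


rng = np.random.default_rng(1)
n, m = 50, 4                                     # many parameters, few observations
G = rng.standard_normal((m, n))                  # hypothetical linearised forward map
gamma_prior = np.diag(np.linspace(0.5, 2.0, n))
gamma_prior_inv = np.linalg.inv(gamma_prior)
H_misfit = G.T @ G / 0.1 ** 2                    # data-misfit Hessian, rank m

exact = np.linalg.inv(H_misfit + gamma_prior_inv)
delta2, W = generalized_eigh(H_misfit, gamma_prior_inv)


def low_rank_posterior(r):
    top = np.argsort(delta2)[::-1][:r]           # r most data-informed directions
    d, Wr = delta2[top], W[:, top]
    return gamma_prior - (Wr * (d / (1.0 + d))) @ Wr.T


d2 = forstner_distance_sq(low_rank_posterior(2), exact)
d4 = forstner_distance_sq(low_rank_posterior(4), exact)
print(d2, d4)
```

With four observations the misfit Hessian has rank four, so a rank-4 update reproduces the exact posterior covariance to rounding error, while a rank-2 update leaves a visible Förstner distance. At scale the eigenpairs would be computed matrix-free rather than with dense factorisations as here.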

(31)

Matrix-free Krylov-Schur (SLEPc).

(32)

(33)

Matches trends from Flath et al. p424 for linear parameter to observable maps.

[Figure labels: $\Gamma_{\text{prior}}$, $\Gamma_{\text{post}}$]

(34)

Trailing Eigenvector

Direction in parameter space least constrained by the observations

(35)

(36)

(37)

Leading Eigenvectors

Direction in parameter space most constrained by the observations

(38)

(39)

(40)

(41)


Full Hessian: 4000+ actions.

Low-rank update: 292 actions.

Huge savings in computational cost.

Scales with model dimension because observations stay the same.

(42)
