HAL Id: hal-01437064
https://hal.archives-ouvertes.fr/hal-01437064
Preprint submitted on 17 Jan 2017
Recent Advances in Approximate Bayesian Computation
Methodology Application in structural dynamics
Anis Ben Abdessalem, Keith Worden, Nikolaos Dervilis, David Wagg
a.b.abdessalem@sheffield.ac.uk
Dynamics Research Group, University of Sheffield. Bristol, January 9th, 2017
Outline
1 Aims
2 The general Bayesian approach
3 ABC-SMC for model selection
4 ABC-NS as a new alternative
Aims
The aims of this talk can be summarised in the following points:
1 Introduce ABC as a promising alternative for model selection and parameter estimation.
2 Introduce a new variant of the ABC algorithms: ABC-NS.
3 Compare ABC-NS with ABC-SMC.
4 Illustrate the efficiency of ABC-NS through a number of examples.
Model selection
Strike the right balance between goodness-of-fit and model complexity: the parsimony principle.
The general Bayesian framework
Bayes' theorem: given a model M, observed data Dobs and a set of parameters θ:

π(θ|Dobs, M) = f(Dobs|θ, M) π(θ|M) / π(Dobs|M) ∝ f(Dobs|θ, M) π(θ|M)

where:
- f(Dobs|θ, M) is the likelihood function,
- π(θ|M) is the prior distribution,
- π(Dobs|M) is a normalisation constant (the marginal likelihood).
Model assessment and comparison
Information criteria:

IC = D̄ + pD

where D̄, the deviance, measures the fit of the model and pD penalises its complexity, with D̄ = −2 log LN(Mℓ, θ̂ℓ) evaluated at the MLE θ̂ℓ.

Criterion   Penalty term pD for model Mℓ
AIC         dMℓ
AICc        N dMℓ / (N − dMℓ − 1)
BIC         0.5 log(N) dMℓ

(dMℓ is the number of parameters of model Mℓ and N the number of data points.)
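The table above can be made concrete in a short sketch (the code and function name are mine, not part of the talk; the penalty terms follow the slide's conventions, with d the number of model parameters and n the number of data points):

```python
import math

def information_criteria(log_lik, d, n):
    """Deviance-plus-penalty information criteria, using the penalty
    terms listed on this slide (AIC: d, AICc: n*d/(n-d-1),
    BIC: 0.5*log(n)*d)."""
    deviance = -2.0 * log_lik  # deviance D = -2 log L at the MLE
    return {
        "AIC": deviance + d,
        "AICc": deviance + n * d / (n - d - 1),
        "BIC": deviance + 0.5 * math.log(n) * d,
    }
```

The model with the smallest value of the chosen criterion is preferred.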
Model assessment and comparison
The ICs offer a way to estimate the generalisation performance of a model: the model with the smallest IC value should be preferred.
The ICs nevertheless have drawbacks: they are evaluated at a single point, the MLE θ̂ℓ, rather than over the whole posterior.
ABC principle
In the ABC framework, the posterior distribution is given by:

πABC(θ|Dsim, Dobs, M) ∝ fw(Dobs|Dsim, θ, M) f(Dsim|θ, M) π(θ|M)

where:
- Dsim is the simulated data,
- fw(Dobs|Dsim, θ, M) is the weighting function, a so-called indicator function:

  fw(Dobs|Dsim, θ, M) ∝ I(d(Dobs, Dsim|θ) ≤ ε, M)

- d(Dobs, Dsim|θ) is a distance measure between the observed data and the simulated data.
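The indicator-function formulation above is the classic ABC rejection sampler. A minimal sketch (the function names and the toy usage are illustrative, not the authors' code):

```python
import numpy as np

def abc_rejection(simulate, prior_sample, distance, d_obs, eps, n_particles):
    """Keep prior draws whose simulated data lie within eps of the
    observed data -- the indicator function f_w in the ABC posterior."""
    accepted = []
    while len(accepted) < n_particles:
        theta = prior_sample()             # theta* ~ pi(theta | M)
        d_sim = simulate(theta)            # D_sim ~ f(. | theta, M)
        if distance(d_obs, d_sim) <= eps:  # I(d(D_obs, D_sim) <= eps)
            accepted.append(theta)
    return np.array(accepted)

# Toy usage: recover the location parameter of a deterministic "simulator".
rng = np.random.default_rng(0)
posterior = abc_rejection(simulate=lambda th: th,
                          prior_sample=lambda: rng.uniform(-1.0, 1.0),
                          distance=lambda a, b: abs(a - b),
                          d_obs=0.5, eps=0.1, n_particles=100)
```

Every accepted particle lies within the tolerance band around the observed data, so the sample approximates the ABC posterior.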
ABC-SMC principle
[Schematic: competing models M1(θ1, x1), M2(θ2, x2), ..., Mn(θn, xn) feed a simulator; the particles are filtered through a decreasing sequence of tolerance thresholds ε1 > ε2 > ... > εT, and the surviving particles at εT give the model probability estimates p̂(Mi).]
Hyperparameters to be defined:
- Δ(u, u*): distance to measure the degree of similarity.
- The sequence of tolerance thresholds: ε1 > ε2 > ... > εT.
- The type of kernels used to move particles and models.
- N, the number of particles per population.
- The stopping criterion.

The efficiency of the algorithm depends heavily on the selection of these tuning parameters; the objective is always to strike the right balance between computational cost and acceptable precision.
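The scheme can be sketched for a single model as follows (a sketch under stated assumptions: the uniform perturbation kernel, its width and the simplified equal re-weighting are my illustrative choices, not the authors' settings):

```python
import numpy as np

def abc_smc(simulate, prior_sample, prior_pdf, distance, d_obs,
            eps_schedule, n_particles, kernel_width=0.1, seed=0):
    """Sequential ABC: each population is filtered at a tighter tolerance,
    with particles resampled and perturbed from the previous population."""
    rng = np.random.default_rng(seed)
    # Population 1: plain ABC rejection at the loosest tolerance.
    particles, weights = [], []
    while len(particles) < n_particles:
        theta = prior_sample(rng)
        if distance(d_obs, simulate(theta, rng)) <= eps_schedule[0]:
            particles.append(theta)
            weights.append(1.0)
    particles = np.array(particles)
    weights = np.array(weights) / n_particles
    # Subsequent populations: resample, perturb, filter at the next eps.
    for eps in eps_schedule[1:]:
        new_particles = []
        while len(new_particles) < n_particles:
            idx = rng.choice(n_particles, p=weights)
            theta = particles[idx] + rng.uniform(
                -kernel_width, kernel_width, size=particles[idx].shape)
            if prior_pdf(theta) > 0 and \
                    distance(d_obs, simulate(theta, rng)) <= eps:
                new_particles.append(theta)
        particles = np.array(new_particles)
        # Simplified: full importance re-weighting omitted for brevity;
        # with uniform priors and a uniform kernel the weights are near-equal.
        weights = np.full(n_particles, 1.0 / n_particles)
    return particles
```

The acceptance rate typically falls as ε tightens, which is the cost that ABC-NS is designed to reduce.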
Example 1: Cubic and CQ models
→ Two competing models, each characterised by a set of parameters Θ:

M1 : m z'' + c z' + k z + k3 z^3 = f(t),
M2 : m z'' + c z' + k z + k3 z^3 + k5 z^5 = f(t).

Parameter   True value   Lower bound   Upper bound
m           1            0.1           10
c           0.05         0.005         0.5
k           50           5             500
k3          10^3         10^2          10^4
Example 1: Cubic and CQ models
Input force: Gaussian with μ = 0, σ = 10. [Figure: force time history over 5 s, amplitude roughly ±40 N.]
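Generating training data from M1 requires integrating the equation of motion. A fixed-step RK4 sketch (my illustrative integrator; the authors' solver and settings are not given, and the force is held constant over each step):

```python
import numpy as np

def simulate_cubic(theta, t, force, z0=0.0, v0=0.0):
    """Integrate M1: m*z'' + c*z' + k*z + k3*z**3 = f(t).
    theta = (m, c, k, k3); force is f sampled at the (uniform) times t."""
    m, c, k, k3 = theta
    dt = t[1] - t[0]

    def accel(z, v, f):
        # z'' from the equation of motion
        return (f - c * v - k * z - k3 * z ** 3) / m

    z, v = z0, v0
    out = np.empty_like(t)
    for i, f in enumerate(force):
        out[i] = z
        # one RK4 step for the state (z, v), zero-order hold on f
        k1z, k1v = v, accel(z, v, f)
        k2z, k2v = v + 0.5 * dt * k1v, accel(z + 0.5 * dt * k1z, v + 0.5 * dt * k1v, f)
        k3z, k3v = v + 0.5 * dt * k2v, accel(z + 0.5 * dt * k2z, v + 0.5 * dt * k2v, f)
        k4z, k4v = v + dt * k3v, accel(z + dt * k3z, v + dt * k3v, f)
        z += dt * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return out
```

M2 only differs by the extra k5 z^5 restoring term inside `accel`.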
Example 1: Cubic and CQ models
→ Selection of the ABC-SMC (Ref. 1) hyperparameters:
- Number of particles N = 1000.
- Equal prior model probabilities: p(M1) = p(M2) = 1/2.
- Uniform priors on the model parameters.
- Uniform kernels to move the particles.
- The tolerance threshold sequence is set to ε_{i=1:19} = (100, 80, ..., 0.05, 0.03).
- The normalised MSE is used to measure the level of agreement:

  Δ(u_obs, u_sim) = 100/(p σ_u²) Σ_{ℓ=1}^{p} (u_obs^ℓ − u_sim^ℓ)²
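The distance above translates directly into code (a sketch; the 100/(p σ_u²) scaling is the usual normalised-MSE convention, reconstructed from the slide):

```python
import numpy as np

def nmse(u_obs, u_sim):
    """Normalised mean-square error used as the ABC distance:
    100/(p * var(u_obs)) * sum of squared residuals over p samples."""
    u_obs = np.asarray(u_obs, dtype=float)
    u_sim = np.asarray(u_sim, dtype=float)
    p = u_obs.size
    return 100.0 / (p * np.var(u_obs)) * np.sum((u_obs - u_sim) ** 2)
```

A value of 0 means a perfect match; values below a few percent usually indicate excellent agreement in structural-dynamics applications.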
Example 1: Cubic and CQ models
Model posterior probabilities: [Figure: bar charts of p(M1) and p(M2) over the populations p = 1, ..., 19 as ε decreases from 100 to 0.03; the posterior mass progressively concentrates on the CQ model M2.]
Example 1: Cubic and CQ models
Histograms of the model parameters: [Figure: posterior histograms of m, c, k, k3 and k5, centred near the true values.]
Example 1: Cubic and CQ models
Model parameter estimates:

Parameter   True    μ           σ          95% CI
m           1       0.9965      0.0105     [0.97, 1.014]
c           0.05    0.0505      0.0062     [0.0406, 0.0611]
k           50      49.5090     2.0055     [46.25, 52.84]
k3          10^3    1.069×10^3  393.23     [430.38, 1.72×10^3]
k5          10^5    9.58×10^4   1.96×10^4  [6.49×10^4, 1.29×10^5]

Model prediction: excellent agreement with the observed data.
Example 1: Cubic and CQ models
Significant decrease of the acceptance rate: [Figure: the acceptance rate falls sharply as the tolerance threshold decreases.]
ABC based on Nested Sampling
The ABC-NS algorithm is based on the concept of the nested sampling (NS) algorithm proposed by Skilling (Ref. 2) and the ellipsoidal sampling technique proposed by Mukherjee et al. (Ref. 3).
The implementation of the algorithm requires the selection of a number of hyperparameters (the values used here are given in brackets):
- β0: proportion of the alive particles (β0 = 0.6).
- α0: proportion of the dead particles (α0 = 0.3).
- f0: enlargement factor (f0 = 1.1).
The first iteration of the ABC-NS algorithm is described in the pseudocode on the next slide. In each subsequent iteration, (1 − β0)N particles are sampled, subject to the constraint Δ(u, u*) < ε2, inside the ellipse defined by the covariance matrix and mean value estimated from the alive particles.
ABC based on Nested Sampling
Principle of the ABC-NS algorithm:

Algorithm 1 ABC-NS sampler
Require: u: observed data, M(·), ε1, N
1: while i ≤ N do
2:   repeat
3:     Sample θ* from the prior distribution π(·)
4:     Simulate u* using the model M(·)
5:   until Δ(u, u*) ≤ ε1
6:   Set Θi = θ*, ei = Δ(u, u*)
7:   Set i = i + 1
8: end while
9: Sort e in ascending order; set ε2 = e(α0 N), α0 ∈ [0.2, 0.3]
10: Weight the particles: ωi ∝ (1/ε1)(1 − (ei/ε1)²)
11: Remove the dead particles (those with Δ(u, u*) ≥ ε2): ω_{j=1:α0 N} = 0
12: Normalise the weights such that Σ_{i=1}^{(1−α0)N} ωi = 1
13: Select β0 N alive particles, β0 ∈ [0.5, 1 − α0]
14: Define the ellipse by its covariance matrix and mean value: E1 = {B1, μ1}
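One population of Algorithm 1 can be sketched as follows (a sketch: the helper names, the resampling details and the interpretation of the dead-particle removal are mine, not the authors' code; here the α0 N particles with the largest distances are dropped and the largest surviving distance becomes ε2):

```python
import numpy as np

def abc_ns_population(simulate, prior_sample, distance, u_obs, eps1, n,
                      alpha0=0.3, beta0=0.6, f0=1.1, seed=0):
    """One ABC-NS population: rejection step, tolerance update,
    Epanechnikov-style weighting, and the bounding ellipse for the
    next iteration's proposals."""
    rng = np.random.default_rng(seed)
    thetas, dists = [], []
    while len(thetas) < n:                        # lines 1-8: rejection step
        theta = prior_sample(rng)
        d = distance(u_obs, simulate(theta, rng))
        if d <= eps1:
            thetas.append(theta)
            dists.append(d)
    thetas, dists = np.array(thetas), np.array(dists)
    # lines 9 and 11: drop the alpha0*N particles with the largest
    # distances; the largest surviving distance is the new tolerance eps2.
    keep = np.argsort(dists)[: int(round((1 - alpha0) * n))]
    thetas, dists = thetas[keep], dists[keep]
    eps2 = dists.max()
    # lines 10 and 12: weight survivors, w_i ~ (1/eps1)(1 - (e_i/eps1)^2)
    w = (1.0 / eps1) * (1.0 - (dists / eps1) ** 2)
    w = w / w.sum()
    # line 13: select beta0*N alive particles by weighted resampling
    alive = thetas[rng.choice(len(thetas), size=int(round(beta0 * n)), p=w)]
    # line 14: ellipse E1 = {B1, mu1}, enlarged by the factor f0
    mu1 = alive.mean(axis=0)
    B1 = f0 * np.cov(alive, rowvar=False)
    return alive, eps2, mu1, B1

def draw_from_ellipsoid(mu, B, rng):
    """Uniform draw inside {x : (x - mu)^T B^{-1} (x - mu) <= 1}; used to
    propose the (1 - beta0)N replacement particles at the next iteration."""
    dim = len(mu)
    L = np.linalg.cholesky(B)
    x = rng.normal(size=dim)
    x = x / np.linalg.norm(x)            # uniform direction on the sphere
    r = rng.uniform() ** (1.0 / dim)     # radius giving uniform volume density
    return mu + L @ (r * x)
```

Because proposals come from a shrinking ellipse rather than the full prior, the acceptance rate stays high as the tolerance tightens.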
Example 2: ABC-NS for parameter estimation
The equation of motion is given by:

m y'' + c y' + k y + k3 y^3 = f(t)

Objective: identify the unknown parameters Θ = (k, k3) from the training data.
- m = 1 (known).
- c = 0.05 (known).
- k ~ U(20, 80).
- k3 ~ U(500, 1500).
- Training data: noise-free.
Example 2: ABC-NS for parameter estimation
Illustration of the ABC-NS algorithm: [Animation over successive populations: the particle cloud in the (k, k3) plane contracts around the true values as the tolerance decreases and the bounding ellipse shrinks.]
Example 2: ABC-NS for parameter estimation
Comparison between the acceptance rates: [Figure: acceptance rate against iteration number for ABC-NS and ABC-SMC.]
Example 3: ABC-NS for parameter estimation
The equation of motion is given by:

m y'' + c y' + k y + k3 y^3 = f(t)

Objective: identify the unknown parameters Θ = (c, k, k3) from the training data (noise-free).
Example 3: ABC-NS for parameter estimation
Convergence of the ABC-NS algorithm: [Figure: by population p = 50 (ε50 = 2.25×10⁻⁴), the particles concentrate around c ≈ 0.05, k ≈ 50 and k3 ≈ 1000.]
Example 3: ABC-NS for parameter estimation
Histograms of the model parameters using ABC-NS: [Figure: posterior histograms of c, k and k3, centred on 0.05, 50 and 1000.]
Histograms of the model parameters using ABC-SMC: [Figure.]
Example 3: ABC-NS for parameter estimation
Comparison between the acceptance rates: [Figure: acceptance rate (×100) against iteration number (0 to 50) for ABC-NS and ABC-SMC, including the panel labelled "3 unknown parameters"; ABC-NS sustains a markedly higher acceptance rate.]
Example 4: ABC-NS for model selection
Application of ABC-NS for model selection. → Two competing models, each characterised by a set of parameters θ:

M1 : m z'' + c z' + k z = f(t),
M2 : m z'' + c z' + k z + k3 z^3 = f(t).

→ Objective: identify the most likely model.
The training data were synthetically generated from the cubic model (noise-free).
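The model-selection idea can be sketched in its simplest, rejection-based form (a sketch: this is the plain rejection version with equal prior model probabilities, not the ABC-NS machinery, and the names are mine):

```python
import numpy as np

def abc_model_selection(simulators, prior_samplers, distance, u_obs,
                        eps, n_particles, seed=0):
    """Draw a model index uniformly, then parameters from that model's
    prior; the accepted indices estimate the model posterior
    probabilities p_hat(M_i | data)."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(len(simulators))
    while counts.sum() < n_particles:
        m = rng.integers(len(simulators))   # equal prior model probabilities
        theta = prior_samplers[m](rng)
        if distance(u_obs, simulators[m](theta, rng)) <= eps:
            counts[m] += 1
    return counts / counts.sum()
```

A model that cannot reproduce the data within the tolerance accumulates no particles and is effectively eliminated, which is what happens to the linear model below.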
Example 4: ABC-NS for model selection
Evolution of the number of particles for the linear model: [Animation: the number of particles assigned to the linear model M1 decreases over successive populations until the model is eliminated.]
Model selection using ABC-NS
Example 4: ABC-NS for model selection
After a few iterations, the linear model is eliminated and the algorithm converges to the right model. [Figure: model posterior probabilities over the first few populations.]
Example 4: ABC-NS for model selection
Histograms of the model parameters using ABC-NS: [Figure: posterior histograms of c, k and k3, very tightly centred on 0.05, 50 and 1000.]
Histograms of the model parameters using ABC-SMC: [Figure.]
Example 4: ABC-NS for model selection
Comparison between the acceptance rates and CPU time: [Figure.]
Conclusions
1 The parsimony principle is well embedded in the ABC algorithm.
2 Both ABC-SMC and ABC-NS provide good parameter estimates.
3 ABC-NS outperforms ABC-SMC in terms of computational efficiency.
4 ABC-NS can be applied simultaneously to parameter estimation and model selection.
5 ABC-NS requires fewer parameters to be tuned than ABC-SMC.
To run the ABC-NS algorithm, the Matlab files calc_ellipsoid.m and draw_from_ellipsoid.m are used to construct and to sample from the bounding ellipsoid.
Future work
- Compare ABC-NS with other variants of the ABC algorithms.
- Introduce the use of ABC-NS for real applications.
- Extend the use of ABC to infer systems with complex behaviours and systems with larger datasets by extracting useful features.
- More details and examples about the implementation of the ABC-NS algorithm …
References
1. T. Toni, D. Welch, N. Strelkowa, A. Ipsen and M.P. Stumpf. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. Journal of the Royal Society Interface, 6:187–202, 2009.
2. J. Skilling. Nested sampling for general Bayesian computation. Bayesian Analysis, 1(4):833–860, 2006.
3. P. Mukherjee, D. Parkinson and A.R. Liddle. A nested sampling algorithm for cosmological model selection. The Astrophysical Journal Letters, 638(2):L51, 2006.
4. F. Feroz, M.P. Hobson and M. Bridges. MULTINEST: an efficient and robust Bayesian inference tool for cosmology and particle physics. Monthly Notices of the Royal Astronomical Society, 398(4):1601–1614, 2009.
Acknowledgement
The authors gratefully acknowledge the support which has been provided through the EPSRC Programme Grant “Engineering Nonlinearity” (EP/K003836/1).