

2.6 Meta-modelling method

The numerical tools used by engineers to simulate the behaviour of a system can be very complex and expensive to evaluate. A single evaluation can sometimes take several hours or even several days.

Such tools are therefore avoided in methods such as optimisation or uncertainty propagation, since these methods require many evaluations of the analysis code and quickly become prohibitively time consuming.

One way to reduce the computation time significantly is to replace the complex computer code by a much simpler model. When the physics allows it, this model takes the form of a basic simplification of the computer code; otherwise it takes the form of an approximation, called a meta-model, obtained by evaluating f at selected design points and interpolating or smoothing the function values obtained [JCS01].

The design of a meta-model involves the following three phases:

1. Build an experimental design to generate a subset of design points. This can be done using sampling methods such as random sampling or Latin Hypercube Sampling (LHS),

2. Choose a family of models able to represent the data (polynomials, neural networks, kriging, etc.),

3. Fit the model to the subset of design points defined in step 1, using methods such as Least Squares Regression or Weighted Least Squares Regression (a minimal sketch of steps 1 and 3 is given after this list).
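As a minimal sketch of this workflow (illustrative only, not taken from the referenced works), the snippet below generates a basic Latin Hypercube Sample on the unit square, evaluates a hypothetical cheap stand-in for the expensive analysis code (here called expensive_code) and fits a linear polynomial by ordinary least squares:

    import numpy as np

    def latin_hypercube(n_samples, n_dims, seed=None):
        """Basic Latin Hypercube Sample on the unit hypercube [0, 1]^n_dims."""
        rng = np.random.default_rng(seed)
        # One random point per stratum, then shuffle the strata of each dimension
        cut = (np.arange(n_samples) + rng.random((n_dims, n_samples))) / n_samples
        for d in range(n_dims):
            rng.shuffle(cut[d])
        return cut.T                      # shape (n_samples, n_dims)

    def expensive_code(X):
        """Hypothetical cheap stand-in for the costly analysis code."""
        return np.sin(3 * X[:, 0]) + X[:, 1] ** 2

    # Step 1: experimental design
    X = latin_hypercube(20, 2, seed=0)
    y = expensive_code(X)

    # Steps 2 and 3: choose a linear polynomial model and fit it by least squares
    A = np.column_stack([np.ones(len(X)), X])       # regressors [1, x1, x2]
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    print("fitted coefficients b0, b1, b2:", coeffs)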

Figure 2.13: Idea of meta-modelling techniques

In this section, we present three of the most popular meta-modelling techniques which are:

• Polynomial regression models,

• Artificial neural networks,

• Kriging models.

2.6.1 Polynomial Regression Models

Polynomial Regression Models, commonly called Response Surface Models (RSM), are among the most popular methods used to fit a sample of data because of their simplicity and ease of implementation.

In RSM, the model is expressed as follows:

f(x) = P(x) + ε

where

• f(x) is the unknown function of interest,

• P(x) is a known polynomial function of x,

• ε is a random error term with ε ∼ N(0, σ²) (with σ² ∈ R).

The polynomial models involved are generally of low order, such as linear or quadratic polynomials. Linear polynomials are used as an approximation when the surface to be approximated presents little curvature. In this case, the model takes the form:

P(x) = b_0 + \sum_{i=1}^{n} b_i x_i.   (2.17)

Quadratic polynomials are used when the surface presents significant curvature. The quadratic polynomial model then takes the form:

P(x) = b_0 + \sum_{i=1}^{n} b_i x_i + \sum_{i=1}^{n} \sum_{j=i}^{n} b_{ij} x_i x_j.   (2.18)

The number of unknown coefficients in Equation (2.17) is n + 1 and the number of unknown coefficients in Equation (2.18) is (n+1)(n+2)/2. These coefficients are estimated using Least Squares Regression so as to fit the meta-model to the existing set of data. Moreover, to fit the meta-model, the sample size should be at least two or three times the number of unknown coefficients [JDC03].
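To make the fitting step concrete, the following sketch (illustrative only; the two-input training function is hypothetical, and the 20-point sample respects the two-to-three-times rule for the 6 coefficients of a quadratic model in two variables) estimates the coefficients of a quadratic RSM by least squares:

    import numpy as np

    def quadratic_basis(X):
        """Quadratic response-surface basis: 1, x_i, and x_i * x_j for j >= i."""
        n = X.shape[1]
        cols = [np.ones(len(X))]
        cols += [X[:, i] for i in range(n)]
        cols += [X[:, i] * X[:, j] for i in range(n) for j in range(i, n)]
        return np.column_stack(cols)

    # Hypothetical training data: 2 inputs, 20 design points
    rng = np.random.default_rng(0)
    X = rng.random((20, 2))
    y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] * X[:, 1] \
        + 0.01 * rng.standard_normal(20)

    # Least-squares estimation of the (n+1)(n+2)/2 = 6 coefficients
    A = quadratic_basis(X)
    b, *_ = np.linalg.lstsq(A, y, rcond=None)

    # The fitted RSM is cheap to evaluate at any new point
    x_new = np.array([[0.3, 0.7]])
    print("RSM prediction:", quadratic_basis(x_new) @ b)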

2.6.2 Artificial Neural Networks

The Artificial Neural Network concept, further denoted ANN, is inspired by the functioning of the human brain, where billions of neurons are interconnected by synapses.

An ANN is composed of neurons (also called units or single-unit perceptrons), which are nonlinear regression models that transform their input signals non-linearly to produce an output. ANN are well known for their ability to approximate any type of function and, in particular, to successfully model abrupt changes in a function. They are also appreciated for their capacity to learn.

An ANN is defined by:

• its architecture (Feed-Forward or Feed-Back network, number of neurons, number of inputs, . . . ),

• its learning process (e.g. weight estimation).

Feed-Forward ANN are networks in which signals travel only one way: from input to output. In contrast, Feed-Back ANN (a class of recurrent neural networks) are networks in which signals can travel in both directions, which can produce complex networks.

Figure 2.14 presents the general architecture of an ANN. It is composed of an input layer, a hidden layer and an output layer. It can have several hidden layers containing intermediate variables. In this example, the ANN has 3 neurons in its input layer, one hidden layer with 4 neurons and one neuron in its output layer.

Figure 2.15 gives more details on how an ANN works. In this figure, (x1, . . . , xn) are the inputs of a neuron in the input layer, (w1, . . . , wn) are the weights assigned to each input, f is an activation function and b is a bias.

Figure 2.14: General architecture of ANN

Figure 2.15: Details on the ANN working process

The neuron behaviour depends on the synaptic weights w and on the activation function f, whose role is to transform the inputs of the neuron non-linearly. The most commonly used activation functions are the Heaviside function, the sigmoid function and the linear function [REF], which are expressed as follows (a short numerical sketch is given after this list):

• Heaviside function:

f(x) = \begin{cases} 0 & \text{if } x < k \\ 1 & \text{if } x \geq k \end{cases}

where k ∈ R is a fixed threshold.

• Sigmoid function:

f(x) = \frac{1}{1 + e^{-\lambda x}}

which is a differentiable function, where λ is a parameter that governs the slope of the function.

• Linear function:

f(x) = x.
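The following sketch evaluates these three activation functions on the weighted sum of a single neuron, i.e. f(Σ wi xi + b), as in Figure 2.15. The input, weight, bias, threshold k and slope λ values are arbitrary illustrative choices, not taken from the referenced works:

    import numpy as np

    def heaviside(x, k=0.0):
        """Step activation: 0 below the threshold k, 1 at or above it."""
        return np.where(x < k, 0.0, 1.0)

    def sigmoid(x, lam=1.0):
        """Sigmoid activation; lam governs the slope around the origin."""
        return 1.0 / (1.0 + np.exp(-lam * x))

    def linear(x):
        """Identity activation."""
        return x

    def neuron_output(x, w, b, activation):
        """Single neuron: activation of the weighted sum of inputs plus a bias."""
        return activation(np.dot(w, x) + b)

    x = np.array([0.2, -0.5, 1.0])    # inputs (x1, ..., xn), arbitrary values
    w = np.array([0.8, 0.1, -0.3])    # synaptic weights (w1, ..., wn)
    b = 0.05                          # bias

    for f in (heaviside, sigmoid, linear):
        print(f.__name__, neuron_output(x, w, b, f))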

Once the architecture of the ANN is fixed for the approximation of a particular system, the network has to be trained: it is fitted to training data coming from the original complex problem by adjusting the values of the weights wi (which are the degrees of freedom of the ANN). The objective is to configure the network so that applying a given set of inputs produces the desired output. This training process is called the learning process. There are several types of learning: supervised, unsupervised and reinforcement learning.
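As an illustration of supervised learning of a feed-forward surrogate, the sketch below assumes scikit-learn is available and trains a small network on samples of a hypothetical stand-in for the expensive code; the architecture and training settings are arbitrary choices, not those of the referenced works:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def expensive_code(X):
        """Hypothetical cheap stand-in for the costly analysis code."""
        return np.sin(3 * X[:, 0]) * np.cos(2 * X[:, 1])

    # Training data sampled from the design space
    rng = np.random.default_rng(0)
    X_train = rng.random((200, 2))
    y_train = expensive_code(X_train)

    # Feed-forward network with one hidden layer of 20 neurons; supervised
    # learning adjusts the weights w_i to reduce the error on the training data.
    ann = MLPRegressor(hidden_layer_sizes=(20,), activation="tanh",
                       max_iter=5000, random_state=0)
    ann.fit(X_train, y_train)

    X_test = rng.random((5, 2))
    print("ANN surrogate:", ann.predict(X_test))
    print("true values:  ", expensive_code(X_test))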

The most popular ANN are:

• Multi-Layer Perceptron (MLP),

• Radial Basis Function (RBF),

• Support Vector Machine (SVM).

More details on these ANN can be found in [Bad05, BL00, VGS97, SS04].

2.6.3 Kriging Interpolation

Kriging interpolation is another approximation technique. It was first introduced by D.G. Krige in the field of spatial statistics and geostatistics [Kri51, Cre93]. Kriging models are often used to approximate response data coming from deterministic simulations.

In Kriging interpolation, the unknown model is expressed as follows [LS08]:

f(x) = P(x) + Z(x)   (2.19)

where:

• f(x) is the unknown function we look for,

• P(x) is a known polynomial regression model,

• Z(x) is the realisation of a stationary Gaussian random process with zero mean, variance σ² ∈ R and non-zero covariance.

In Equation (2.19), the term P(x) represents the global trend of the process. Depending on the form considered for P(x), the kriging process is said to be general, ordinary or simple. In the general case, also called universal kriging, the polynomial function P(x) is a linear weighted combination of n known functions r_i, with weight coefficients β_i:

P(x) = \sum_{i=1}^{n} \beta_i r_i(x).

In the ordinary case, P(x) is an unknown constant: P(x) = p ∈ R. Finally, in the simple case, P(x) is a known constant, for example P(x) = 0.

The term Z(x) creates "localised" deviations from the polynomial part P(x) so that the kriging model interpolates the observations at the sampling points. As a result, the output of the kriging model at the sampling points is equal to the real observations. The construction of the interpolation is based on the spatial covariance between points, expressed as follows:

Cov(Z(x), Z(x')) = \sigma^2 R(x, x')

where R is the correlation function such that

R(x, x') = \prod_{k=1}^{n} R_k(x_k, x'_k).

The correlation function is specified by the user, who can choose among a large set of correlation functions (see [SWMH89]). The most frequently used is the Gaussian correlation function:

R_k(|x_k − x'_k|) = e^{−θ_k (x_k − x'_k)²} where θ_k ∈ R⁺.

The kriging interpolation technique offers the advantage of providing, together with the prediction of the function, an estimate of the prediction error.
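The sketch below illustrates simple kriging (P(x) = 0) with the Gaussian correlation function above; the one-dimensional test function, the value of θ and the plug-in estimate of σ² are illustrative assumptions, not taken from the referenced works. It returns both the prediction and the associated prediction variance:

    import numpy as np

    def corr(X1, X2, theta):
        """Gaussian correlation: R(x, x') = exp(-sum_k theta_k * (x_k - x'_k)^2)."""
        d2 = (X1[:, None, :] - X2[None, :, :]) ** 2
        return np.exp(-np.sum(theta * d2, axis=-1))

    # Hypothetical 1-D example: a few observations of a cheap stand-in function
    X = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
    y = np.sin(6 * X[:, 0])

    theta = np.array([10.0])      # correlation parameter (arbitrary choice)
    sigma2 = np.var(y)            # crude plug-in estimate of the process variance

    # Correlation matrix of the design points (small nugget for numerical stability)
    R = corr(X, X, theta) + 1e-10 * np.eye(len(X))
    R_inv_y = np.linalg.solve(R, y)

    # Simple kriging (P(x) = 0): predictor and prediction variance at new points
    X_new = np.array([[0.37], [0.81]])
    r = corr(X_new, X, theta)                    # correlations with the design points
    y_pred = r @ R_inv_y
    alpha = np.linalg.solve(R, r.T)              # R^{-1} r for each new point
    var_pred = sigma2 * (1.0 - np.sum(r * alpha.T, axis=1))

    print("prediction:", y_pred)
    print("prediction variance:", var_pred)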
