• Aucun résultat trouvé

Sensitivity analysis of parametric uncertainties and modeling errors in generalized probabilistic modeling

N/A
N/A
Protected

Academic year: 2021

Partager "Sensitivity analysis of parametric uncertainties and modeling errors in generalized probabilistic modeling"

Copied!
47
0
0

Texte intégral

(1)

ECCOMAS European Congress on Computational Methods in Applied Sciences and Engineering

Sensitivity analysis of parametric uncertainties and modeling errors

in generalized probabilistic modeling

Maarten Arnst

(2)

Motivation

blanc

Simulation-based design. Virtual testing.

Parametric uncertainties. Modeling errors.

/.

-,

()

system

*+

77

ww♦♦♦

♦♦

OO



gg

''❖

/.

-,

()

subsystem

*+

oo

///.

()

subsystem

-,

*+

/.

-,

()

subsystem

ww

*+

77♦

blanc Multidisciplinary design. Multiple components. blanc

Sensitivity analysis of parametric uncertainties, modeling errors, and multiple components in the context of generalized probabilistic modeling.

(3)

Outline

■ Motivation.

■ Outline.

■ Sensitivity analysis.

■ Generalized probabilistic modeling.

■ First illustration: parametric uncertainties and modeling errors.

(4)
(5)

Overview

GF

ED

@A

Problem

BC

definition



Sensitivity analysis. Design optimization. Model validation.

GF

ED

@A

Analysis

BC

of uncertainty

33

UQ

GF

@A

Characterizationof uncertainty

ED

BC

uu

Mechanical modeling. Statistics.

GF

ED

@A

Propagation

BC

of uncertainty

TT

Optimization methods. Monte Carlo sampling.

Stochastic expansion (polynomial chaos).

·

Intervals.

Gaussian.

• Γ

distribution.

. . .

blanc blanc

The computational cost of stochastic methods can be lowered

(6)

Overview

■ There exist many types of sensitivity analysis. ■ Local sensitivity analysis:

◆ elementary effect analysis.

◆ differentiation-based sensitivity analysis.

◆ . . .

Global sensitivity analysis:

◆ regression analysis.

◆ variance-based sensitivity analysis,

◆ correlation analysis,

◆ methods involving scatter plots,

◆ . . .

(7)

Problem setting

Characterization of uncertainty:

◆ Two statistically independent sources of uncertainty modeled as two statistically independent random variables

X

and

Y

with probability distributions

P

X and

P

Y :

(8)

Problem setting

Characterization of uncertainty:

◆ Two statistically independent sources of uncertainty modeled as two statistically independent random variables

X

and

Y

with probability distributions

P

X and

P

Y :

(X, Y ) ∼ P

X

× P

Y

.

Propagation of uncertainty:

◆ We assume that the relationship between the sources of uncertainty and the predictions is represented by a nonlinear function

g

:

Sources of uncertainty

(X, Y )

Problem

Z = g(X, Y )

Prediction

Z

(9)

Problem setting

Characterization of uncertainty:

◆ Two statistically independent sources of uncertainty modeled as two statistically independent random variables

X

and

Y

with probability distributions

P

X and

P

Y :

(X, Y ) ∼ P

X

× P

Y

.

Propagation of uncertainty:

◆ We assume that the relationship between the sources of uncertainty and the predictions is represented by a nonlinear function

g

:

Sources of uncertainty

(X, Y )

Problem

Z = g(X, Y )

Prediction

Z

◆ The probability distribution

P

Z of the prediction is obtained as the image of the probability distribution

P

X

× P

Y of the sources of uncertainty under the function

g

:

(10)

Problem setting

Characterization of uncertainty:

◆ Two statistically independent sources of uncertainty modeled as two statistically independent random variables

X

and

Y

with probability distributions

P

X and

P

Y :

(X, Y ) ∼ P

X

× P

Y

.

Propagation of uncertainty:

◆ We assume that the relationship between the sources of uncertainty and the predictions is represented by a nonlinear function

g

:

Sources of uncertainty

(X, Y )

Problem

Z = g(X, Y )

Prediction

Z

◆ The probability distribution

P

Z of the prediction is obtained as the image of the probability distribution

P

X

× P

Y of the sources of uncertainty under the function

g

:

Z ∼ P

Z

= (P

X

× P

Y

) ◦ g

−1

.

Sensitivity analysis:

(11)

Geometrical point of view

Least-squares-best approximation of function

g

with function of only one input:

◆ Assessment of the significance of the source of uncertainty

X

:

g

X

= arg min

f∗ X

Z Z

g(x, y) − f

X

(x)

2

P

X

(dx)P

Y

(dy).

◆ By means of the calculus of variations, it can be readily shown that the solution is given by

g

X

=

Z

g(·, y)P

Y

(dy).

◆ In the geometry of the space of

P

X

× P

Y -square-integrable functions,

g

X∗ is the orthogonal projection of function

g

of

x

and

y

onto the subspace of functions of only

x

:

E{Z|X}

Z

(12)

Geometrical point of view

Expansion of function

g

in terms of main effects and interaction effects:

◆ Extension to assessment of significance of both sources of uncertainty

X

and

Y

:

g(x, y) = g

0

+

g

X

(x)

| {z }

main effect ofX

+

g

Y

(y)

| {z }

main effect ofY

+

g

(X,Y )

(x, y)

|

{z

}

interaction effect ofX andY

,

where

g

0

=

Z Z

g(x, y)P

X

(dx)P

Y

(dy),

g

X

(x) = g

X

(x) − g

0

=

Z

g(x, y)P

Y

(dy) − g

0

,

g

Y

(y) = g

Y∗

(y) − g

0

=

Z

g(x, y)P

X

(dx) − g

0

.

◆ Because they are obtained via orthogonal projection, the functions

g

0,

g

X,

g

Y , and

g

(X,Y ) are orthogonal functions.

◆ The property that

g

0,

g

X,

g

Y , and

g

(X,Y ) are orthogonal provides a link with other expansions, such as the polynomial chaos expansion.

(13)

Statistical point of view

Sensitivity indices = mean-square values of main effects and interaction effects:

◆ Quantitative insight into the significance of

X

and

Y

in inducing uncertainty in

Z

:

Z Z

g(x, y) − g

0

|

2

P

X

(dx)P

Y

(dy)

|

{z

}

=σ2 Z

=

Z

g

X

(x)

2

P

X

(dx)

|

{z

}

=sX

+

Z

g

Y

(y)

2

P

Y

(dy)

|

{z

}

=sY

+

Z Z

g

(X,Y )

(x, y)

2

P

X

(dx)P

Y

(dy)

|

{z

}

=s(X,Y )

.

◆ Because

g

X,

g

Y , and

g

(X,Y ) are orthogonal, there are no double product terms.

◆ Thus, the expansion of

g

(geometry) reflects a partitioning of the variance of

Z

into terms that are the variances of the main and interaction effects of

X

and

Y

(statistics), where:

s

X = portion of the variance of

Z

that is explained as stemming from

X

,

(14)

Statistical point of view

■ By the conditional variance identity, we have

s

X

= V {E{Z|X}} = V {Z} − E{V {Z|X}},

s

Y

= V {E{Z|Y }} = V {Z} − E{V {Z|Y }},

so that

s

X and

s

Y may also be interpreted as expected reductions of amount of uncertainty:

s

X = expected reduction of variance of

Z

if there were no longer uncertainty in

X

,

s

Y = expected reduction of variance of

Z

if there were no longer uncertainty in

Y

.

In contrast to the expansion of

g

and the variance partitioning of

Z

, these expressions and these interpretations of

s

X and

s

Y remain valid even if

X

and

Y

are statistically dependent.

(15)

Example

■ Let us consider a simple problem wherein

X

and

Y

are uniform r.v. with values in

[−1, 1]

,

X ∼ U ([−1, 1]),

Y ∼ U ([−1, 1]),

and the function

g

is given by

(16)

Example

■ Let us consider a simple problem wherein

X

and

Y

are uniform r.v. with values in

[−1, 1]

,

X ∼ U ([−1, 1]),

Y ∼ U ([−1, 1]),

and the function

g

is given by

z = g(x, y) = x + y

2

+ xy.

■ This problem has the expansion

g(x, y) = g

0

+ g

X

(x) + g

Y

(y) + g

(X,Y )

(x, y),

g(x, y) = g0 + gX(x) + gY (y) + g(X,Y )(x, y)

(17)

Example

■ Let us consider a simple problem wherein

X

and

Y

are uniform r.v. with values in

[−1, 1]

,

X ∼ U ([−1, 1]),

Y ∼ U ([−1, 1]),

and the function

g

is given by

z = g(x, y) = x + y

2

+ xy.

■ This problem has the expansion

g(x, y) = g

0

+ g

X

(x) + g

Y

(y) + g

(X,Y )

(x, y),

g(x, y) = g0 + gX(x) + gY (y) + g(X,Y )(x, y)

■ To this expansion corresponds the variance partitioning

σ

Z2

= s

X

+ s

Y

+ s

(X,Y )

,

(18)

Computation

■ Computation by means of a stochastic expansion method:

s

X

X

α6=0

c

2(α,0)

,

s

Y

X

β6=0

c

2(0,β)

,

with

g(x, y) =

X

(α,β)

c

(α,β)

ϕ

α

(x)ψ

β

(y).

■ Computation by means of deterministic numerical integration:

s

X

≈ Q

X

|Q

Y

g − Q

X

Q

Y

g|

2



,

s

Y

≈ Q

Y

|Q

X

g − Q

X

Q

Y

g|

2



.

■ Computation by means of Monte Carlo integration:

s

X

1

ν

ν

X

ℓ=1

g(x

, y

) −

1

ν

ν

X

k=1

g(x

k

, y

k

)

!

g(x

, ˜

y

) −

1

ν

ν

X

k=1

g(x

k

, ˜

y

k

)

!

,

s

Y

1

ν

ν

X

ℓ=1

g(x

, y

) −

1

ν

ν

X

k=1

g(x

k

, y

k

)

!

g(˜

x

, y

) −

1

ν

ν

X

k=1

g(˜

x

k

, y

k

)

!

.

(19)

Outlook

Observation:

Most applications in the literature involve scalar-valued sources of uncertainty.

Opportunity:

The concepts and methods of global sensitivity analysis are valid and useful more broadly

(20)
(21)

Overview

manufactured Designed system manufactured Manufacturing tolerances

yytt

tt

tt

tt

tt

tt

tt

Modeling approximations

$$❏

Force

//

manufactured Manufactured system manufactured

//

Response

f

//

manufactured Predictive model manufactured

// u

Response Uncertain system Response

X

OO

Response

(22)

Overview

■ Types of probabilistic approach:

Parametric approaches capture parametric uncertainties by characterizing geometrical characteristics, boundary conditions, loadings, and physical or mechanical properties as random variables or stochastic processes.

Nonparametric approaches capture modeling errors (and possibly the impact of parametric uncertainties) by directly characterizing the model as a random model without recourse to a characterization of its parameters as random variables or stochastic processes.

In structural dynamics, Soize constructed a class of nonparametric models by characterizing the reduced matrices of (a sequence of) reduced-order models as random matrices.

Output-prediction-error approaches capture modeling errors (and possibly the impact of parametric uncertainties) by adding random noise terms to quantities of interest.

(23)

Overview

Structural vibration.

[Cottereau et al. Earthquake Engng. Struct. Dyn., 2008].

Acoustics.

[Soize et al. Comput. Methods Appl. Mech. Engrg., 2008].

(24)

Deterministic model

FE model of linear dynamical behavior of dissipative structure:

[M (x)]¨

u

(t) + [D(x)] ˙u(t) + [K(x)]u(t) = f (t, x),

where

x

collects the parameters of the FE model, which may consist of material properties, loadings, geometrical characteristics, and so forth.

u

= (u

1

, . . . , u

m

)

is the (generalized) displacement vector,

f

the (generalized) external forces vector,

(25)

Parametric probabilistic model

Parametric probabilistic approach = probabilistic representation of uncertain parameters:

[M (X)] ¨

U

(t) + [D(X)] ˙

U

(t) + [K(X)]U (t) = f (t, X),

where

X

is the probabilistic representation of the uncertain parameters, which may consist of random variables, stochastic processes, and so forth.

In order to obtain the probabilistic representation of the uncertain parameters, a suitable probability distribution must be assigned to the random variables or stochastic processes.

In stochastic mechanics, methods are available for deducing a suitable probability distribution from available information, such as methods dedicated to tensor-valued fields of material properties,

(26)

Generalized probabilistic model

Generalized probabilistic approach = enhancement of parametric probabilistic model by introducing in it a probabilistic representation of modeling errors.

Step 1: Associate with the parametric probabilistic model a reduced-order probabilistic model:

[M

n

(X)] ¨

Q(t) + [D

n

(X)] ˙

Q(t) + [K

n

(X)]Q(t) = f

n

(t, X),

U

n

(t) = [Φ(X)]Q(t),

where

[M

n

(X)]

,

[D

n

(X)]

, and

[K

n

(X)]

are the reduced mass, damping, and stiffness matrices,

and

[Φ(X)]

the matrix collecting in its columns the reduction basis

ϕ

1

(X)

,

ϕ

2

(X)

,

. . .

,

ϕ

n

(X)

. Such a reduced-order probabilistic model can be obtained, for instance, by solving the eigenvalue problem associated with the mass and stiffness matrices of the parametric probabilistic model,

[K(X)]ϕ

j

(X) = λ

j

(X)[M (X)]ϕ

j

(X);

(27)

Generalized probabilistic model

Step 2: represent the reduced matrices by using random matrices:

[M

n

(X)] ¨

Q

(t) + [D

n

(X)] ˙

Q

(t) + [K

n

(X)]Q(t) = f

n

(t, X),

U

n

(t) = [Φ(X)]Q(t),

To accommodate in the reduced matrices a probabilistic representation of the modeling errors, the generalized probabilistic approach entails representing these reduced matrices as follows:

[M

n

(X)] = [L

M

(X)][Y

M

][L

M

(X)]

T

,

[D

n

(X)] = [L

D

(X)][Y

D

][L

D

(X)]

T

,

[K

n

(X)] = [L

K

(X)][Y

K

][L

K

(X)]

T

,

(28)

Generalized probabilistic model

■ To assign a suitable probability distribution to the random matrices

[Y

M

]

,

[Y

D

]

, and

[Y

K

]

, the generalized probabilistic approach uses the maximum entropy principle.

The probability distribution thus obtained is such that the mean values of

[Y

M

]

,

[Y

D

]

, and

[Y

K

]

are all equal to the identity matrix, that is,

E{[Y

M

]} = [I

n

],

E{[Y

D

]} = [I

n

],

E{[Y

K

]} = [I

n

],

and the amount of uncertainty expressed in

[Y

M

]

,

[Y

D

]

, and

[Y

K

]

is tunable by free dispersion parameters

δ

M,

δ

D, and

δ

K, respectively, defined by

δ

M

=

q

E{k[Y

M

] − [I

n

]k

2F

}/k[I

n

]k

2F

,

δ

D

=

q

E{k[Y

D

] − [I

n

]k

2F

}/k[I

n

]k

2F

,

δ

K

=

q

E{k[Y

K

] − [I

n

]k

2F

}/k[I

n

]k

2F

.

(29)
(30)

Overview

manufactured Designed system Clamped-clamped beam manufactured Manufacturing tolerances

}}④④

④④

④④

④④

④④

Modeling approximations

!!❈

Manufactured system Predictive model

manufactured

Uncertain material properties Random field of material properties

(31)

Deterministic model

[H(ω; x)] = [−ω

2

M (x) + iωD(ω; x) + K(x)]

−1

.

0

200

400

600

800

1,000

1,200

10

−13

10

−11

10

−9

10

−7 Frequency [Hz] F R F [m /N /H z ]

δ

= δ

= δ

= 0.05

(32)

Parametric probabilistic model

[H(ω; X)] = [−ω

2

M (X) + iωD(ω; X) + K(X)]

−1

,

where

X

= random field representation of shear and bulk moduli

.

0

200

400

600

800

1,000

1,200

10

−13

10

−11

10

−9

10

−7 Frequency [Hz] F R F [m /N /H z ]

(33)

Parametric probabilistic model

[H(ω; X)] = [−ω

2

M (X) + iωD(ω; X) + K(X)]

−1

,

where

X

= random field representation of shear and bulk moduli

.

0

200

400

600

800

1,000

1,200

10

−13

10

−11

10

−9

10

−7 Frequency [Hz] F R F [m /N /H z ]

δ

= δ

= δ

= 0.05

(34)

Nonparametric probabilistic model

[H(ω; X, Y )] = [Φ(X)][−ω

2

M

n

(X) + iωD

n

(ω; X) + K

n

(X)]

−1

[Φ(X)]

T

,

where

Y

= ([Y

M

], [Y

D

], [Y

K

])

with

[M

n

(X)] = [L

M

(X)][Y

M

][L

M

(X)]

T

,

[D

n

(ω; X)] = [L

D

(ω; X)][Y

D

][L

D

(ω; X)]

T

,

[K

n

(X)] = [L

K

(X)][Y

K

][L

K

(X)]

T

.

0

200

400

600

800

1,000

1,200

10

−13

10

−11

10

−9

10

−7 Frequency [Hz] F R F [m /N /H z ]

(35)

Nonparametric probabilistic model

[H(ω; X, Y )] = [Φ(X)][−ω

2

M

n

(X) + iωD

n

(ω; X) + K

n

(X)]

−1

[Φ(X)]

T

,

where

Y

= ([Y

M

], [Y

D

], [Y

K

])

with

[M

n

(X)] = [L

M

(X)][Y

M

][L

M

(X)]

T

,

[D

n

(ω; X)] = [L

D

(ω; X)][Y

D

][L

D

(ω; X)]

T

,

[K

n

(X)] = [L

K

(X)][Y

K

][L

K

(X)]

T

.

0

200

400

600

800

1,000

1,200

10

−13

10

−11

10

−9

10

−7 Frequency [Hz] F R F [m /N /H z ]

δ

= δ

= δ

= 0.10

.

(36)

Nonparametric probabilistic model

[H(ω; X, Y )] = [Φ(X)][−ω

2

M

n

(X) + iωD

n

(ω; X) + K

n

(X)]

−1

[Φ(X)]

T

,

where

Y

= ([Y

M

], [Y

D

], [Y

K

])

with

[M

n

(X)] = [L

M

(X)][Y

M

][L

M

(X)]

T

,

[D

n

(ω; X)] = [L

D

(ω; X)][Y

D

][L

D

(ω; X)]

T

,

[K

n

(X)] = [L

K

(X)][Y

K

][L

K

(X)]

T

.

0

200

400

600

800

1,000

1,200

10

−13

10

−11

10

−9

10

−7 Frequency [Hz] F R F [m /N /H z ]

(37)

Identification

The dispersion parameters must be calibrated such that the amount of uncertainty expressed in

[Y

M

]

,

[Y

D

]

, and

[Y

K

]

reflects the significance of the modeling errors.

0

200

400

600

800

1,000

1,200

10

−13

10

−11

10

−9

10

−7 Frequency [Hz] F R F [m /N /H z ]

(38)

Convergence

Because the problem is of high dimension, we compute the sensitivity indices by using Monte Carlo integration, whereby we assess the convergence as a function of the number of samples.

0

0.25

0.5

0.75

1

·10

4

−0.5

−0.25

0

0.25

0.5

Number of samples [-]

s

ν Y

)

[−

]

(39)

Sensitivity analysis

V

{log |H

jj

(ω; X, Y )|} = s

X

(ω) + s

Y

(ω) + s

(X,Y )

(ω),

s

X

(ω) =

VarX



EY

{log |H

jj

(ω; X, Y )|}

,

s

Y

(ω) =

VarY



EX

{log |H

jj

(ω; X, Y )|}

.

0

400

800

1,200

0

0.5

1

1.5

Frequency [Hz]

σ

2 ,s Z X

[−

]

0

400

800

1,200

0

0.5

1

1.5

Frequency [Hz]

σ

2 ,s Z Y

[−

]

(40)
(41)

Overview

Stiffened panel with a hole.

(42)

Deterministic model

(43)

Nonparametric probabilistic model

After a component mode synthesis, we used the nonparametric probabilistic approach to introduce uncertainties in the submodels of the main panel and the stiffeners.

100

150

200

250

300

350

0

0.1

0.2

0.3

0.4

0.5

0.6

Frequency [Hz] P D F [1 /H z ]

(44)

Sensitivity analysis

Main panel Stiffeners

0

0.25

S e n s it iv it y in d e x [H z 2 ]

Main panel Stiffeners

0

2

4

S e n s it iv it y in d e x [H z 2 ]

(45)

Conclusion and acknowledgement

■ Global sensitivity analysis methods can help ascertain which sources of uncertainty are most significant in inducing uncertainty in predictions.

■ Although most applications in the literature involve scalar-valued sources of uncertainty, the

concepts and methods of global sensitivity analysis are valid and useful more broadly for stochastic process, random fields, random matrices, and other sources of uncertainty.

■ Generalized probabilistic modeling approaches are hybrid couplings of parametric modeling approaches (to capture parametric uncertainties) and nonparametric probabilistic modeling approaches (to capture modeling errors).

■ We discussed global sensitivity analysis of generalized probabilistic models and demonstrated its application in two illustrations from structural dynamics.

(46)

Conclusion and acknowledgement

■ This presentation can be downloaded from our institutional repository:

http://orbi.ulg.ac.be.

■ Other references:

◆ M. Arnst and J.-P. Ponthot. An overview of nonintrusive characterization, propagation, and sensitivity analysis of uncertainties in computational mechanics. International Journal for Uncertainty Quantification, 4:387–421, 2014.

◆ M. Arnst, B. Abello Alvarez, J.-P. Ponthot, and R. Boman. Itô-SDE-based MCMC method for Bayesian characterization and propagation of errors associated with data limitations.

SIAM/ASA Journal on Uncertainty Quantification, Submitted, 2016.

◆ M. Arnst and K. Goyal. Sensitivity analysis of parametric uncertainties and modeling errors in generalized probabilistic modeling. Probabilistic Engineering Mechanics, Submitted, 2016.

(47)

References

■ A. Saltelli et al. Global sensitivity analysis. Wiley, 2008.

■ J. Oakley an A. O’Hagan. Probabilistic sensitivity analysis of complex models: a Bayesian approach. Journal of the Royal Statistical Society: Series B, 66:751–769, 2004.

■ B. Sudret. Global sensitivity analysis using polynomial chaos expansions. Reliability Engineering & System Safety, 93:964-979, 2008.

■ T. Crestaux, O. Le Maître, and J.-M. Martinez. Polynomial chaos expansion for sensitivity analysis. Reliability Engineering & System Safety, 94:1161–1172, 2009.

■ I. Sobol. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation, 55:271–280, 2001.

■ A. Owen. Better estimation of small Sobol’ sensitivity indices, ACM Transactions on Modeling and Computer Simulation, 23, 2013.

■ C. Soize. Nonparametric model of random uncertainties for reduced matrix models in structural dynamics, Probabilistic Engineering Mechanics, 15:277–294, 2000.

■ C. Soize. Generalized probabilistic approach of uncertainties in computational dynamics using random matrices and polynomial chaos decompositions. International Journal for Numerical Methods in Engineering, 81:939–970, 2010.

■ R. Cottereau, D. Clouteau, and C. Soize. Probabilistic impedance of foundation: Impact of the seismic design on uncertain soils. Earthquake Engineering and Structural Dynamics, 37:899–918, 2008.

■ C. Soize et al. Probabilistic model identification of uncertainties in computational models for dynamical systems and experimental validation. Computer Methods in Applied Mechanics and Engineering, 198:150–163, 2008.

■ E. Capiez-Lernout, C. Soize, and M. Mignolet. Post-buckling nonlinear static and dynamical analyses of uncertain cylindrical shells and experimental validation. Computer Methods in Applied Mechanics and Engineering, 271:210–230, 2014.

Références

Documents relatifs

This nonparametric model of random uncertainties does not require identifying the uncertain local parameters in the reduced matrix model of each substructure as described above for

Modeling physical systems for control refers to many expert disciplines linked to each physical domain (e.g. mechanics, rheology, electrical engineering …). A

The parametric-nonparametric (generalized) probabilistic approach of uncertainties is used to perform the prior stochastic models: (1) system-parameters uncertainties induced by

It is based on the construction of a Stochastic, Projection-based, Reduced-Order Model (SPROM) associated with a High-Dimensional Model (HDM) using three innovative ideas:

Experimental probability density function of the fundamental frequency related to experimental voice signals produced by one person (top) compared with the probability density

The paper is devoted to model uncertainties (or model form uncertainties) induced by modeling errors in computational sciences and engineering (such as in computational struc-

Due to the model uncertainties (for instance, errors induced by the introduction of simplified kinematical assumptions) and due to the data uncertainties related to the local

While results suggest that OCR errors have small impact on performance of supervised NLP tasks, the errors should be considered thoroughly for the case of unsupervised topic modeling