
Reduced minimax state estimation


HAL Id: inria-00550729

https://hal.inria.fr/inria-00550729

Submitted on 29 Dec 2010

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Vivien Mallet, Sergiy Zhuk

To cite this version:

Vivien Mallet, Sergiy Zhuk. Reduced minimax state estimation. [Research Report] RR-7500, INRIA. 2010, pp.23. ⟨inria-00550729⟩

ISSN 0249-6399, ISRN INRIA/RR--7500--FR+ENG

Observation and Modeling for Environmental Sciences

No. 7500, version 1.0

Initial version December 2010, revised version December 2010

Centre de recherche INRIA Paris – Rocquencourt

Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex

Vivien Mallet, Sergiy Zhuk

Theme: Observation and Modeling for Environmental Sciences. Project-team Clime

Research report no. 7500, version 1.0, initial version December 2010, revised version December 2010, 23 pages

Abstract: A reduced minimax state estimation approach is proposed for high-dimensional models. It is based on the reduction of an ordinary differential equation with a high-dimensional state space to a low-dimensional Differential-Algebraic Equation (DAE), and on the subsequent application of minimax state estimation to the resulting DAE. The DAE is composed of a reduced state equation and of a linear algebraic constraint. The latter makes it possible to bound linear combinations of the reduced state's components in order to prevent possible instabilities originating from the model reduction. The method is robust as it can handle model and observational errors of any shape, provided they are bounded. We derive a minimax algorithm adapted to computations in high dimension. It computes both the state estimate and the reachability set in the reduced space.

Key-words: minimax, reduction, differential-algebraic equations, estimation, filtering

INRIA

CEREA, joint laboratory École des Ponts ParisTech - EDF R&D, Université Paris-Est

Taras Shevchenko National University of Kyiv

Résumé (translated from French): We introduce a filtering method dedicated to high-dimensional models and based on a reduced minimax approach. The method relies on a reformulation of the high-dimensional problem as a low-dimensional differential-algebraic equation, to which a minimax filter is applied. The differential-algebraic equation consists of an equation on a reduced state and of a linear algebraic constraint. The latter makes it possible to bound linear combinations of the components of the reduced state vector, which eliminates instabilities potentially induced by the reduction. The method is robust in the sense that it can handle any model error and any observation error, provided that they are bounded. We propose an algorithmic form that makes it possible to apply the filter to high-dimensional models. The algorithm computes the minimax estimator as well as the set of admissible states.

Contents

1 Introduction
2 Notation
3 Extended minimax state estimation
    3.1 Minimax filter for Ordinary Differential Equations with discrete time
    3.2 Minimax filter for Differential-Algebraic Equations with discrete time
    3.3 The case of the non-singular gain
4 Model reduction
    4.1 Classical reduction
        4.1.1 Instability of the reachability set
    4.2 Generalized reduction by means of DAE
        4.2.1 Linear case
    4.3 Extended minimax state estimation for DAE
        4.3.1 Notes about the reduction
5 Algorithm and computations
    5.1 Derivation of the computational form for the gain $G_t$
        5.1.1 First form compatible with computations in high dimension
        5.1.2 Second form compatible with computations in high dimension, for the gain
    5.2 Algorithm
6 Consistency with Kalman filter
A Sherman-Morrison-Woodbury formula

1 Introduction

Numerical modeling of complex systems such as the Earth's atmosphere involves complex numerical models relying on systems of coupled Partial Differential Equations (PDEs). As an example, consider chemistry-transport models that describe the fate of the pollutants in the atmosphere (e.g., the models described in [Mallet et al., 2007]). For these models, the dimension of the state vector (footnote 1) can reach $10^7$ or even more, and the time integration has such a large computational cost that only the equivalent of a few dozen model calls may be affordable. The computational costs of these models and their dimensions raise specific issues when one wants to reduce simulation errors (caused by imperfect model formulation or uncertain inputs) through assimilation of the observed data (sparse observations of the model's state) into the model. Classical assimilation algorithms such as the Kalman filters [Balakrishnan, 1984] can be so demanding in terms of computations that they cannot be applied to these models without a reduction.

Reduced Kalman filters have been developed to address this issue by introducing a reduction in the filtering algorithm; see [Wu et al., 2008] for an application to the aforementioned chemistry-transport models. In these filters, the key reduction lies in the propagation of the state error covariance matrix, which is intractable (footnote 2) in the Kalman filter. A popular reduced Kalman filter is the so-called ensemble Kalman filter, in which the state error covariance matrix is approximated by the empirical covariance of the ensemble [Heemink et al., 2001]. The particles can be deterministically sampled as in the SEIK versions [Pham, 2001], in the unscented Kalman filter [Julier and Uhlmann, 1997] or in its reduced version [Moireau and Chapelle, 2010]. Another example is the reduced-rank square-root Kalman filter, based on the propagation of the most important modes of the error covariance matrix [Verlaan and Heemink, 1995].

Another direction is the reduction of the model itself and the subsequent application of an appropriate filtering technique to the resulting low-dimensional model. The Galerkin projection is one of the most widely used techniques for model reduction [Brenner and Scott, 2005]. The idea is to find a low-dimensional subspace in the model state space and to restrict the model onto that subspace. Of course, there is a loss of information due to restricting the dynamics of the model onto the subspace. One way to minimize the loss is to generate the subspace by means of the Proper Orthogonal Decomposition (POD) [Homescu et al., 2005].

In this report, we introduce a reduced minimax filter, designed to be applied to high-dimensional models. The minimax approach makes it possible (1) to filter out any model error with bounded energy and any observational error that is either deterministic with bounded energy or stochastic with bounded variance, (2) to estimate the worst-case error and (3) to assess how accurate the link between the model and the observed phenomena is. Our approach is to reduce the model itself and to apply the minimax filtering to the reduced model, provided the uncertain model error and observation noise are elements of a given bounding set. We introduce a reduced state equation by projecting the full state vector onto a subspace which can be generated, e.g., by means of POD. The projection introduces errors that can lead to a reduced state equation with unstable dynamics. In order to address this issue, we introduce an additional energy constraint on the reduced state in the form of a linear algebraic equation. Finally, our reduced model is represented by a Differential-Algebraic Equation (DAE), composed of a reduced state equation and of a linear constraint. We apply an extended version of the minimax filter for DAE [Zhuk, 2010] to the reduced model without further reduction on the filter.

Footnote 1: State vector of the PDE after discretization in space.

Footnote 2: Since the propagation involves twice as many calls to the tangent linear model as there are components in the state.

The report is organized as follows. After the notation is introduced in section 2, the extended minimax filter, without reduction, is presented in section 3. This section quickly explains the minimax framework, introduces the filter and comments on the intractability of the computations. The reduction procedure is then derived in section 4. The classical Galerkin projection is discussed first. The DAE approach is then introduced, in the linear case and in an extended version for the non-linear case. This section also comments on the generation of the projection operator. In section 5, we derive a computational form for the DAE minimax filter presented in section 3. The purpose of the derivations is to provide a form compatible with computations in high dimension. A parallel with the Kalman filter can be found in section 6, where it is shown that the minimax filter coincides with the Kalman filter under given assumptions (in particular, without reduction).

2 Notation

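Let $\mathcal{M}_t : \mathbb{R}^N \to \mathbb{R}^N$ define the model at some time step $t \in \{0, \dots, T-1\}$:

$$x_{t+1} = \mathcal{M}_t(x_t) + e_t, \qquad x_0 = x_0^g + e, \tag{1}$$

where $x_0^g$ is an approximation of the initial condition with error $e \in \mathbb{R}^N$, $x_t \in \mathbb{R}^N$ denotes the state vector, and $e_t \in \mathbb{R}^N$ is the model error.

Let $y_t \in \mathbb{R}^m$ denote the observation of the true state $x_t$ at time $t$. We assume that $y_t$ satisfies

$$y_t = \mathcal{H}_t(x_t) + \eta_t, \tag{2}$$

where $\mathcal{H}_t : \mathbb{R}^N \to \mathbb{R}^m$ is the observation operator mapping the state space into the observation space, and $\eta_t \in \mathbb{R}^m$ is the observation error.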

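We assume that the error $(e, e_t, \eta_t)$ is uncertain but bounded so that

$$\langle Q^{-1}(e - \bar{e}), e - \bar{e} \rangle + \sum_{t=0}^{T-1} \langle Q_t^{-1}(e_t - \bar{e}_t), e_t - \bar{e}_t \rangle + \sum_{t=0}^{T} \langle R_t^{-1}(\eta_t - \bar{\eta}_t), \eta_t - \bar{\eta}_t \rangle \le 1, \tag{3}$$

where $Q, Q_t \in \mathbb{R}^{N \times N}$ and $R_t \in \mathbb{R}^{m \times m}$ are symmetric positive-definite matrices, and $\bar{e}, \bar{e}_t \in \mathbb{R}^N$ and $\bar{\eta}_t \in \mathbb{R}^m$ may be viewed as systematic errors.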
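As an illustration of the bounding set (3), the following minimal Python sketch evaluates the left-hand side of (3) in the scalar case ($N = m = 1$) and checks whether a given error sequence is admissible. The function name `error_energy` and all numerical values are hypothetical and chosen only for illustration; the systematic errors are assumed to be zero unless provided.

```python
def error_energy(e, model_errors, obs_errors, q, q_t, r_t,
                 e_bar=0.0, e_t_bar=None, eta_t_bar=None):
    """Left-hand side of the bound (3) for scalar errors (N = m = 1).

    e            : initial-condition error
    model_errors : e_0, ..., e_{T-1}
    obs_errors   : eta_0, ..., eta_T
    q, q_t, r_t  : positive weights playing the role of Q, Q_t, R_t
    """
    # Systematic errors default to zero (an assumption of this sketch).
    e_t_bar = e_t_bar or [0.0] * len(model_errors)
    eta_t_bar = eta_t_bar or [0.0] * len(obs_errors)

    total = (e - e_bar) ** 2 / q
    total += sum((et - eb) ** 2 / qt
                 for et, eb, qt in zip(model_errors, e_t_bar, q_t))
    total += sum((eta - hb) ** 2 / rt
                 for eta, hb, rt in zip(obs_errors, eta_t_bar, r_t))
    return total


# Hypothetical example with T = 2: two model errors, three observation errors.
energy = error_energy(e=0.5, model_errors=[0.2, -0.1],
                      obs_errors=[0.1, 0.0, -0.3],
                      q=1.0, q_t=[0.5, 0.5], r_t=[1.0, 1.0, 1.0])
admissible = energy <= 1.0
```

With these values the three contributions are 0.25, 0.1 and 0.1, so the sequence lies inside the bounding set. Smaller weights `q`, `q_t` or `r_t` shrink the set of admissible errors.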

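The tangent linear model is $\mathbf{M}_t = D\mathcal{M}_t(x_t) \in \mathbb{R}^{N \times N}$. Consistently, we introduce the associated tangent linear operator $\mathbf{H}_t = D\mathcal{H}_t(x_t) \in \mathbb{R}^{m \times N}$.

The reduction applies to the model state, and the reduced model state is denoted $z_t = F_t^T x_t \in \mathbb{R}^n$, with $n \le N$. $F_t \in \mathbb{R}^{N \times n}$ is a reduction matrix. The minimax estimator of $x_t$ is denoted $\hat{x}_t \in \mathbb{R}^N$. The minimax estimator is derived from the reduced minimax estimator $\hat{z}_t$ with $\hat{x}_t = F_t \hat{z}_t$.

The tangent linear operators along the trajectory $\hat{x}_t$ are denoted $M_t = D\mathcal{M}_t(\hat{x}_t) F_t \in \mathbb{R}^{N \times n}$ (for $t \ge 0$) and $H_t = D\mathcal{H}_t(\mathcal{M}_{t-1}(\hat{x}_{t-1}) + \bar{e}_{t-1}) F_t \in \mathbb{R}^{m \times n}$ (for $t > 0$), for the model and the observation operator respectively. We also define $H_0 = D\mathcal{H}_0(x_0^g + \bar{e}) F_0 \in \mathbb{R}^{m \times n}$.

We denote by $A^+$ the Moore-Penrose pseudoinverse of a matrix $A$. $A^{1/2}$ is the square root of $A$ and satisfies $A = A^{1/2} A^{T/2}$, where $A^{T/2}$ is the transpose of $A^{1/2}$. $I_{k \times k}$ denotes the identity matrix in $\mathbb{R}^{k \times k}$. $\langle \cdot, \cdot \rangle$ denotes the canonical inner product of the Euclidean space.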

3 Extended minimax state estimation

3.1 Minimax filter for Ordinary Differential Equations with discrete time

In what follows we present a minimax state estimation algorithm that solves the following filtering problem: given a sequence of observed data $y_0, \dots, y_T$ in the form (2) and given the uncertainty description (3), one should estimate the state $x_T$ of (1).

Our approach is based on the following idea: to describe how the model propagates uncertain parameters verifying (3). The key point is to construct the so-called reachability set $\mathcal{R}_t$ at time $t$, that is, the set of all states $x_t$ satisfying (1) and compatible with the description of uncertain parameters (3) and the observed data $y_t$ in the form (2). In other words, the state $x_t$ belongs to $\mathcal{R}_t$ if and only if there is a sequence $E := (e, \{e_0, \dots, e_{t-1}\}, \{\eta_0, \dots, \eta_t\})$ verifying (3) such that the sequence $x_0, \dots, x_t$ computed from (1) with this initial-condition error $e$ and these model errors $e_s$, $0 \le s < t$, is compatible with the observed data $y_0, \dots, y_t$ through (2) with these observation errors $\eta_s$, $0 \le s \le t$. This suggests a way to estimate how the model propagates uncertain parameters (initial-condition error $e$ and model error $e_t$): it is sufficient to have a description of the dynamics of $\mathcal{R}_t$ in time. The true state can only lie in $\mathcal{R}_t$. Note that the dynamics of $\mathcal{R}_t$ takes into account only those realizations of $e$, $e_t$ which are compatible with the actual realization of observed data $y_0, \dots, y_t$. Consequently, if $\mathcal{R}_t$ is empty, one can conclude that the errors were wrongly described by (3).

Any point of $\mathcal{R}_t$ can be the true state. In order to obtain a minimax estimate of this true state, we assign to a point $x \in \mathcal{R}_t$ a worst-case error, that is, the maximal distance between $x$ and other points of $\mathcal{R}_t$. The point with minimal worst-case error (footnote 3) is the minimax estimate $\hat{x}_t$. Roughly speaking, the worst-case error can be thought of as "the longest axis" of the minimal ellipsoid containing $\mathcal{R}_t$, and the minimax estimate $\hat{x}_t$ is the central point of that ellipsoid.
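The notion of worst-case error can be illustrated in one dimension. The sketch below is not the filter itself: it assumes a hypothetical reachability set given as a finite sample of scalar states and looks for the point with minimal worst-case error, which is the midpoint of the smallest interval containing the set. Function names and values are illustrative.

```python
def worst_case_error(candidate, reachable_states):
    """Maximal distance between a candidate estimate and the points of the set."""
    return max(abs(candidate - x) for x in reachable_states)


def minimax_estimate(reachable_states):
    """Center of the smallest interval containing the set (1-D Chebyshev center)."""
    return 0.5 * (min(reachable_states) + max(reachable_states))


# Hypothetical 1-D reachability set, sampled at a few points.
states = [0.0, 1.0, 2.0, 4.0]
x_hat = minimax_estimate(states)
error = worst_case_error(x_hat, states)

# Brute-force check over the sampled points: none has a smaller worst-case error.
best_sampled = min(states, key=lambda x: worst_case_error(x, states))
```

Here the midpoint of the interval $[0, 4]$ is selected, and its worst-case error is half the length of the interval. In higher dimension the same idea leads to the center of an enclosing ellipsoid, as described in the text.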

The basics of the minimax state estimation were developed by [Bertsekas and Rhodes, 1971], [Milanese and Tempo, 1985], [Chernousko, 1994], [Kurzhanski and Vályi, 1997], [Nakonechny, 2004]. The main advantages of minimax estimates are as follows: (1) the possibility to filter out any model error and observation noise with bounded energy, (2) the estimation of the worst-case error, (3) fast estimation algorithms in the form of filters, (4) the possibility to evaluate the model, that is, to assess how well the model describes observed phenomena.

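In this subsection, we assume that $F_t = I_{N \times N}$. In this case, there is no reduction and the state estimation algorithm operates on the full model. Following [Zhuk, 2010], we introduce an extended version of the linear minimax state estimate $\hat{x}_t$, a minimax gain $G_t$ and the reachability set $\mathcal{R}_t$ for (1):

$$
\begin{aligned}
G_0 &= Q^{-1} + H_0^T R_0^{-1} H_0, \\
\hat{x}_0 &= G_0^{-1} \left( Q^{-1} \bar{e} + H_0^T R_0^{-1} (y_0 - \bar{\eta}_0) \right), \\
\beta_0 &= \langle R_0^{-1} (y_0 - \bar{\eta}_0), y_0 - \bar{\eta}_0 \rangle, \\
G_{t+1} &= Q_t^{-1} - Q_t^{-1} M_t B_t M_t^T Q_t^{-1} + H_{t+1}^T R_{t+1}^{-1} H_{t+1}, \\
B_t &= (G_t + M_t^T Q_t^{-1} M_t)^{-1}, \\
x^f_{t+1} &= \mathcal{M}_t(\hat{x}_t), \\
\hat{x}_{t+1} &= x^f_{t+1} + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} \left[ y_{t+1} - \bar{\eta}_{t+1} - H_{t+1} x^f_{t+1} \right] + G_{t+1}^{-1} (Q_t + M_t G_t^{-1} M_t^T)^{-1} \bar{e}_t, \\
X_{t+1} &= \{ w : \langle G_{t+1} w, w \rangle \le 1 \}, \\
\beta_{t+1} &= \beta_t + \langle R_{t+1}^{-1} (y_{t+1} - \bar{\eta}_{t+1}), y_{t+1} - \bar{\eta}_{t+1} \rangle - \langle B_t G_t^{-1} \hat{x}_t, G_t^{-1} \hat{x}_t \rangle + \langle M_t^T Q_t M_t \bar{e}_t, \bar{e}_t \rangle, \\
\mathcal{R}_t &= \hat{x}_t + \sqrt{1 - \beta_t + \langle G_t \hat{x}_t, \hat{x}_t \rangle} \, X_t.
\end{aligned}
\tag{4}
$$

Footnote 3: This point coincides with the Chebyshev center of the smallest ellipsoid containing the reachability set.

Here $\mathcal{R}_t$ denotes an ellipsoidal approximation of the reachability set for the model (1) and $\beta_t$ is a scaling factor. The dynamics of $X_t$ describes how the model $\mathcal{M}_t$ propagates uncertain parameters from the bounding set (3) compatible with the observed data $y_t$. The observation-dependent scaling factor $\beta_t$ defines whether $X_t$ shrinks or expands. If $1 - \beta_t + \langle G_t \hat{x}_t, \hat{x}_t \rangle < 0$, then the observed data is incompatible with our assumption on the uncertainty description (3). In the form (4), the minimax state estimation algorithm can be applied to non-linear models, hence we refer to it as an extended minimax filter. Nevertheless, the theory only supports the algorithm in the linear case; that is, the reachability set is known to contain all possible true states only in the linear case.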
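To make the gain and state recursions of (4) concrete, here is a minimal Python sketch for a scalar linear model ($N = m = 1$) with zero systematic errors ($\bar{e} = \bar{e}_t = \bar{\eta}_t = 0$). The function names and numerical values are hypothetical. In this scalar linear setting the model and its tangent linear coincide, and the scaling factor $\beta_t$ is left out of the sketch.

```python
def initialize(q, h0, r0, y0):
    """Initial gain G_0 and estimate x_0 of (4), with zero systematic errors."""
    g0 = 1.0 / q + h0 * h0 / r0
    x0 = (h0 * y0 / r0) / g0
    return g0, x0


def step(g, x, m, q_t, h, r, y):
    """One step of (4) for the scalar linear model x_{t+1} = m * x_t + e_t."""
    b = 1.0 / (g + m * m / q_t)                            # B_t
    g_next = 1.0 / q_t - m * b * m / q_t**2 + h * h / r    # G_{t+1}
    x_forecast = m * x                                     # x^f_{t+1}
    x_next = x_forecast + (h / r) * (y - h * x_forecast) / g_next
    return g_next, x_next


# Hypothetical setting: slowly decaying state, direct observations.
g, x = initialize(q=1.0, h0=1.0, r0=0.5, y0=2.0)
for y in [1.8, 1.7, 1.5]:
    g, x = step(g, x, m=0.9, q_t=0.1, h=1.0, r=0.5, y=y)

# Half-width of the set X_t = {w : g * w * w <= 1} in the scalar case.
half_width = (1.0 / g) ** 0.5
```

Each step moves the forecast toward the observation by an amount weighted by the inverse gain. A larger gain means a narrower set $X_t$, and therefore less weight on new observations.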

The algorithm is far too expensive for high-dimensional systems: it requires propagating a minimax gain $G_t \in \mathbb{R}^{N \times N}$, where $N$ is the dimension of the state space of the model (1). For instance, with $N = 10^7$ as in air quality applications, the dimension of $G_t$ is $10^7 \times 10^7$, which cannot be manipulated by modern computers because of huge computational loads and out-of-reach memory requirements. Hence a reduction is necessary to carry out the computations for high-dimensional systems.
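A back-of-the-envelope computation shows the order of magnitude involved. The sketch below assumes dense storage in double precision (8 bytes per entry), which is an assumption of this example and not a statement from the report; the reduced dimension `n = 100` is also hypothetical.

```python
N = 10**7                      # state dimension quoted in the text
BYTES_PER_ENTRY = 8            # assumption: dense double-precision storage

gain_entries = N * N           # entries of the N x N gain G_t
gain_bytes = gain_entries * BYTES_PER_ENTRY
gain_terabytes = gain_bytes / 10**12     # decimal terabytes

# A reduced gain with n = 100 (hypothetical reduced dimension) is tiny in comparison.
n = 100
reduced_bytes = n * n * BYTES_PER_ENTRY
```

Under these assumptions the full gain would occupy 800 terabytes, against 80 kilobytes for the reduced gain.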

3.2 Minimax filter for Differential-Algebraic Equations with discrete time

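A more general form of the filter was derived in [Zhuk, 2010] for DAE problems. The filter addresses the problem

$$F_{t+1} z_{t+1} = \mathcal{M}_t(F_t z_t) + r_t, \qquad F_0 z_0 = F_0 F_0^T (x_0^g + e), \qquad y_t = \mathcal{H}_t(F_t z_t) + \eta_t, \tag{5}$$

with

$$\langle Q^{-1}(e - \bar{e}), e - \bar{e} \rangle + \sum_{t=0}^{T-1} \langle Q_t^{-1}(r_t - \bar{r}_t), r_t - \bar{r}_t \rangle + \sum_{t=0}^{T} \langle R_t^{-1}(\eta_t - \bar{\eta}_t), \eta_t - \bar{\eta}_t \rangle \le 1. \tag{6}$$

Here $F_t \in \mathbb{R}^{N \times n}$ can be any rectangular matrix and $z_t \in \mathbb{R}^n$ denotes the state of the DAE. If $F_t = I_{N \times N}$, the problem statement is the same as in section 3.1.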

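Following [Zhuk, 2010], we consider the equation for the minimax gain $G_t$, for any time $t \in \{0, \dots, T-1\}$:

$$B_t = (G_t + M_t^T Q_t^{-1} M_t)^+, \qquad G_{t+1} = F_{t+1}^T \left( Q_t^{-1} - Q_t^{-1} M_t B_t M_t^T Q_t^{-1} \right) F_{t+1} + H_{t+1}^T R_{t+1}^{-1} H_{t+1}, \tag{7}$$

with the following initialization:

$$G_0 = F_0^T Q^{-1} F_0 + H_0^T R_0^{-1} H_0. \tag{8}$$

For any time $t \in \{0, \dots, T\}$, the minimax estimator is defined as

$$\hat{z}_t = G_t^+ v_t, \tag{9}$$

with

$$v_0 = F_0^T Q^{-1} \bar{e} + H_0^T R_0^{-1} (y_0 - \bar{\eta}_0), \tag{10}$$

and, for $t \in \{1, \dots, T\}$,

$$v_t = F_t^T Q_{t-1}^{-1} \mathcal{M}_{t-1}(F_{t-1} B_{t-1} v_{t-1}) + F_t^T \left( Q_{t-1}^{-1} - Q_{t-1}^{-1} M_{t-1} B_{t-1} M_{t-1}^T Q_{t-1}^{-1} \right) \bar{r}_{t-1} + H_t^T R_t^{-1} (y_t - \bar{\eta}_t). \tag{11}$$

For any time $t \in \{0, \dots, T\}$, the reachability set $\mathcal{R}_t$ is defined as

$$\mathcal{R}_t = \hat{z}_t + \sqrt{1 - \beta_t + \langle G_t \hat{z}_t, \hat{z}_t \rangle} \, X_t, \qquad X_t = \{ x : \langle G_t x, x \rangle \le 1 \}, \tag{12}$$

with $\beta_t$ being a scaling factor depending on observations:

$$\beta_{t+1} = \beta_t + \langle R_{t+1}^{-1} (y_{t+1} - \bar{\eta}_{t+1}), y_{t+1} - \bar{\eta}_{t+1} \rangle - \langle B_t G_t^{-1} \hat{z}_t, G_t^{-1} \hat{z}_t \rangle + \langle M_t^T Q_t M_t \bar{r}_t, \bar{r}_t \rangle.$$

The reachability set is a translation of the set $X_t$ induced by the minimax gain $G_t$. The shape of $X_t$ depends only on the model, the observation operator and the bounding set. $X_t$ describes how the model propagates the uncertain initial condition and model error from the bounding set (6). In contrast to the case of ODE, $G_t$ could be singular so that $X_t$ contains the kernel of $G_t$. In fact, the part of the system state lying in that kernel is not observable.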
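The role of a singular gain can be illustrated with a small membership test for the set (12). The sketch below assumes a hypothetical two-dimensional reduced state and a diagonal gain with one zero entry, so that the second component lies in the kernel of $G_t$. The function name and the numbers are illustrative and are not taken from [Zhuk, 2010].

```python
def in_reachability_set(z, z_hat, gain_diag, beta):
    """Membership test for (12) with a diagonal gain G_t = diag(gain_diag).

    For a positive scale, z belongs to the set if
    <G (z - z_hat), z - z_hat> <= 1 - beta + <G z_hat, z_hat>.
    """
    scale = 1.0 - beta + sum(g * c * c for g, c in zip(gain_diag, z_hat))
    if scale < 0.0:
        # Observed data incompatible with the uncertainty description.
        return False
    form = sum(g * (a - c) ** 2 for g, a, c in zip(gain_diag, z, z_hat))
    return form <= scale


gain_diag = [4.0, 0.0]      # singular gain: the second direction is in the kernel
z_hat = [1.0, 0.0]
beta = 0.5                  # hypothetical scaling factor; the scale is then 4.5

inside = in_reachability_set([2.0, 100.0], z_hat, gain_diag, beta)
outside = in_reachability_set([2.1, 0.0], z_hat, gain_diag, beta)
```

The first point is accepted whatever its second component, because the gain carries no information in that direction. The second point is rejected although it differs from the estimate only slightly more along the observed direction.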

3.3 The case of the non-singular gain

Assume for simplicity that $\bar{r}_t = 0$ and $\bar{e} = 0$. Let us further assume that $G_t$ is positive definite for all time instants $t$. This is the case when, for instance, $F_t$, for $t \in \{0, \dots, T\}$, is of full column rank, or $F_t^T F_t + H_t^T H_t$ is positive definite. If $G_t$ is positive definite, then $Q_t + M_t G_t^{-1} M_t^T$ is positive definite, and according to the Sherman-Morrison-Woodbury formula (see section A), its inverse can be written in the form

$$\left( Q_t + M_t G_t^{-1} M_t^T \right)^{-1} = Q_t^{-1} - Q_t^{-1} M_t (G_t + M_t^T Q_t^{-1} M_t)^{-1} M_t^T Q_t^{-1}. \tag{13}$$
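The identity (13) can be checked on a small example. The sketch below uses exact rational arithmetic from the Python standard library, with hypothetical data for $N = 2$ and $n = 1$, so that $Q_t$ is $2 \times 2$, $M_t$ is a single column and $G_t$ is a scalar. The helper `inverse_2x2` is written for this example only.

```python
from fractions import Fraction as Fr


def inverse_2x2(a):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


# Hypothetical data with N = 2 and n = 1.
Q = [[Fr(2), Fr(0)], [Fr(0), Fr(4)]]
M = [Fr(1), Fr(3)]          # the single column of M_t
G = Fr(5)

# Left-hand side of (13): (Q_t + M_t G_t^{-1} M_t^T)^{-1}.
lhs = inverse_2x2([[Q[i][j] + M[i] * M[j] / G for j in range(2)]
                   for i in range(2)])

# Right-hand side of (13): Q^{-1} - Q^{-1} M (G + M^T Q^{-1} M)^{-1} M^T Q^{-1}.
Q_inv = inverse_2x2(Q)
Q_inv_M = [sum(Q_inv[i][k] * M[k] for k in range(2)) for i in range(2)]
small = G + sum(M[i] * Q_inv_M[i] for i in range(2))    # scalar since n = 1
rhs = [[Q_inv[i][j] - Q_inv_M[i] * Q_inv_M[j] / small for j in range(2)]
       for i in range(2)]

identity_holds = lhs == rhs
```

Once $Q_t^{-1}$ is available, the right-hand side only requires inverting $G_t + M_t^T Q_t^{-1} M_t$, which is an $n \times n$ matrix, whereas the left-hand side inverts an $N \times N$ matrix.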

Using this identity and the gain equation (7), it is possible to write $G_{t+1}$ as

$$G_{t+1} = F_{t+1}^T \left( Q_t + M_t G_t^{-1} M_t^T \right)^{-1} F_{t+1} + H_{t+1}^T R_{t+1}^{-1} H_{t+1}, \tag{14}$$

which gives an alternative form to the filter. It also proves that $G_t$ is positive definite for all $t \in \{0, \dots, T\}$.

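The state estimator can be rewritten so that the model is applied directly to $F_t \hat{z}_t$ instead of $F_t B_t v_t$. Although it is an equivalent formulation in the linear case, it can make a huge difference when the model is non-linear. In addition, this alternative form makes it easier to interpret the action of the filter. Starting from $\hat{z}_{t+1} = G_{t+1}^{-1} v_{t+1}$ (equation (9)) and the expression (11) for $v_{t+1}$:

$$\hat{z}_{t+1} = G_{t+1}^{-1} F_{t+1}^T Q_t^{-1} \mathcal{M}_t(F_t B_t G_t \hat{z}_t) + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} (y_{t+1} - \bar{\eta}_{t+1}). \tag{15}$$

In case of a linear model and considering the equation for $B_t$ (7), one gets

$$\hat{z}_{t+1} = G_{t+1}^{-1} F_{t+1}^T Q_t^{-1} M_t \left( G_t + M_t^T Q_t^{-1} M_t \right)^{-1} G_t \hat{z}_t + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} (y_{t+1} - \bar{\eta}_{t+1}), \tag{16}$$

which, according to the Sherman-Morrison-Woodbury formula, is equivalent to

$$\hat{z}_{t+1} = G_{t+1}^{-1} F_{t+1}^T Q_t^{-1} M_t \left( G_t^{-1} - G_t^{-1} M_t^T (Q_t + M_t G_t^{-1} M_t^T)^{-1} M_t G_t^{-1} \right) G_t \hat{z}_t + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} (y_{t+1} - \bar{\eta}_{t+1}), \tag{17}$$

which gives

$$\hat{z}_{t+1} = G_{t+1}^{-1} F_{t+1}^T Q_t^{-1} \left( I_{N \times N} - M_t G_t^{-1} M_t^T (Q_t + M_t G_t^{-1} M_t^T)^{-1} \right) M_t \hat{z}_t + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} (y_{t+1} - \bar{\eta}_{t+1}), \tag{18}$$

and, using $I_{N \times N} = (Q_t + M_t G_t^{-1} M_t^T)(Q_t + M_t G_t^{-1} M_t^T)^{-1}$,

$$\hat{z}_{t+1} = G_{t+1}^{-1} F_{t+1}^T \left( Q_t + M_t G_t^{-1} M_t^T \right)^{-1} M_t \hat{z}_t + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} (y_{t+1} - \bar{\eta}_{t+1}).$$

Noting that $x = F_{t+1} F_{t+1}^T x + (I - F_{t+1} F_{t+1}^T) x$ for $x = \mathcal{M}_t(F_t \hat{z}_t)$ (which equals $M_t \hat{z}_t$ for a linear model) and using (14) we write

$$
\begin{aligned}
\hat{z}_{t+1} ={}& G_{t+1}^{-1} \left( G_{t+1} - H_{t+1}^T R_{t+1}^{-1} H_{t+1} \right) F_{t+1}^T \mathcal{M}_t(F_t \hat{z}_t) \\
&+ G_{t+1}^{-1} F_{t+1}^T \left( Q_t + M_t G_t^{-1} M_t^T \right)^{-1} (I - F_{t+1} F_{t+1}^T) \mathcal{M}_t(F_t \hat{z}_t) + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} (y_{t+1} - \bar{\eta}_{t+1}) \\
={}& F_{t+1}^T \mathcal{M}_t(F_t \hat{z}_t) + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} \left( y_{t+1} - \bar{\eta}_{t+1} - H_{t+1} F_{t+1}^T \mathcal{M}_t(F_t \hat{z}_t) \right) \\
&+ G_{t+1}^{-1} F_{t+1}^T \left( Q_t + M_t G_t^{-1} M_t^T \right)^{-1} (I - F_{t+1} F_{t+1}^T) \mathcal{M}_t(F_t \hat{z}_t).
\end{aligned}
$$

Finally

$$
\begin{aligned}
\hat{z}_{t+1} ={}& F_{t+1}^T \mathcal{M}_t(F_t \hat{z}_t) + G_{t+1}^{-1} H_{t+1}^T R_{t+1}^{-1} \left( y_{t+1} - \bar{\eta}_{t+1} - H_{t+1} F_{t+1}^T \mathcal{M}_t(F_t \hat{z}_t) \right) \\
&+ G_{t+1}^{-1} F_{t+1}^T \left( Q_t + M_t G_t^{-1} M_t^T \right)^{-1} (I - F_{t+1} F_{t+1}^T) \mathcal{M}_t(F_t \hat{z}_t).
\end{aligned}
\tag{19}
$$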


4 Model reduction

We introduce a reduction method which generalizes the classical Galerkin approach. In the latter approach, the model state is projected onto a lower-dimensional subspace so that the dynamics of the full state is represented with a small number of scalars. However, this reduction can lose some properties of the full model. For instance, the reduced state equation can introduce instabilities that are not in the full model.

Assume that for each time step $t$, we have a matrix $F_t \in \mathbb{R}^{N \times n}$ whose columns are linearly independent orthonormal vectors; we therefore have $F_t^T F_t = I_{n \times n}$. We denote by $\mathcal{F}_t$ the linear span of the columns of $F_t$. The reduction consists in projecting the true state $x_t$ onto this subspace $\mathcal{F}_t$. We introduce $z_t = F_t^T x_t$, which is the vector of the coefficients of the projection of $x_t$. Consequently we approximate $x_t$ with $F_t z_t$.
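The projection can be illustrated on a tiny example. The sketch below assumes a hypothetical reduction matrix with $N = 3$ and $n = 2$, stored as nested lists; the helper names `reduce_state` and `expand_state` are introduced for this example only.

```python
import math

# Hypothetical reduction with N = 3 and n = 2: the columns of F are orthonormal.
s = 1.0 / math.sqrt(2.0)
F = [[s, 0.0],
     [s, 0.0],
     [0.0, 1.0]]


def reduce_state(F, x):
    """Coefficients z = F^T x of the projection of x onto the span of F's columns."""
    return [sum(F[i][j] * x[i] for i in range(len(x))) for j in range(len(F[0]))]


def expand_state(F, z):
    """Approximation F z of the full state from the reduced state."""
    return [sum(F[i][j] * z[j] for j in range(len(z))) for i in range(len(F))]


# Gram matrix F^T F, expected to be the n x n identity.
gram = [reduce_state(F, [row[j] for row in F]) for j in range(2)]

x = [3.0, 1.0, 2.0]
z = reduce_state(F, x)             # reduced state
x_approx = expand_state(F, z)      # approximation of x in the subspace
residual = [a - b for a, b in zip(x, x_approx)]   # part of x lost by the reduction
```

In this example the approximation is close to $(2, 2, 2)$ and the residual is close to $(1, -1, 0)$, which is orthogonal to both columns of $F$. This residual is the information that the reduced state cannot represent.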

4.1 Classical reduction

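The main idea of the classical reduction based on the Galerkin projection is to derive the equation for $z_t$ by multiplying (1) by $F_{t+1}^T$:

$$z_{t+1} = F_{t+1}^T x_{t+1} = F_{t+1}^T \mathcal{M}_t(x_t) + F_{t+1}^T e_t. \tag{20}$$

Recalling the definition of $z_t$ we obtain

$$z_{t+1} = F_{t+1}^T \mathcal{M}_t(F_t z_t) + F_{t+1}^T e_t + F_{t+1}^T \mathcal{M}_t(x_t) - F_{t+1}^T \mathcal{M}_t(F_t z_t). \tag{21}$$

Let us define

$$p_t = e_t + \mathcal{M}_t(x_t) - \mathcal{M}_t(F_t F_t^T x_t), \tag{22}$$

so that

$$z_{t+1} = F_{t+1}^T \mathcal{M}_t(F_t z_t) + F_{t+1}^T p_t, \qquad z_0 = F_0^T (x_0^g + e). \tag{23}$$

$p_t$ is the sum of the model error and a reduction error. If we were to apply the extended minimax filter on the reduced state equation (23), we would need to evaluate the range of values that $p_t$ can take. Since $p_t$ is state dependent and since the true state is unknown, it is hard to determine the range of $p_t$. The natural approach to suppress the state dependence is to bound the reduction error for all plausible states. Hence we may assume that

$$\| p_t \| \le \| e_t \| + \delta_t,$$

where, for instance, $\delta_t$ is guaranteed to exist for Lipschitz continuous models provided $F_t F_t^T x_t$ approximates $x_t$ with finite error (footnote 4). With possibly modified $Q$, $Q_t$ and $R_t$, we write

$$\langle Q^{-1}(e - \bar{e}), e - \bar{e} \rangle + \sum_{t=0}^{T-1} \langle Q_t^{-1}(p_t - \bar{p}_t), p_t - \bar{p}_t \rangle + \sum_{t=0}^{T} \langle R_t^{-1}(\eta_t - \bar{\eta}_t), \eta_t - \bar{\eta}_t \rangle \le 1$$

for $p_t$ defined by (22) and some $\bar{p}_t$ defined as a systematic error of the new model error. Note that only $F_{t+1}^T p_t$ has an impact on the dynamics of $z_t$. Noting that

$$F_{t+1}^T p_t = F_{t+1}^T F_{t+1} F_{t+1}^T p_t,$$

Footnote 4: If $\| x_t - \tilde{x}_t \| \le \varepsilon$ and $\kappa$ is the model Lipschitz constant, we can take $\delta_t = \kappa \varepsilon$.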
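The reduced equation (23) and the error term (22) can be checked numerically. The sketch below assumes a hypothetical linear model with $N = 2$, reduced to $n = 1$ with the same reduction matrix at all times; all values are illustrative.

```python
def mat_vec(a, v):
    """Matrix-vector product for nested lists."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


# Hypothetical linear model (N = 2) reduced to n = 1 with F = [1, 0]^T at all times.
model = [[0.5, 1.0],
         [0.0, 0.9]]
F = [1.0, 0.0]                       # single orthonormal column

x = [1.0, 2.0]                       # true state x_t
e = [0.1, -0.2]                      # model error e_t

# Full dynamics (1) and its projection (20).
x_next = [mx + et for mx, et in zip(mat_vec(model, x), e)]
z_next = sum(f * xi for f, xi in zip(F, x_next))

# Reduced dynamics (23): z_{t+1} = F^T M(F z_t) + F^T p_t, with p_t from (22).
z = sum(f * xi for f, xi in zip(F, x))
projected_x = [f * z for f in F]     # F F^T x_t
p = [et + a - b for et, a, b in
     zip(e, mat_vec(model, x), mat_vec(model, projected_x))]
reduced_part = sum(f * v for f, v in zip(F, mat_vec(model, projected_x)))
z_next_reduced = reduced_part + sum(f * pi for f, pi in zip(F, p))
```

Both routes give the same reduced state at time $t+1$. In this example the projected error $F_{t+1}^T p_t$ is about 2.1 while the projected model error $F_{t+1}^T e_t$ is 0.1, so the reduction error dominates, which is why a bound on $p_t$ can be much looser than the bound on $e_t$.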
