HAL Id: hal-00119411
https://hal.archives-ouvertes.fr/hal-00119411
Submitted on 9 Dec 2006
HAL
is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire
HAL, estdestinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Nonlinear system identification using uncoupled state multiple-model approach
Rodolfo Orjuela, Didier Maquin, José Ragot
To cite this version:
Rodolfo Orjuela, Didier Maquin, José Ragot. Nonlinear system identification using uncoupled state
multiple-model approach. Workshop on Advanced Control and Diagnosis, ACD’2006, Nov 2006,
Nancy, France. pp.CDROM. �hal-00119411�
NON-LINEAR SYSTEM IDENTIFICATION USING UNCOUPLED STATE
MULTIPLE-MODEL APPROACH
Rodolfo Orjuela, Didier Maquin and Jos´e Ragot
Centre de Recherche en Automatique de Nancy UMR 7039, Nancy-Universit´e, CNRS 2, Avenue de la Forˆet de Haye, 54516 Vandœuvre-l`es-Nancy Cedex France
Abstract: Multiple-model approach is an interesting alternative and a powerful tool for modelling complex processes. This paper deals with the off-line identification of non-linear systems employing the multiple-model approach. We use an uncoupled state multiple-model in opposition to the classically used coupled state multiple- model (Takagi-Sugeno). The use of this new mutiple-model structure reveals a new undesirable phenomenon, called unhooking, that deteriorates the quality of the obtained approximation. An original solution is proposed to avoid this phenomenon.
Keywords: Multiple-models, Non-linear system identification, Model structure
1. INTRODUCTION
Non-linear models are widely used in engineer- ing science applications to describe the dynamic behaviour of real-world processes. Due to their mathematical complexity, they are not easily ex- ploitable for designing a control law or setting up a diagnosis strategy, even if they give a good description of the considered process.
Assuming that the modelled process evolves around an operating point, a linearisation stage of the non-linear model is then possible and leads to the reduction of the mathematical complexity of the model. Hence, analysis tools available for linear systems can be used. However, in practice, the linearity assumption is not always respected and consequently the linearised model does not represent the whole behaviour of the process.
Moreover, it will be more interesting in practice to have a model that gives a good global charac- terisation of the dynamic behaviour of the process and that is easily usable by engineers for example with techniques for linear systems.
To fulfil these expectations, new modelling tech- niques have been developped, among them multiple- model approach. We propose in this paper an off-line identification procedure of non-linear sys- tems using an uncoupled state multiple-model ap- proach.
The outline of this paper is as follows. Two multiple-model structures using coupled and un- coupled states are discussed in section 2. Section 3 presents the parametric estimation procedure for uncoupled state multiple-model. The limits and problems encountered in the application of this procedure are illustrated through an academic ex- ample in section 4. A modification of the multiple- model structure is then introduced in section 5.
The goal is to improve the approximation qual- ity and performances of the identified multiple- model.
2. STRUCTURES OF MULTIPLE-MODELS The divide and conquer strategy is the starting point of many modelling techniques of non-linear systems (N.L.S.). The basis of these techniques is
the decomposition of operating space of a non- linear system into a numberLof operating zones that are characterised by a sub-model which is often chosen as linear. According to the zone where the non-linear system evolves then, the output ˆyi of each sub-system is more or less requested in order to describe the whole behaviour of the non-linear system.
The multiple-model (M.M.) output ˆy(k) is defined by:
ˆ y(k) =
XL
i=1
µi(ξ(k))ˆyi(k), (1) the sub-model contribution depends on theweight- ing function µi(ξ(k)). A large choice of weighting functions is possible (it is not the purpose of the present text to survey these possibilities). Here, they are obtained from normalised Gaussian func- tion:
µi(ξ(k)) = ωi(ξ(k)) PL j=1
ωj(ξ(k))
, where (2)
ωi(ξ(k)) = exp
−(ξ(k)−ci)2/σ2
, (3)
ci is the centre of the ith weighting function and σ is the dispersion for all weighting functions.
Decision variable ξ of weighting functions can depend on the measurable state variables and/or input/output variable. Two notions of weighting functions are defined strongly blended and not much blended (see Figure 1). There are various
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7
µi
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0 0.2 0.4 0.6 0.8 1
u(k) µi
Fig. 1. Weighting functions strongly blended σ = 0.35 (up), weighting functions not much blendedσ = 0.08 (down)
ways of connecting sub-models in order to gen- erate the global output ˆy(k). We can distinguish two multiple-model structures according the use ofcoupled oruncoupled states.
2.1 Coupled state multiple-model
The coupled state structure or Takagi-Sugeno (Takagi and Sugeno, 1985) is the more classically used in multiple-model analysis and synthesis (Murray-Smith and Johansen, 1997). The global output of this multiple-model is computed by:
ˆ
x(k+ 1) = ˜A(k)ˆx(k) + ˜B(k)u(k) + ˜D(k), (4) ˆ
y(k) = ˜C(k)ˆx(k),
such that:
A(k) =˜ XL
i=1
µi(ξ(k))Ai, B(k) =˜ XL
i=1
µi(ξ(k))Bi, D(k) =˜
XL
i=1
µi(ξ(k))Di,C(k) =˜ XL
i=1
µi(ξ(k))Ci, where ˆx ∈ Rn is the state vector, u ∈ Rm the control, ˆy ∈ Rl the output vector and ξ the decision variable of weighting function µi. The coupled state multiple-model is assimilated to a system whose parameters vary with time (notice that there is an only global state ˆx). Indeed, the sub-model parameters are blended in function of operating zones of the non-linear system.
2.2 Uncoupled state multiple-model
Filev (Filev, 1991) proposed another structure to connect the sub-models by an uncoupled struc- ture:
ˆ
xi(k+ 1) =Aixˆi(k) +Biu(k) +Di, ˆ
yi(k) =Cixˆi(k), (5) ˆ
y(k) = XL
i=1
µi(ξ(k))ˆyi(k),
where ˆxi ∈Rni is the state vector of the ith sub- model,u∈Rmis the control, ˆy∈Rlis the output vector. Here, the global output of the multiple- model is given by a weighted sum of the sub- system outputs (parameters of sub-models are not blended).
Notice that this multiple-model is given by paral- lel connection of Wiener-type sub-models (a linear sub-model in series with a non-linear function).
Therefore each sub-model has and evolves inde- pendently in their own state space in function of the input control and their initial state. Let us be clear that the principal interest of this structure is the decoupling between the sub-models. Indeed, it is possible to think that:
• it is easier to transfer the analysis techniques of linear systems to multiple-model;
• different structures of sub-models can be employed. For example, linear and non-linear models with a different dimension of the state vector (of course the output vector dimension must be identical).
3. PARAMETRIC ESTIMATION The coupled state multiple-model has been em- ployed in several identification procedures of N.L.S in the last few years.
Johansen and Foss (Johansen and Foss, 1995) have developed an algorithm for operating zone
decomposition of N.L.S. based on a priori process information and a heuristic. The sub-models are added at each iteration in function of non-linear system complexity and the wished accuracy (as- cending approach).Gasso(Gasso, 2000) proposed another operating space decomposition technique of the N.L.S. based on a grid partition. The multiple-model complexity is reduced by elimi- nation of irrelevant sub-models on one hand and merging redundant models on the other one (de- scending approach).Gugaliya and al.(Gugaliyaet al., 2005) suggest a modification of CART (Clas- sification and Regression Tree) algorithm used in the piecewise linear model identification in order to constructing a multiple-model.
On the other hand, the identification procedure based on the uncoupled state multiple-model has been not much used. Some works in the con- trol law synthesis of non-linear systems have used successfully this structure (Gawthrop, 1995), (Gregorcic and Lightbody, 2000). The M.M. is obtained by local linearisation of the non-linear system around different operating points.
RecentlyVenkat et al. (Venkatet al., 2003) have proposed an identification methodology based on this multiple-model structure for a control appli- cation. The identification of sub-models is based on input/output data that are obtained in one small operating zone of the non-linear system thus assuring the linearity of each set of data. A great number of particular experiments of non-linear system are necessary to generate several sets of data. Indeed, one set of data is necessary for each sub-model identification.
We propose an off-line identification procedure of non-linear systems using an uncoupled multiple- model approach. The identification problem is given in the following statement: the aim is the parameter identification ofLsub-models when the weighting functions µi(ξ) are fixed, on one hand and input data u(k) and output data y(k) of a SISO non-linear system are given on the other one.
3.1 Estimation criteria
Typically three estimation criteria: global, local and combined, are employed in the parameter identification of multiple-model. The choice of es- timation criterion depends on the purpose (appli- cation perspective) of multiple-model.
3.1.1. Global criterion The global criterion is defined by:
JG =1 2
XN
k=1
ε(k)2, (6)
whereN is the number of training data andε(k) is the global error between the M.M. output and the N.L.S. output, given by :
ε(k) = ˆy(k)−y(k). (7)
This criterion encourages a good global charac- terisation between global behaviour of the non- linear system and the multiple-model, but the local behaviour is not considered. This criterion is interesting when the multiple-model is used in a prediction perspective.
3.1.2. Local criterion The local criterion is de- fined by:
JL= 1 2
XL
i=1
XN
k=1
µi(ξ(k))εi(k)2, (8) whereεi(k) is the local error between the ith sub- model output and the N.L.S. output, given by:
εi(k) = ˆyi(k)−y(k). (9) This criterion favours the good local accurate ap- proximation between local behaviour of the sub- models and local behaviour of the non-linear sys- tem. The multiple-model obtained may be used in a phenomenon explication perspective. Indeed the sub-models may be often comparable to a lin- earised models of the non-linear system. However, in comparison to global criterion, a number of sub-models more important is necessary to have a good global characterisation of the non-linear system (Gasso, 2000).
3.1.3. Combined criterion The combined crite- rion defined by (Yenet al., 1998):
JC=αJG+ (1−α)JL, (10) provides an alternative to combining advantages of the two above criteria, then, it is possible to take advantages of each criterion in function of weighting scalar value of α.
3.2 Parametric estimation procedure
In this section, the parameter estimation base on the uncoupled state multiple-model approach is presented. See (Walter and Pronzato, 1997) for complements and general discussion about identification techniques.
The column vector θ is the vector of unknown parameters of the multiple-model represented in the form of apartitioned vectorinLcolumn blocks θp: θ= [θ1. . . θp. . . θL]T (11) where each column block θp is formed by theqp
unknown parameters of the particular sub-model p: θp= [θp,1. . . θp,q. . . θp,qp]T (12) θp,qdenoting the unknown scalar parameter of the sub-modelp.
Here, the parametric estimation ofθp,qis based on an iterative minimisation procedure of a quadratic criterionJ (global, local or combined) with Gauss- Newton’s algorithm:
θ+=θ−H−1G, (13) whereθis the vector of parameters at a particular iteration,θ+this evaluated vector in the following
iteration, H = ∂θ∂θ∂2JT is the Hessian matrix and G =∂J∂θ the gradient vector. The calculus of the gradient vector and the Hessian matrix is based on the calculation of sensitivity functions of out- put multiple-model with respect to sub-models parameters.
Often, Hessian may not be invertible (singular matrix) or the inverse may not be definite positive, then the parameter update is not possible. To avoid these problems, the Marquardt’s algorithm is used for ill-conditioned problems:
θ+=θ−∆(H +λI)−1G, (14) where ∆ is the step size that minimise the cri- terion in the direction of vector H−1G, λ is a scalar and I the identity matrix of appropriate dimension. Ifλis small then the Gauss-Newton’s algorithm is used. On the other hand, if λ is great then the method of gradient descent is used.
Notice that if H is a positive definite matrix then there is always a ∆≤1 that minimises the crite- rion in the direction of H−1G.
The convexity of problem is not ensured then the algorithm solution may not be a global minimum.
Consequently, the choice of an initial guess for the parameter vector is a difficult and delicate step in order to ensure the algorithm convergence.
Sometimes, in difficult cases, several choices of initial parameters may be necessary.
3.3 Parametric estimation with global criterion The gradient GG of JL is given by the simple derivation of the global criterion (6) with respect to parametersθ:
GG= ∂JG
∂θ = XN
k=1
ε(k)∂y(k)ˆ
∂θ , (15) where
∂y(k)ˆ
∂θ = XL
i=1
µi(ξ(k))∂yˆi(k)
∂θ , (16)
∂yˆi(k)
∂θ are the sensitivity functions of the ith out- put with respect to unknown parameters of the multiple-model. The sensitivity functions are de- fined as follows:
∂yˆi(k)
∂θp,q
= ∂Ci
∂θp,q
ˆ
xi(k) +Ci
∂xˆi(k)
∂θp,q
. p= 1,2, ..., L q= 1,2, ..., qp
(17) The state is unknown at the present moment k, the sensitivity functions are calculated at the next momentk+ 1 by derivation of (5) with respect to each parameterθp,q:
∂ˆxi(k+ 1)
∂θp,q
= ∂Ai
∂θp,q
ˆ
xi(k) +Ai
∂xˆi(k)
∂θp,q
+ ∂Bi
∂θp,q
u(k) + ∂Di
∂θp,q
. (18)
Hessian matrixHGis obtained by a double deriva- tion of the global criterion (6) with respect to parameters θ:
HG= ∂2JG
∂θ∂θT = XN
k=1
ε(k)
|{z}
→0
∂2y(k)ˆ
∂θ∂θT+ XN
k=1
∂y(k)ˆ
∂θ
∂y(k)ˆ
∂θT . (19) Here, we have employed the Gauss-Newton’s al- gorithm in order to reduce the costly in compu- tational time. Indeed, the second order derivation term in equation (19) is neglected with the as- sumption that the errorε(k) tends to zero. Hence, an approached Hessian is obtained by sensitivity functions used in the gradient vector calculus:
HG≈ XN
k=1
∂y(k)ˆ
∂θ
∂y(k)ˆ
∂θT , (20)
3.4 Parametric estimation with local criterion The calculation of the gradient vectorGLand the Hessian matrix HL is performed as in the case of the global criterion. By noting that the sensitivity functions ∂yˆ∂θi(k) in the local criterion and global criterion are identical.
Assuming that the weighting functions are not much blended, the distinction between global cri- terion and local criterion disappears; the two ap- proaches of identification offer similar results. On the other hand, if the weighting functions are strongly blended, we will be able to choose the combined criterion (10) in order to weight the sub- models interpretation compared to the quality of the global model.
It is well worth noting that the expressions (16), (17) and (18) are the generics forms often sim- plified in the algorithm implementation. Indeed, these equations may be rewritten as:
∂yˆi(k)
∂θp
= 0 when p6=ii= 1,2, ..., L p= 1,2, ..., L, because the parameters of each sub-model are completely independent (uncoupled structure) of the parameters of the other sub-models.
4. IDENTIFICATION EXAMPLE Let us use the uncoupled structure to mimic the behaviour of the non-linear system:
x(k+ 1) =Ax(k) + sin(γu(k))(β−u(k)),
y(k) =x(k), (21)
withA= 0.95,γ= 0.8πandβ= 1.5.
Here, the identification of the multiple-model is realised with a global criterion. We want a pre- diction multiple-model without interpretation of the local behaviours. The parametric estimation is based on relation (14), the gradient vectorGG
and the Hessian matrix HG are defined by the relations (15) and (20). The multiple-model com- prises arbitrarilyL= 6 sub-models (this quantity may be optimised).
The parameters Ai, Bi, Di and Ci of the sub- models are scalar type (first order sub-model).
The weighting functions µi depend on the in- put signal ξ(k) = u(k), the centres are: ci = [0,0.2,0.4,0.6,0.8,1] and the dispersionσ= 0.2.
4.1 Linearized multiple-model
The multiple-model is obtained by local linearisa- tion of the non-linear system (21) around different operating pointsci. The dynamic behaviour of the linearised multiple-model is shown in figure 2.
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000
0 5 10 15 20 25 30 35 40 45 50
k Sortie S.N.L.
Sortie M.M.
N.L.S.
M.M.
Fig. 2. Non-linear system output and linearised multiple-model output
This new multiple-model structure reveals a new undesirable phenomenon, called unhooking, that produces the ”peaks” in the commutation zones.
On the other hand, using weighting functions strongly blended produces, in this case, an im- portant static error. Indeed, the local characteri- sation of the sub-models is not guaranteed by the weighting functions.
However, the parameters issued from linearisation may be employed as an initial guess for the parameters in the next identification procedure.
4.2 Identified multiple-model
The input u(k) of the system is formed by the concatenation of piecewise constant signals with variable amplitude (u ∈ [0,1]). A set of 5000 input/output data points is used to build the multiple-model. A set of identical dimension in- put/output data points is used to validate the multiple-model. We now employ as the initial parameters those obtained by linearisation of the non-linear system.
The identified multiple model yields a bad mimic of the dynamic behaviour of the non-linear sys- tem (21) (see figure 3). Indeed, the appearance of ”peaks”, in the commutation zones of the con- trol signal, deteriorates considerably the quality of the obtained approximation. The undesirable unhooking phenomenon produces a noise effect in the identification procedure. Next section presents
some explanations and an original solution in or- der to eliminate this undesirable phenomenon.
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000
0 5 10 15 20 25
k Sortie S.N.L.
Sortie M.M. IDENTIFICATION VALIDATION
N.L.S.
M.M.
Fig. 3. Non-linear system output and identified multiple-model output
5. MODIFICATION OF MULTIPLE-MODEL STRUCTURE
The unhooking phenomenon is due to trade-off between the sub-models outputs at commutation moment and the consequence may be strong vari- ations on the global output.
Indeed, let us remember that each sub-model involves independently in their own state space.
Hence, jumps (at commutation moment) between outputs of the sub-models produce sometimes a discontinuous global output of the uncoupled state multiple-model.
Therefore anunhooking phenomenon is presented if the sub-model outputs are significantly different at contribution moment. If the sub-model out- puts are close, at the contribution moment, then the unhooking phenomenon decreases and it is removed if the outputs are identical.
To sum up, theunhooking phenomenon is related to an initial condition problem of the sub-model outputs at the contribution moment (notice that this phenomenon does not exist in the coupled state multiple-model approach). Consequently, in the presence of a unhooking phenomenon the parametric estimation may be deteriorated and the multiple-model identified gives, in the unhook- ing zones, a bad approximation of the N.L.S. (see figure 3).
A first idea in order to avoid this phenomenon is to consider strongly blended weighting functions and a greater number of sub-models. On the other hand, Gawthrop (Gawthrop, 1995) has proposed the incorporation of the local feedback via classi- cal observer theory to each of the sub-models.
We propose an original solution which consists to filter the control signal, the decision signal of µi
and the multiple-model output, by adding three low-pass filters in the multiple-model structure.
The purpose of this approach is to consider grad- ually the contribution of each sub-model and to soften commutation between sub-model outputs.
These three filters are considered as the unknown
parameters in the identification procedure of the multiple-model.
The new multiple-model is defined by:
ˆ
xi(k+ 1) =Aixˆi(k) +Biu(k) +˜ Di, ˆ
yi(k) =Cixˆi(k), (22)
˜ y(k) =
XL
i=1
µi(ξ(k))ˆyi(k), withξ(k) = ˆu(k),
˜
u(k) =F1(q−1)u(k), (23) ˆ
u(k) =F2(q−1)u(k), (24) ˆ
y(k) =F3(q−1)˜y(k), (25) where ˆy is the new multiple-model output.
New sensitivity functions are calculate as in the section 3.3, obviously taking into account the new parameters of the three low-pass filters (here,F1, F2andF3 are first order filters).
Figure 4 shows the new dynamic behaviour of the multiple-model with the three filters. We can notice the good approximation provided by the identified multiple-model. Let us be clear that the phenomenon of unhooking is rejected.
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000
0 2 4 6 8 10 12 14 16 18 20
k Sortie S.N.L.
Sortie M.M. IDENTIFICATION VALIDATION
N.L.S.
M.M.
Fig. 4. Non-linear system output and new identi- fied multiple-model output
6. CONCLUSION
We have investigated a non-linear system identifi- cation using an uncoupled state multiple-model in contrast to classically used coupled state multiple- model (Takagi-Sugeno).
The uncoupled state multiple-model provides a in- teresting alternative in the modelling, the control and the diagnosis of non-linear systems. Indeed, the particularity of this structure is to have the completely independent sub-models. Therefore, one can suppose that analysis tools available for linear systems can be a priori easily applied in contrast to coupled state multiple-model.
On the other hand, an undesirable phenomenon, called unhooking, that deteriorates the quality of the obtained approximation is revealed. An original solution is suggested to avoid this phe- nomenon. The solution is based on the incorpora- tion of three low-pass filters in the identification procedure. An academic example is also provided to demonstrate the good efficiency of this solution.
The proposed identification procedure can be ex- tended to include the optimisation of multiple- model dimension: sub-model orders and their number. It is possible also to consider an optimisa- tion procedure of the weighting functionsµi. The identification of MISO non-linear systems may be also contemplated.
REFERENCES
Filev, D. (1991). Fuzzy modeling of complex sys- tems. International Journal of Approximate Reasoning5, 281–290.
Gasso, K. (2000). Identification des syst`emes dynamiques non-lin´eaires : approche multi- mod`ele. PhD thesis. Institut National Poly- technique de Lorraine.
Gawthrop, P.J. (1995). Continuous-time local state local model networks. In:Proceedings of IEEE Conference on Systems, Man & Cyber- netics. Vancouver, Canada. pp. 852–857.
Gregorcic, G. and G. Lightbody (2000). Control of highly nonlinear processes using self-tuning control and multiple/local model approaches.
In: 2000 IEEE International Conference on Intelligent Engineering Systems, INES 2000.
pp. 167–171.
Gugaliya, J.K., R.D. Gudi and S. Lakshmi- narayanan (2005). Multi-model decomposi- tion of nonlinear dynamics using a fuzzy- CART approach.Journal of Process Control 15(4), 417–434.
Johansen, T. A. and B. A. Foss (1995). Identifi- cation of non-linear system structure and pa- rameters using regime decomposition. Auto- matica31(2), 321–326.
Murray-Smith, R. and T.A. Johansen (1997).
Multiple model approaches to modelling and control. Taylor & Francis.
Takagi, T. and M. Sugeno (1985). Fuzzy identifi- cation of systems and its applications to mod- elling and control. IEEE Trans. on Systems Man and Cybernetics15, 116–132.
Venkat, A.N., P. Vijaysai and R.D. Gudi (2003).
Identification of complex nonlinear processes based on fuzzy decomposition of the steady state space. Journal of Process Control 13(6), 473–488.
Walter, E. and L. Pronzato (1997). Identifica- tion of parametric models: from experimen- tal data. Chap. 4. Springer-Verlag Berlin and Heidelberg GmbH & Co. K.
Yen, J., L. Wang and C.W. Gillespie (1998). Im- proving the interpretability of Takagi-Sugeno fuzzy models by combining global learning and local learning. IEEE Trans. on Fuzzy Systems6, 530–537.