HAL Id: hal-01355016
https://hal.archives-ouvertes.fr/hal-01355016
Submitted on 25 Apr 2019
HAL
is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire
HAL, estdestinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Copyright
With Prediction Error Methods: Predictor Input Selection
Arne Dankers, Paul M. J. van den Hof, Xavier Bombois, Peter S. C. Heuberger
To cite this version:
Arne Dankers, Paul M. J. van den Hof, Xavier Bombois, Peter S. C. Heuberger. Identification of Dynamic Models in Complex Networks With Prediction Error Methods: Predictor Input Selection.
IEEE Transactions on Automatic Control, Institute of Electrical and Electronics Engineers, 2016, 61
(4), pp.937-952. �10.1109/TAC.2015.2450895�. �hal-01355016�
Identification of Dynamic Models in Complex Networks with Prediction Error Methods - Predictor Input Selection
Arne Dankers, Member, IEEE,Paul M. J. Van den Hof, Fellow, IEEE, Xavier Bombois and Peter S. C.
Heuberger
Abstract—This paper addresses the problem of obtaining an estimate of a particular module of interest that is embedded in a dynamic network with known interconnection structure. In this paper it is shown that there is considerable freedom as to which variables can be included as inputs to the predictor, while still obtaining consistent estimates of the particular module of interest. This freedom is encoded into sufficient conditions on the set of predictor inputs that allow for consistent identification of the module. The conditions can be used to design a sensor placement scheme, or to determine whether it is possible to obtain consistent estimates while refraining from measuring particular variables in the network. As identification methods the Direct and Two Stage Prediction-Error methods are considered.
Algorithms are presented for checking the conditions using tools from graph theory.
Index Terms—System identification, closed-loop identification, graph theory, dynamic networks, linear systems.
I. INTRODUCTION
S
YSTEMS IN ENGINEERING are becoming more com- plex and interconnected. Consider for instance, power systems, telecommunication systems, and distributed control systems. Since many of these systems form part of the foundation of our modern society, their seamless operation is paramount. However, the increasing complexity and size of the systems poses real engineering challenges (in maintaining sta- bility of the electrical power grid, increasing data throughput of telecommunication networks, etc.). These systems cannot be operated, designed, and maintained without the help of models.Tools fromsystem identificationare well suited to construct models using measurements obtained from a system. However, the field of system identification is primarily focused on iden- tifying open and closed-loop systems. Recently, there has been a move to consider more complex interconnection structures.
The literature on identification and dynamic networks can be split into two categories based on whether the interconnection structure of the network is assumed to be known or not. In the latter the objective is generally to detect the topology of the network, whereas in the former the focus has mainly been to identify (part of) the dynamical transfers in the network based on open-loop and closed-loop identification techniques.
The work of Arne Dankers is supported in part by the National Science and Research Council (NSERC) of Canada.
A. Dankers is with the Delft Center for Systems and Control, Delft University of Technology,[email protected]
P.M.J. Van den Hof is with the Department of Electrical Engineering, Eindhoven University of Technology,[email protected]
P.S.C. Heuberger is with the Department of Mechanical Engineering, Eindhoven University of Technology,[email protected]
X. Bombois is with Laboratoire Amp`ere, Ecole Centrale de Lyon (Ecully, France),[email protected]
The topology detection literature is primarily based on the methods of Granger andGranger Causality[1]. In [2], [3] it is shown that it is possible to distinguish between open and closed-loop systems (using a parametric approach). Recently, this line of reasoning has been extended to more general networks in [4], [5] (using a non-parametric approach). Several methods have appeared that automate Granger’s method for detection of causal relations by using regularization terms to set certain links in the network to zero. For instance, [6], [7]
directly implement an`0norm, whereas [8] uses the LASSO ([9]), and [10] uses a compressed sensing approach. In [11]
a Bayesian approach for topology detection is presented. The main features that these algorithms have in common is that all internal variables in the network are assumed to be known, each internal variable is driven by an independent stochastic variable, and most papers assume all transfer functions in the network are strictly proper. Under these conditions it is shown that topology detection is possible.
Although the structure detection problem is very interesting, the underlying identification techniques, even for the case that the network structure is known, have not been fully developed yet. In particular if we consider situations that go beyond the rather restrictive conditions mentioned above.
As a result, identification of (particular modules in) dynamic networks for a given interconnection structure is a relevant problem to address. Moreover, for a large number of systems in engineering the interconnection structure of the network is known (power systems, telecommunication systems etc.).
In the identification of dynamic networks attention has been given to the study ofspatially distributed systems, where each node is connected only to its direct neighbors and the modules are assumed to be identical [12], [13] or not [14], [15], [16].
In these papers emphasis is on numerically fast algorithms.
In [17] closed-loop prediction-error identification methods have been extended to the situation of dynamic networks and analyzed in terms of consistency properties. The inter- connection topology is very general and goes beyond the spatially distributed topology. The approach is to focus on identifying a single module embedded in a network with known interconnection structure and with general conditions on noise disturbances. Both noise and known user-defined signals (called reference signals) can drive or excite the network, while the presence of reference signals can be used to relax assumptions on the noise in the system. In the analysis of [17] it is required thatallsignals that directly map into the output of the considered module are taken as predictor inputs, and therefore they all need to be measured.
In this paper we consider an extension of the problem
setting in [17]. The objective is to identify a particular module embedded in a dynamic network, and to analyze the flexibility that exists in which selection of measured variables leads to consistent identification of the module of interest. The variables that are measured are available to use as predictor inputs, i.e. variables that are used to predict the value of a particular internal variable. Specifically, the question addressed in this paper is: given a dynamic network with known inter- connection structure, for which selection of predictor inputs can we guarantee that a particular module of interest can be estimated consistently?
Our approach is actually a local approach where only a limited number of variables need to be measured in order to identify the object of interest. The resulting algorithms can be applied to small to medium scale networks, or to large networks with sparse interconnection structures. It can also be used to design a sensor placement scheme tailored specifically to identifying a particular module in the network. Thus, it may be possible to avoid measuring variables that are expensive, difficult or unsafe to measure.
In order to make the step towards a selection of predictor input variables, the dynamics that appear between a selection of measured variables in a network is described in a so-called immersed network. The conditions for consistent module es- timates are derived in a general context, and then specified for the Direct and Two-Stage Prediction-Error Methods, as formalized for a dynamic network case in [17]. This paper is based on the preliminary results of [18], [19] but developed and formulated here in a stronger and unifying framework, by relying predominantly on an analysis that is independent of the particular identification algorithm.
In Section II dynamic networks are defined. In Section III the prediction-error identification framework is presented, including generalizations of the Direct and Two-Stage identifi- cation methods. In Section IV an immersed network is defined as the network that is constructed by discarding nonmeasured node variables. Additionally general conditions are formulated on the predictor input variables to ensure consistent estimation of the module dynamics. In Sections V and VI the conditions on predictor inputs are specified for each identification method separately. In Section VII an algorithm based on graph theory is presented to check the conditions.
II. SYSTEMDEFINITION ANDSETUP
A. Dynamic Networks
The networks that are considered in this paper are built up ofLelements (or nodes), related toLscalarinternal variables wj, j= 1, . . . , L. It is assumed that each internal variable is such that it can be written as:
wj(t) = X
k∈Nj
G0jk(q)wk(t) +rj(t) +vj(t) (1) where G0jk(q), k∈ Nj is a proper rational transfer function, q−1 is the delay operator (i.e.q−1u(t) =u(t−1)) and,
• Nj is the set of indices of internal variables with direct causal connections to wj, i.e.i∈ Nj iffG0ji6= 0;
• vj is an unmeasured disturbance variablethat is a sta- tionary stochastic process with rational spectral density:
vj =Hj0(q)ejwhereej is a white noise process, andHj0 is a monic, stable, minimum phase transfer function;
• rj is an external variable that is known and can be manipulated by the user; it is an important variable that can provide deliberate (user-chosen) excitation to the network.
It may be that the disturbance and/or external variable are not present at some nodes. The entire network is defined by:
w1 w2 ... wL
=
0 G012 · · · G01L G021 0 . .. ...
... . .. . .. G0L−1 L G0L1 · · · G0L L−1 0
w1 w2 ... wL
+
r1 r2 ... rL
+
v1 v2 ... vL
,
whereG0jkis non-zero if and only ifk∈ Nj for rowj. Using an obvious notation results in:
w=G0w+r+v (2) where, w,r, andv are vectors. If an external or disturbance variable is absent at nodei, theith entry ofrorvrespectively is0. Eq. (2) is thedata generating system.
There exists a path from wi to wj if there exist integers n1, . . . , nksuch thatG0jn
1G0n1n2· · ·G0n
kiis non-zero. Likewise there exists a path from ri to wj (or vi towj) if there exist integersn1, . . . , nk such thatG0jn1G0n1n2· · ·G0n
ki is non-zero.
The following sets will be used throughout the paper:
• R and V denote the sets of indices of all external and disturbance variables respectively present in the network.
• Rj andVj denote the sets of indices of all the external and disturbance variables respectively with a path towj. A directed graph of a dynamic network can be used to represent a network. A directed graph is a collection of nodes connected by directed edges. A directed graph of a dynamic network can be constructed as follows:
1. Let allwk,k∈ {1, . . . , L} be nodes.
2. Let allvk,k∈ V andrm,m∈ Rbe nodes.
3. For all i, j ∈ {1, . . . , L} if Gji 6= 0, then add a directed edge from nodewi to nodewj.
4. For all k∈ V add a directed edge fromvk towk. 5. For all k∈ R add a directed edge fromrk towk.
More concepts from graph theory will be used throughout the paper, but they will be presented where they are applicable.
The following is an example of a dynamic network.
Example 1:Consider a network defined by:
w1 w2 w3 w4
w5
w6
=
0 0 0 G014 0 0
G021 0 G023 0 0 0
0 G032 0 0 0 0
0 0 0 0 0 G046
0 G052 0 G054 0 G056 0 0 G063 0 G065 0
w1 w2 w3 w4
w5
w6
+
v1 v2 v3 v4
v5
v6
shown in Fig. 1a. Its graph is shown in Fig. 1b.
All networks are assumed to satisfy the following conditions.
Assumption 1:
w1
G021
w2
G023 G032
w3
G063 w6
G056 G065
w5
G054 w4
G046
G014
G052 v1
v2
v3
v4
v5
v6
w1
w2
w3
w4
w5
w6
v1
v2
v3
v4
v5
v6
(a) (b)
Fig. 1. A diagram (a) and graph (b) of the network for Examples 1 and 2. In (a), each rectangle represents a transfer function, and each circle represents a summation. For clarity labels of the wi’s have been placed inside the summations indicating that the output of the sum is the variablewi.
(a) The network is well-posed in the sense that all principal minors of limz→∞(I−G0(z))are non-zero.
(b) (I−G0)−1 is stable.
(c) All rm,m∈ Rare uncorrelated to allvk,k∈ V.1 The well-posedness property [20] ensures that bothG0and (I−G0)−1 only contain proper (causal) transfer functions, and still allows the occurrence of algebraic loops.
In this paper the set of internal variables chosen as predictor inputs plays an important role. For this reason, it is convenient to partition (2) accordingly. Let Dj denote the set of indices of the internal variables that are chosen as predictor inputs.
Let Zj denote the set of indices not in {j} ∪ Dj, i.e.
Zj = {1, . . . , L} \ {{j} ∪ Dj}. Let wD denote the vector [wk1 · · · wkn]T, where {k1, . . . , kn} = Dj. Let rD denote the vector [rk1 · · · rkn]T, where {k1, . . . , kn} = Dj, and where the`th entry is zero if r` is not present in the network (i.e. ` /∈ R). The vectors wZ, vD, vZ and rZ are defined analogously. The ordering of the elements of wD,vD, andrD
is not important, as long as it is the same for all vectors. The transfer function matrix between wD andwj is denoted G0jD. The other transfer function matrices are defined analogously.
By this notation, the network equations (2) are rewritten as:
wj
wD
wZ
=
0 G0jD G0jZ GD0j GDD0 GDZ0 GZ0j GZD0 GZZ0
wj
wD
wZ
+
vj
vD
vZ
+
rj
rD
rZ
, (3) whereGDD0 andGZZ0 have zeros on the diagonal.
III. PREDICTIONERRORIDENTIFICATION AND
EXTENSION TODYNAMICNETWORKS
In this section, the prediction-error framework is presented with a focus on using the techniques in a network setting.
It is an identification framework based on the one-step-ahead predictor model [21].
1Throughout this paper r uncorrelated to v will mean that the cross- correlation functionRrv(τ)is zero for allτ.
A. Prediction Error Identification
Letwjdenote the variable which is to be predicted, i.e. it is the output of the module of interest. Thepredictor inputsare those (known) variables that will be used to predict wj. The sets Dj andPj are used to denote the sets of indices of the internal and external variables respectively that are chosen as predictor inputs -wk is a predictor input iff k∈ Dj, andrk
is a predictor input iffk ∈ Pj. The one-step-ahead predictor for wj is then [21]:
ˆ
wj(t|t−1, θ) =Hj−1(q, θ) X
k∈Dj
Gjk(q, θ)wk(t)
+X
k∈Pj
Fjk(q, θ)rk(t) +
1−Hj−1(q, θ)
wj(t) (4) where Hj(q, θ) is a monic noise model, Gjk(θ) models the dynamics between wk to wj,k ∈ Dj, and Fjk(q, θ)models the dynamics between rk towj, k∈ Pj. The importance of includingFjk(q, θ)will become evident later in the paper. The prediction error is then:
εj(t, θ) =wj(t)−wˆj(t|t−1, θ)
=Hj(θ)−1
wj−X
k∈Dj
Gjk(θ)wk−X
k∈Pj
Fjk(θ)rk
(5) where arguments q and t have been dropped for notational clarity. The parameterized transfer functionsGjk(θ),k∈ Dj, Fjk(θ),k∈ Pj, and Hj(θ)are estimated by minimizing the sum of squared (prediction) errors:
Vj(θ) = 1 N
N−1
X
t=0
ε2j(t, θ). (6) where N is the length of the data set. Let θˆN denote the minimizer of (6). Under standard (weak) assumptions ([21]) θˆN →θ∗ with probability 1 asN → ∞where
θ∗= arg min
θ∈Θ
E¯[ε2j(·, θ)] and E¯ := lim
N→∞
1 N
N−1
X
t=0
E,
and E is the expected value operator [21]. The function E¯[ε2j(t, θ)]is denoted V¯j(θ). IfGjk(q, θ∗) =G0jk the module transfer is estimated consistently.
As in closed-loop identification, identification in networks may have the problem that the disturbance affecting the “out- put”wjis correlated to one or more of the predictor inputs. In the closed-loop identification literature several methods have been developed to deal with this problem such as the Direct and Two Stage Methods [22], [23], [24]. Both methods have been extended to a network setting [17]. Generalizations of both methods to allow for a flexible choice of predictor inputs are presented in the following subsections.
B. The Direct Method
The Direct Method for identifyingG0ji(q)is defined by the following algorithm.
Algorithm 1:Direct Method.
1. Selectwj as the output variable to be predicted.
2. Choose the internal and external variables to include as predictor inputs (choose Dj andPj).
3. Construct the predictor (4).
4. Obtain estimates Gjk(q,θˆN),k∈ Dj,Fjk(q,θˆN),k∈ Pj andHj(q,θˆN) by minimizing the sum of squared predic- tion errors (6).
In [17] Step 2 of the algorithm is replaced by a fixed choice, namely,Dj=Nj, andPj=∅.
C. Two Stage Method
In the Two Stage Method, the predictor inputs are not internal variables, but projections of internal variables. The projection of wk onto an external variable rk is defined as follows. Any variable wk can be written as:
wk = X
m∈Rk
Fkm0 rm+ X
m∈Vk
Hkm0 vm. (7)
whereFkm0 andHkm0 are proper stable transfer functions. Let wk(rm):=Fkm0 rm. The termw(rkm)is the projection ofwkonto causally time shifted versions ofrm(referred to as simply the projection ofwkontorm). If there are more external variables available, then wk can be projected onto a set of external variablesrm,m∈ Tj, which is denoted by
wk(Tj):= X
m∈Tj
w(rkm)= X
m∈Tj
Fkm0 rm. (8)
An estimate of wk(Tj) can be obtained by estimating Fkm0 , m∈ Tj (using a Prediction-Error Method for instance) using a parametrized model Fkm(q, γ) with γ a parameter vector, resulting in an estimated model Fkm(q,γˆN). This model is used to generate the simulated signal:
ˆ
wk(Tj)(ˆγN) = X
m∈Tj
Fkm(q,ˆγN)rm(t).
The Two Stage Method is defined as follows.
Algorithm 2: Two Stage Method.
1. Select wj as the output variable to be predicted.
2. Choose the external variables to project onto (choose Tj).
3. Choose the internal and external variables to include as predictor inputs (choose Dj andPj).
4. Obtain estimates wˆ(kTj) of wk(Tj) for eachk∈ Dj. 5. Construct the predictor
ˆ
wj(t|t−1, θ) = X
k∈Dj
Gjk(θ) ˆwk(Tj)+ X
k∈Pj
Fjk(θ)rk. (9)
6. Obtain estimates Gjk(q,θˆN),k∈ Dj andFjk(q,θˆN),k∈ Pjby minimizing the sum of squared prediction errors (6).
This algorithm is a generalization of the one in [17].
Remark 1: In Step 5 of the algorithm a noise model is optional. For simplicity it is not included in (9).
IV. CONSISTENT IDENTIFICATION ON THE BASIS OF A SUBSET OF PREDICTOR INPUT VARIABLES
When only a subset of all node variables in a network is available from measurements, a relevant question becomes:
what are the dynamical relationships between the nodes in this subset of measured variables? In Section IV-A it is shown that when only a selected subset of internal variables is considered, the dynamic relationships between these variables can be described by an immersed network. Several properties of the immersed network are investigated. In Section IV-B it is shown under which conditions the dynamics that appear between two internal variables remain invariant when reducing the original network to the immersed one. In Section IV-C the results of identification in networks are characterized. It is shown that it is the dynamics of the modules in the immersed network that are being identified, and conditions for consistency of general identification results are formulated. The results presented in this section are independent of an identification method.
A. The Immersed Network
In this subsection, we show that there exists a unique dynamic network consisting only of a given subset of internal variables, that still exactly describes the dynamics between the selected variables. Moreover, we show that this network can be constructed by applying an algorithm from graph theory for constructing an immersed graph. Given the selected variables wk, k∈ {j} ∪ Dj, the remaining variables wn, n∈ Zj are sequentially removed from the network.
The following proposition shows that there is a unique char- acterization of the dynamics between the selected variables.
Proposition 1: Consider a dynamic network as defined in Section II-A that satisfies Assumption 1. Consider the set of internal variables{wk},k∈ Dj∪ {j}. There exists a network:
wj(t) wD(t)
= ˘G0(q,Dj) wj(t)
wD(t)
+ ˘F0(q,Dj)
rj(t)+vj(t) rD(t)+vD(t) rZ(t)+vZ(t)
, (10) where G˘0 and F˘0 are unique transfer matrices of the form (using a notation analogous to that of (3)):
G˘0=
"
0 G˘0jD G˘D0j G˘DD0
#
and F˘0=
F˘jj0 0 F˘j0Z 0 F˘DD0 F˘DZ0
, (11) whereG˘DD0 has zeros on the diagonal,F˘DD0 is diagonal, and if there is an index ` such that both v` andr` are not present, then the corresponding column ofF˘0 is set to all zeros.
See Appendix X-A for the proof. Proposition 1 is in line with the result of [25] where conditions have been formulated for the existence of a unique interconnection matrix G˘0 on the basis of a transfer function from external inputs to node signals. The conditions in [25] typically reflect that enough entries of G˘0 and F˘0 are known (or set to zero as in our case). Proposition 1 is formulated for a particular structure, that matches with our dynamic network setup. EnforcingG˘0 to have zeros on the diagonal results in a network that does not have any “self-loops”, i.e. no paths that enter and leave the same node. This matches the assumptions imposed on the data
w1
w2
w3
w4
w5
w6
v1
v2
v3
v4
v5
v6
lift paths through
w3
w1
w2
w3
w4
w5
w6
v1
v2
v3
v4
v5
v6
lift paths through
w6
w1
w2
w3
w4
w5
w6
v1
v2
v3
v4
v5
v6
Fig. 2. Example of constructing an immersion graph. In Step 1 internal variable w3 is removed, and in step 2 variable w6. Edges between w’s have been emphasized in thick black lines since these connections define the interconnection structure of the corresponding dynamic network.
generating system (2). Enforcing the leading square matrix of F˘0 to be diagonal results in a network where each rk, k ∈ Dj∪ {j} only has a path to the corresponding internal variable wk (matching the interconnection structure of (2)).
The effect of the remaining external variables is encoded in Fj0Z andFDZ0 without any pre-defined zero entries.
Denote the noise in (10) as:
v˘j
˘ vD
=
F˘jj0 0 0 F˘DD0
vj vD
+
F˘j0Z F˘DZ0
vZ. (12) Then by the Spectral Factorization Theorem [26], there exists a unique, monic, stable, minimum phase spectral factorH˘0:
˘vj
˘ vD
=
"
H˘jj0 H˘j0D H˘D0j H˘DD0
#
˘ ej
˘ eD
. (13)
where[˘ej ˘eDT]T is a white noise process.
In the following text it is shown that a network of the form (10) can be constructed using ideas from graph theory.
In graph theory, one way to remove nodes from a graph is by constructing animmersedgraph. A graph G0 is an immersion of G if G0 can be constructed from G by lifting pairs of adjacent edges and then deleting isolated nodes [27]. Lifting an edge is defined as follows. Given three adjacent nodes a, b,c, connected by edgesab andbc, the lifting of path abcis defined as removing edgesabandbcand replacing them with the edge ac. In Fig. 2 an immersed graph of the network of Example 1 is constructed by first removing the node w3 and connecting v3 to w2 andw6, and subsequently removing w6 and connecting v6 tow5 andw4.
In this way an immersed network can be constructed by an algorithm that manipulates the dynamics of the network iteratively. To keep track of the changes in the transfer functions iteratively, let G(i)mn and Fmn(i) denote the transfer functions of the direct connections wn to wm and from rn
andvn towm, respectively, at iterationi of the algorithm.
Algorithm 3: Constructing an immersed network.
1. Initialize. Start with the original network:
• G(0)mn=G0mn for allm, n∈ {1, . . . , L}, and
• Fkk(0)= 1, for allk∈ R ∪ V,Fmn(0) = 0otherwise.
2. Remove eachwk,k∈ Zj from the network, one at a time.
Letd=card(Zj). LetZj={k1, . . . , kd}.
fori= 1 :d
(a) Let Iki denote the set of internal variables with edges to wki. Let Oki denote the set of nodes with edges from wki. Lift all paths wn → wki →wm, n∈ Iki, m∈ Oki. The transfer function of each new edge from wn →wm isG(i)mn=G(i−1)mk
i G(i−1)k
in . (b) LetIkr
i denote the set of external or disturbance vari- ables with edges towki. Lift all paths rn → wki → wm,n∈ Ikr
i,m∈ Oki. The transfer function for each new edge fromrn→wmis Fnm(i) =Fnk(i−1)
i G(i−1)k
in . (c) If there are multiple edges between two nodes, merge
the edges into one edge. The transfer function of the merged edge is equal to the sum of the transfer functions of the edges that are merged.
(d) remove the nodewki from the network.
end
3. Remove all self-loops from the network. If node wm has a self loop, then divide all the edges enteringwm by(1− G(d)mm(q))(i.e. one minus the loop transfer function).
Let G˘i0 and F˘i0 denote the final transfer matrices of the immersed network.
Remark 2:Algorithm 3 has a close connection to Mason’s Rules [28], [29]. However, Mason was mainly concerned with the calculation of the transfer function from thesources (external and noise variables) to asink(internal variable). This is equivalent to obtaining the immersed network withDj =∅, i.e. all internal variables except one are removed. Importantly, Algorithm 3 is an iterative algorithm which allows for easy implementation (even for large networks), whereas Mason’s rules are not iterative and complicated even for small networks.
w1
G021
w2
G023 G032
w3
G063
w6
G056
G065
w5
G054 w4
G046
G014
G052 v1
v2
v3
v4
v5
v6
w1
G˘021
w2
G˘052 w5
G˘054 G˘045 w4 G˘014
G˘042
˘ v1
˘ v2
˘ v4
˘ v5
(a) (b)
Fig. 3. (a)Original dynamic network considered in Example 2.(b)Immersed network withw3andw6removed.
Example 2:Consider the dynamic network shown in Figure 3a. The graph of this network is shown in the first graph of Fig.
2. Supposew3andw6are to be removed from the network (i.e.
Zj={3,6}). By Algorithm 3 the network shown in Figure 3b
results. The transfer functions of the immersed network are:
G˘i0(q,Dj)=
0 0 G014 0
G021
1−G023G032 0 0 0
0 G032G046G063 0 G046G065 0 G0521−G+G0560G063G032
56G065
G054
1−G056G065 0
F˘i0(q,Dj)=
1 0 0 0 0 0
0 1−G10
23G032 0 0 1−GG0023
23G032 0 0 0 1 0 G046G063 G046 0 0 0 1−G10
56G065
G056G063 1−G056G065
G056 1−G056G065
.
Note that the immersed network (shown in Figure 3b) is represented by the last graph shown in Figure 2.
Interestingly, the matrix F˘i0 in Example 2 has the same structure as that of F˘0 in Proposition 1. This alludes to a connection between the network characterized in (10) and immersed networks as defined by Algorithm 3.
Proposition 2: The matrices G˘0 and F˘0 of the network characterized by (10) and the matrices G˘i0 andF˘i0 defined
by Algorithm 3 are the same.
The proof is in Appendix X-B. Since, by Proposition 2 the matrices in (10) are the same as those of the immersed network, the superscript i will be dropped from this point on in the matrices defined by Algorithm 3. An important consequence of Proposition 2 is that (by Proposition 1) the immersed network is unique.
Instead of calculating the matrices of the immersed network iteratively, it is also possible to derive analytic expressions for the matrices G˘0andF˘0.
Proposition 3: Consider a dynamic network as defined in (2) that satisfies Assumption 1. For a given set {j} ∪ Dj the transfer function matricesG˘0andF˘0of the immersed network are:2
"
0 G˘0jD G˘D0j G˘DD0
#
= 1−G˜jj
I−diag( ˜GDD0 )
−1 0 G˜0jD G˜D0j G˜DD0 −diag( ˜GDD0 )
F˘jj0 0 F˘j0Z 0 F˘DD0 F˘DZ0
= 1−G˜jj
I−diag( ˜GDD0 ) −1
1 0 F˜j0Z 0 I F˜DZ0
where G˜jj G˜jD
G˜Dj G˜DD
=
0 G0jD GD0j GDD0
+ G0jZ
GDZ0
(I−GZZ0 )−1
GZ0j GZD0 ,
F˜jZ
F˜DZ
= G0jZ
GDZ0
(I−GZZ0 )−1.
The proof is in Appendix X-C. The transfer functionsG˜mn correspond to G(d)mn in Step 3 of Algorithm 3.
The immersed network inherits some useful properties from the original network.
Lemma 1: Consider a dynamic network as defined in (2) that satisfies Assumption 1 and a given set {j} ∪ Dj. 1. Consider the paths from wn to wm, n, m∈ Dj that pass
only through nodesw`,`∈ Zj in the original network. If
2The argumentsq orDj (or both) of G˘0jk(q,Dj) andF˘jk0(q,Dj)are sometimes dropped for notational clarity.
w1
G021 G012 w2
G031 w3
G023
G041 w4
G054 w5 G025
G015
v1 r1
v2
v3
v4
v5
r5
Fig. 4. Network analyzed in Examples 3 and 7.
all these paths andG0mn(q) have a delay (are zero), then G˘0mn(q,Dj)has a delay (is zero).
2. Consider the paths fromrn towm(orvn towm),n∈ Zj, m∈ Dj. If all these paths pass through at least one node w`,`∈ Dj thenF˘mn0 (q,Dj) = 0.
For a proof see Appendix X-D.
B. Conditions to EnsureG˘0ji(q,Dj) =G0ji(q)
A central theme in the previous section was that the transfer function G˘0ji(Dj) in the immersed network may not be the same as the transfer function G0ji in the original network.
In other words, by selecting a subset of internal variables to be taken into account, the dynamics between two internal variables might change. In this section conditions are presented under which the module of interest,G0ji, remains unchanged in the immersed network, i.e.G˘0ji(q,Dj) =G0ji(q).
The following two examples illustrate two different phe- nomena related to the interconnection structure that can cause the dynamicsG˘0ji(q,Dj)to be different fromG0ji(q).
Example 3:Consider the dynamic network
w1
w2 w3 w4 w5
=
0 G012 0 0 G015 G021 0 G023 0 G025 G031 0 0 0 0 G041 0 0 0 0
0 0 0 G054 0
w1
w2 w3 w4 w5
+
v1+r1
v2 v3 v4 v5+r5
.
shown in Fig. 4. The objective of this example, is to chooseD2
such that in the immersed network G˘021(D2) =G021 (denoted in green). A key feature of the interconnection structure in this example is that there are multiple paths fromw1 tow2: w1→w2,w1→w3→w2,w1→w4→w5→w2, etc..
Start by choosingD2={1}, then by Proposition 3, G˘021(q,{1}) =G021(q)+G023(q)G031(q)+G025(q)G054(q)G041(q).
Two of the terms comprising this transfer function correspond to the two paths fromw1tow2that pass only throughwk,k∈ Z2 (Z2={3,4,5}). From Algorithm 3 this is not surprising since the paths G023G031 and G025G054G041 must be lifted to remove the nodesw3, w4 andw5 from the original network.
Clearly, for this choice ofD2,G˘021(D2)6=G021. Now choose D2={1,5}. By Proposition 3
G˘021(q,{1,5}) =G021(q) +G023(q)G031(q).
Again, one of the terms comprisingG˘021(q,{1,5})corresponds to the (only) path fromw1tow2that passes only throughwk, k∈ Z2 (Z2={3,4}).
Finally, choose D2 = {1,3,5}. By Proposition 3 G˘021(q,{1,3,5}) =G021(q)as desired. Note that for this choice of D2 every path exceptG021 from w2 tow1 is ”blocked” by
a node inD2.
In general, one internal variable wk from every indepen- dent path wi to wj must be included in Dj to ensure that G˘0ji(q,Dj) =G0ji(q). This is proved later in Proposition 4.
However, before presenting the proposition, there is a sec- ond phenomenon related to the interconnection structure of the network that can cause the dynamicsG˘0ji(q,Dj)to be different from G0ji(q), as illustrated in the next example.
w1
G021
G012 w2
G032
G023 w3
v1
r1
v2
v3
Fig. 5. Network that is analyzed in Example 4.
Example 4: Consider the network shown in Fig. 5. The objective of this example, is to choose D2 such that in the immersed network G˘021(D2) =G021(denoted in green).
Note that in this network there is only one independent path from w1 tow2. ChooseD2={1}. By Proposition 3
G˘021(q,{1}) = G021(q) 1−G023(q)G032(q)
which is not equal toG021(q)as desired. The reason the factor
1
1−G023G032 appears is because when lifting the pathG23G32a self-loop fromw2tow2results. Thus, in step 3 of Algorithm 3 the transfer functions of the edges coming intow2are divided by the loop transfer function.
For the choice D2={1,3}, G˘021({1,3}) =G021 as desired.
Note that in for this choice of D2, all paths fromw2 to w2
are ”blocked” by a node in D2.
In general, ifDj is chosen such that no self-loops fromwj towj result due to the lifting of the paths when constructing the immersed network, the denominator in Step 3 of Algorithm 3 is reduced to 1. From these two examples we see that:
• Every parallel path fromwi towj should run through an input in the predictor model, and
• Every loop on the outputwj should run through an input in the predictor model.
This is formalized in the following proposition.
Proposition 4: Consider a dynamic network as defined in Section II-A that satisfies Assumption 1. The transfer function G˘0ji(q,Dj)in the immersed network is equal to G0ji(q)ifDj
satisfies the following conditions:
(a) i∈ Dj,j /∈ Dj,
(b) every pathwitowj, excluding the pathG0ji, goes through a nodewk,k∈ Dj,
(c) every loop wj towj goes through a nodewk,k∈ Dj. The proof is in Appendix X-E. The formulated conditions are used to make appropriate selections for the node variables that are to be measured and to be used as predictor inputs. In the following section it is shown that it is possible to identify the dynamics of the immersed network.
C. Estimated Dynamics in Predictor Model
In this section it is shown that the estimated dynamics between the predictor inputs and the module output wj, are equal to G˘0jk(Dj). The result confirms that the estimated dynamics are a consequence of the interconnection structure and the chosen predictor inputs. In addition conditions are presented that ensure that the estimates of G˘0jk(Dj) are consistent. The results in this section are not specific to a particular identification method.
To concisely present the result, it is convenient to have a notation for a predictor which is a generalization of both the Direct and Two Stage Methods. Consider the predictor
ˆ
wj(t|t−1, θ) =Hj−1(q,θ) X
k∈Dj
Gjk(q,θ)w(X)k (t)
+ X
k∈Pj
Fjk(q,θ)rk(t) +
1−Hj−1(q,θ)
wj(t) (14) where X denotes a (sub)set of the variables rk, vk, k ∈ {1, . . . , L}. Note that both predictors (4) and (9) are special cases of the predictor (14). For the Direct Method, choose X = {rk1, . . . , rkn, v`1, . . . , v`n}, where {k1, . . . , kn} = R, and {`1, . . . , `n} = V. Then wk(X) =wk. For the Two Stage Method, choose X ={rk1, . . . , rkn}, where {k1, . . . , kn} = Tj.
A key concept in the analysis of this section is theoptimal output error residual, which will be discussed next. From (10), wj can be expressed in terms ofwk,k∈ Dj as
wj=X
k∈Dj
G˘0jkwk+X
k∈Zj∩Rj
F˘jk0rk+X
k∈Zj∩Vj
F˘jk0vk+vj+rj. (15)
Note that by Lemma 1 someF˘jk0(q,Dj)may be zero depend- ing on the interconnection structure. Letwk be expressed in terms of a component dependent on the variables in X, and a component dependent on the remaining variables, denoted wk=w(X)k +w(⊥X)k . In addition, split the sum involving the rk-dependent terms according to whether rk is in Pj or not.
Then, from (15):
wj=X
k∈Dj
G˘0jkwk(X)+X
k∈Dj
G˘0jkw(⊥X)k +X
k∈Pj
F˘jk0rk
+ X
k∈((Zj∪{j})∩Rj)\Pj
F˘jk0rk+ X
k∈Zj∩Vj
F˘jk0vk+vj. (16)
When choosing an Output Error predictor (i.e.Hj(q, θ) = 1), with predictor inputs wk(X),k∈ Dj andrk,k∈ Pj, the part of (16) that is not modeled can be lumped together into one term. This term is theoptimal output error residualofwj, and is denotedpj:
pj(Dj) := X
k∈Dj
G˘0jkw(⊥Xk )+ X
k∈((Zj∪{j})∩Rj)\Pj
F˘jk0rk+ ˘vj, (17)
wherev˘jis given byP
k∈Zj∩Vj
F˘jk0vk+vjin accordance with (12). Consequently,wj equals:
wj = X
k∈Dj
G˘0jkw(X)k + X
k∈Pj
F˘jk0rk+pj. (18)