A DBN representation - The DART-Europe E-theses Portal

Let P = {1 ≤ i ≤ p} denote the set of observed genes and N = {1 ≤ t ≤ n} the set of observation times. In this paper, we consider a discrete-time stochastic process X = {X_t^i; i ∈ P, t ∈ N} taking real values and assume that the joint probability distribution P of the process X has a density f with respect to Lebesgue measure on R^{p×n}. We denote by X_t = {X_t^i; i ∈ P} the set of the p random variables observed at time t and by X_{1:t} = {X_s^i; i ∈ P, s ≤ t} the set of the random variables observed up to time t.

In this section, we give sufficient conditions under which the probability distribution P admits a BN representation according to a dynamic network (e.g. the one in Figure 3.2). The main result is stated in Proposition 5: we show that there exists a BN representation according to a minimal DAG G̃ whose edges describe exactly the set of direct dependencies between successive variables X_{t-1}^j, X_t^i given the past of the process. As an illustration, the minimal DAG G̃_AR(1) is given for the particular case of an AR(1) model in subsection 3.2.4. The main interest of a DBN representation is that conditional dependence relationships between the variables can be derived by using the graphical theory associated with DAGs. Note that, even though we need to consider a homogeneous DBN for the inference of gene interaction networks, the general framework (sections 3.2 and 3.3) is developed without assuming homogeneity.

Table 3.1: Notations

P = {1 ≤ i ≤ p}: set of observed genes
P_i = P\{i}
N = {1 ≤ t ≤ n}: time space
X = {X_t^i; i ∈ P, t ∈ N}: stochastic process (gene expression levels time series)
G = (X, E(G)): a DAG whose vertices are defined by X and edges by E(G) ⊆ X × X
G̃: the "true" DAG describing full order conditional dependencies
G^(q): q-th order conditional dependence DAG

3.2.1 Backgrounds

Let G = (X, E(G)) be a DAG whose vertices are the variables X = {X_t^i; i ∈ P, t ∈ N} and whose set of edges E(G) is a subset of X × X. We briefly recall here some elements of the theory of graphical models associated with DAGs [Lau96].

Definition 3 The parents of a vertex X_t^i in G, denoted by pa(X_t^i, G), are the variables having an edge pointing towards the vertex X_t^i in G,

\[ pa(X_t^i, G) := \{ X_s^j \text{ such that } (X_s^j, X_t^i) \in E(G);\ j \in P,\ s \in N \}. \]

Proposition 3 (BN representation [Pea88]) The probability distribution P of the process X admits a Bayesian Network representation according to DAG G whenever its density f factorizes as a product of the conditional densities of each variable X_t^i given its parents in G,

\[ f(X) = \prod_{i \in P} \prod_{t \in N} f\big(X_t^i \mid pa(X_t^i, G)\big). \]

Definition 4 (Moral graph) The moral graph G^m of a DAG G is obtained from G by first 'marrying' the parents (drawing an undirected edge between each pair of parents of each variable X_t^i) and then deleting the directions of the original edges of G.

As an illustration, Figure 3.3 shows the moral graph of the DAG in Figure 3.2.
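The moralization step of Definition 4 can be sketched in a few lines of Python. This is only an illustration: the function name `moral_graph`, the (gene, time) encoding of the variables and the three-vertex DAG are assumptions, not the DAG of Figure 3.2.

```python
from itertools import combinations

def moral_graph(vertices, edges):
    """Return the undirected edges of the moral graph G^m of a DAG G."""
    # Drop the directions of the original edges.
    undirected = {frozenset(edge) for edge in edges}
    # Marry the parents: link every pair of parents of every vertex.
    for v in vertices:
        pa = sorted(src for (src, dst) in edges if dst == v)
        undirected |= {frozenset(pair) for pair in combinations(pa, 2)}
    return undirected

# Hypothetical DAG: X_1^1 -> X_2^1 <- X_1^2, variables encoded as (gene, time).
vertices = {(1, 1), (2, 1), (1, 2)}
edges = {((1, 1), (1, 2)), ((2, 1), (1, 2))}

moral = moral_graph(vertices, edges)
print(frozenset({(1, 1), (2, 1)}) in moral)  # True: the two parents are married
print(len(moral))  # 3
```

The result does not depend on whether directions are dropped before or after marrying the parents, since marrying only adds undirected edges.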

Definition 5 (Ancestral set) A subset S of vertices is ancestral if and only if, for all α ∈ S, the parents of α satisfy pa(α, G) ⊆ S. Hence, for any subset S of vertices, there is a smallest ancestral set containing S, which is denoted by An(S). Then G_An(S) refers to the subgraph of G induced by the smallest ancestral set An(S). See Figure 3.4 for an illustration.
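To make Definition 5 concrete, here is a short Python sketch that computes An(S) by repeatedly adding parents until the set is closed. The function name `smallest_ancestral_set`, the (gene, time) encoding and the example edges are hypothetical choices for illustration.

```python
def smallest_ancestral_set(subset, edges):
    """Return An(S): S together with all ancestors of its vertices."""
    ancestral = set(subset)
    to_visit = list(subset)
    while to_visit:
        v = to_visit.pop()
        for (src, dst) in edges:
            if dst == v and src not in ancestral:
                ancestral.add(src)
                to_visit.append(src)
    return ancestral

# Hypothetical DAG: chain X_1^1 -> X_2^1 -> X_3^1, plus X_1^2 -> X_2^2.
edges = {((1, 1), (1, 2)), ((1, 2), (1, 3)), ((2, 1), (2, 2))}

print(sorted(smallest_ancestral_set({(1, 3)}, edges)))
# [(1, 1), (1, 2), (1, 3)]
```

The second chain of the toy DAG is left out because none of its vertices is an ancestor of X_3^1.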

Figure 3.3: Moral graph of the DAG in Figure 3.2. For all t > 1, the parents of the variable X_t^1 are "married", that is, connected by an undirected edge.

Throughout this paper, a central notion is that of conditional independence of random variables. Let P_{U,V,W} be the joint distribution of three random variables (U, V, W). We say that U is conditionally independent of V given W under P_{U,V,W}, and write U ⊥⊥ V | W, whenever the variable U does not depend on V once W is given, when considering the joint distribution P_{U,V,W}. This definition extends to disjoint sets of variables. Such conditional independence relationships can be read off a BN representation by using the graphical theory associated with DAGs. Most of the results are based on the next proposition, which is derived from the directed global Markov property [Lau96].

Proposition 4 (Lauritzen [Lau96], Corollary 3.23) Let P admit a BN representation according to G. Then,

\[ E \perp\!\!\!\perp F \mid S, \]

whenever all paths from E to F intersect S in (G_An(E∪F∪S))^m, the moral graph of the smallest ancestral set containing E ∪ F ∪ S. We then say that S separates E from F.
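The separation criterion of Proposition 4 can be checked mechanically: restrict the DAG to the smallest ancestral set, moralize, and search for a path from E to F that avoids S. The Python sketch below does this under stated assumptions: variables are encoded as (gene, time) pairs, all function names are illustrative, and the DAG is a hypothetical three-gene example, not the DAG of Figure 3.2.

```python
from collections import deque
from itertools import combinations

def ancestral_set(seed, edges):
    """Smallest ancestral set containing seed."""
    result, stack = set(seed), list(seed)
    while stack:
        v = stack.pop()
        for src, dst in edges:
            if dst == v and src not in result:
                result.add(src)
                stack.append(src)
    return result

def moral_adjacency(nodes, edges):
    """Adjacency sets of the moral graph of the DAG restricted to nodes."""
    adj = {v: set() for v in nodes}
    kept = [(s, d) for s, d in edges if s in nodes and d in nodes]
    for s, d in kept:
        adj[s].add(d)
        adj[d].add(s)
    for v in nodes:
        pa = sorted(s for s, d in kept if d == v)
        for a, b in combinations(pa, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def separates(S, E, F, edges):
    """True if every path from E to F meets S in the moralized ancestral graph."""
    S, E, F = set(S), set(E), set(F)
    adj = moral_adjacency(ancestral_set(E | F | S, edges), edges)
    seen = E - S
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        if v in F:
            return False
        for w in adj[v] - S - seen:
            seen.add(w)
            queue.append(w)
    return True

# Hypothetical gene-level edges (j, i), meaning X_{t-1}^j -> X_t^i, over 3 time points.
gene_edges = {(1, 1), (2, 1), (3, 2), (3, 3)}
edges = {((j, t - 1), (i, t)) for t in (2, 3) for (j, i) in gene_edges}

E, F = {(1, 3)}, {(3, 2)}
print(separates({(1, 2), (2, 2)}, E, F, edges))  # True: the parents of X_3^1 block all paths
print(separates({(1, 2)}, E, F, edges))          # False: a path through X_2^2 remains
```

In this toy example, conditioning on both parents of X_3^1 separates it from X_2^3, whereas conditioning on X_2^1 alone leaves the path X_3^1, X_2^2, X_1^3, X_2^3 open.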

3.2.2 Sufficient conditions for a DBN representation

Assumption 1 The stochastic process X_t is first-order Markovian,

\[ \forall t \ge 3, \quad X_t \perp\!\!\!\perp X_{1:t-2} \mid X_{t-1}. \]

Assumption 2 For all t ≥ 1, the random variables {X_t^i}_{i∈P} are conditionally independent given the past of the process X_{1:t-1}, that is,

\[ \forall t \ge 1,\ \forall i \neq j, \quad X_t^i \perp\!\!\!\perp X_t^j \mid X_{1:t-1}. \]

We first assume that the observed process X_t is first-order Markovian (Assumption 1). That is, the expression level of a gene at a given time t depends on the past only through the expression levels at the previous time t−1.

Then we assume that the variables observed simultaneously are conditionally independent given the past of the process (Assumption 2). In other words, we consider that the time measurements are close enough that a gene expression level X_t^i measured at time t is better explained by the expression levels X_{t-1} at the previous time than by some current expression level X_t^j.

From Assumptions 1 and 2, we establish in Lemma 1 the existence of a DBN representation of the distribution P according to the DAG G^full, which contains all the edges pointing from a variable observed at some time t−1 towards a variable observed at the next time t. Since every edge is directed forward in time, G^full is acyclic.

Lemma 1 Under Assumptions 1 and 2, the probability distribution P admits a DBN representation according to a DAG whose edges only join nodes representing variables observed at two successive time points, in particular according to the DAG G^full = (X, {(X_{t-1}^j, X_t^i)}_{i,j∈P, t>1}), which has an edge between every pair of successive variables.

Proof. From Assumption 1, the density f of the joint probability distribution of the process X can be written as the product of conditional densities,

\[ f(X) = f(X_1) \prod_{t=2}^{n} f(X_t \mid X_{t-1}). \qquad (3.2) \]

From Assumption 2, for all t > 1, the conditional density f(X_t | X_{t-1}) can be written as the product of the conditional densities of each variable X_t^i given the set of variables X_{t-1} observed at the previous time,

\[ f(X_t \mid X_{t-1}) = \prod_{i \in P} f(X_t^i \mid X_{t-1}). \qquad (3.3) \]

From equations (3.2) and (3.3), the density f can be written as the product of the conditional densities of each variable X_t^i given its parents in G^full. By Proposition 3, the probability distribution P therefore admits a BN representation according to G^full.
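For concreteness, the edge set of G^full can be enumerated directly from its definition in Lemma 1. The sketch below is illustrative only: the function name `full_dag_edges` and the (gene, time) encoding of the variables are assumptions.

```python
def full_dag_edges(p, n):
    """Edges of G^full: X_{t-1}^j -> X_t^i for all i, j in P and t > 1."""
    genes = range(1, p + 1)
    return {
        ((j, t - 1), (i, t))
        for t in range(2, n + 1)
        for i in genes
        for j in genes
    }

edges = full_dag_edges(p=3, n=4)
print(len(edges))  # 27, that is p * p * (n - 1)
# Every edge goes from time t-1 to time t, so no directed cycle can exist.
print(all(dst[1] == src[1] + 1 for (src, dst) in edges))  # True
```

The count p²(n−1) grows quadratically with the number of genes, which gives a sense of how many candidate edges the minimal DAG is selected from.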

Figure 3.4: Moral graph of the smallest ancestral set containing the variable X_{t+1}^1, its parents (X_t^1, X_t^2) in the DAG in Figure 3.2, and X_t^3. As the set (X_t^1, X_t^2) blocks all paths between X_t^3 and X_{t+1}^1, we have X_{t+1}^1 ⊥⊥ X_t^3 | (X_t^1, X_t^2).

3.2.3 Minimal DAG G̃

Lemma 2 Assume the joint probability distribution P of the process X has density f with respect to Lebesgue measure on R^{p×n}. If P factorizes according to two different subgraphs of G^full, G^1 and G^2, then P factorizes according to G^1 ∩ G^2.

Lemma 3 (Conditional independence between non-adjacent successive variables) Let G be a subgraph of G^full according to which the probability distribution P admits a BN representation. For any pair of successive variables (X_{t-1}^j, X_t^i) which are non-adjacent in G, we have

\[ X_t^i \perp\!\!\!\perp X_{t-1}^j \mid pa(X_t^i, G) \quad \text{and} \quad X_t^i \perp\!\!\!\perp X_{t-1}^j \mid pa(X_t^i, G) \cup S, \]

for every subset S of {X_u^k; k ∈ P, u < t}.

The proofs of these two lemmas are given in the Appendix. As an illustration of Lemma 3, assume that P admits a BN representation according to the DAG of Figure 3.2. There is no edge between X_t^3 and X_{t+1}^1 in this DAG. Now consider, in Figure 3.4, the moral graph of the smallest ancestral set containing X_t^3, X_{t+1}^1 and the parents (X_t^1, X_t^2) of X_{t+1}^1. The set (X_t^1, X_t^2) blocks all paths between X_t^3 and X_{t+1}^1. By Proposition 4, we have X_{t+1}^1 ⊥⊥ X_t^3 | pa(X_{t+1}^1, G).

It follows directly from Lemma 2 that, among the DAGs included in G^full, there exists a minimal DAG, denoted by G̃, according to which the probability distribution P factorizes. From Lemma 3, the set of edges of G̃ is exactly the set of full order conditional dependencies given the past of the process, as stated in the next proposition.

Let P_j = P\{j}. We denote by X_t^{P_j} = {X_t^k; k ∈ P_j} the set of the p−1 variables other than X_t^j observed at time t.

Proposition 5 (BN representation according to G̃, the smallest subgraph of G^full) Whenever Assumptions 1 and 2 are satisfied, the probability distribution P admits a BN representation according to the DAG G̃ whose edges describe exactly the full order conditional dependencies between successive variables X_{t-1}^j and X_t^i given the remaining variables X_{t-1}^{P_j} observed at time t−1,

\[ \tilde{G} = \Big( X, \big\{ (X_{t-1}^j, X_t^i) \text{ such that } X_t^i \not\perp\!\!\!\perp X_{t-1}^j \mid X_{t-1}^{P_j} \big\}_{i,j \in P,\ t>1} \Big). \]

Moreover, the DAG G̃ is the smallest subgraph of G^full according to which P admits a BN representation.

See the proof in the Appendix. In the DAG G̃, the set of parents pa(X_t^i, G̃) of each variable X_t^i is the smallest subset of X_{t-1} such that the conditional densities satisfy f(X_t^i | pa(X_t^i, G̃)) = f(X_t^i | X_{t-1}). The parents of a variable can thus be seen as the only variables on which this variable depends directly. So G̃ is the DAG we want to infer in order to recover potential regulation relationships from gene expression time series. From Lemma 3, any pair of successive variables (X_{t-1}^j, X_t^i) which are non-adjacent in G̃ are conditionally independent given the parents of X_t^i,

\[ X_t^i \perp\!\!\!\perp X_{t-1}^j \mid pa(X_t^i, \tilde{G}). \]

We will make use of this result in section 3.3 in order to define low order conditional independence DAGs for the inference of G̃.
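As a hedged illustration of what the minimal DAG looks like in a simple case, consider a Gaussian AR(1) model X_t = A X_{t-1} + ε_t whose noise components are mutually independent (diagonal noise covariance). Under these assumptions, X_t^i depends directly on X_{t-1}^j exactly when the coefficient a_ij is nonzero, so the parents in the minimal DAG can be read off the matrix A. The matrix below and the function name `minimal_parents` are made up for this sketch; the source treats the AR(1) case in subsection 3.2.4, which is not reproduced here.

```python
# Hypothetical AR(1) coefficient matrix: A[i][j] is the weight of
# X_{t-1}^{j+1} in the conditional mean of X_t^{i+1} (0-based indices).
A = [
    [0.5, 0.3, 0.0],
    [0.0, 0.8, 0.0],
    [0.0, -0.4, 0.6],
]

def minimal_parents(A):
    """Map each gene i to the sorted list of genes j with A[i][j] != 0."""
    return {
        i + 1: [j + 1 for j, a_ij in enumerate(row) if a_ij != 0.0]
        for i, row in enumerate(A)
    }

print(minimal_parents(A))  # {1: [1, 2], 2: [2], 3: [2, 3]}
```

Because the model is homogeneous, the same gene-level parent sets apply at every time t > 1: here X_t^1 has parents X_{t-1}^1 and X_{t-1}^2, and there is no edge from X_{t-1}^3 to X_t^1.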
