
Acknowledgments

I would like to thank Catherine Matias and Bernard Prum for many stimulating and constructive discussions on this work.

3.6 Appendix: some additional proofs.

Proof of Lemma 2. Consider a discrete-time stochastic process $X = \{X_t^i;\ i \in P,\ t \in \mathbb{N}\}$ whose joint probability distribution $P$ has density $f$ with respect to Lebesgue measure on $\mathbb{R}^{p \times n}$.

Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be two different subgraphs of $\mathcal{G}^{full}$ according to which the joint probability distribution $P$ factorizes. Let $i \in P$ and $t \in \mathbb{N}$; we consider the random variable $X_t^i$.

We denote as follows:

• the subsets of $P$:
$pa_1 = \{j \in P;\ X_{t-1}^j \in pa(X_t^i, \mathcal{G}_1)\}$, $\overline{pa_1} = P \setminus pa_1$,
$pa_2 = \{j \in P;\ X_{t-1}^j \in pa(X_t^i, \mathcal{G}_2)\}$, $\overline{pa_2} = P \setminus pa_2$;

• the densities of the joint or marginal probability distributions of $(X_t^i, X_{t-1})$:
$g : \mathbb{R}^{p+1} \to \mathbb{R}$ the density of the joint probability distribution of $(X_t^i, X_{t-1})$,
$g_i$ the density of the probability distribution of $X_t^i$,
$g_P$ the density of the joint probability distribution of $X_{t-1}$,
$g_{i,pa_1}$ the density of the joint probability distribution of $(X_t^i, X_{t-1}^{pa_1}) = (X_t^i, pa(X_t^i, \mathcal{G}_1))$,
$g_{i,\overline{pa_2}}$ the density of the joint probability distribution of $(X_t^i, X_{t-1}^{\overline{pa_2}}) = (X_t^i, X_{t-1} \setminus pa(X_t^i, \mathcal{G}_2))$, and so on.

In the following, $y \in \mathbb{R}$, $x = (x_1, \dots, x_p) \in \mathbb{R}^p$, and we denote by $x_{pa_1} = \{x_j;\ j \in pa_1\} \in \mathbb{R}^{|pa_1|}$ (thus $x = (x_{pa_1}, x_{\overline{pa_1}}) = (x_{pa_2}, x_{\overline{pa_2}}) \in \mathbb{R}^p$). As the probability distribution $P$ factorizes according to $\mathcal{G}_1$, we derive from DAG theory the conditional independence

$$X_t^i \perp\!\!\!\perp X_{t-1}^{\overline{pa_1}} \mid X_{t-1}^{pa_1},$$

that is,

$$\forall y \in \mathbb{R},\ \forall x \in \mathbb{R}^p, \qquad \frac{g(y, x)}{g_P(x)} = \frac{g_{i,pa_1}(y, x_{pa_1})}{g_{pa_1}(x_{pa_1})}.$$

Equivalent results can be derived from the factorization according to $\mathcal{G}_2$, giving

$$\forall y \in \mathbb{R},\ \forall x \in \mathbb{R}^p, \qquad \frac{g_{i,pa_2}(y, x_{pa_2})}{g_{pa_2}(x_{pa_2})} = \frac{g_{i,pa_1}(y, x_{pa_1})}{g_{pa_1}(x_{pa_1})}.$$

By taking the integral with respect to $x_{pa_2 \cap \overline{pa_1}}$, we write, for all $y \in \mathbb{R}$ and all $x_{pa_1} \in \mathbb{R}^{|pa_1|}$,

$$\int g_{i,pa_2}(y, x_{pa_2})\, d(x_{pa_2 \cap \overline{pa_1}}) = \int \frac{g_{i,pa_1}(y, x_{pa_1})}{g_{pa_1}(x_{pa_1})}\, g_{pa_2}(x_{pa_2})\, d(x_{pa_2 \cap \overline{pa_1}}),$$

$$g_{i,pa_1 \cap pa_2}(y, x_{pa_1 \cap pa_2}) = \frac{g_{i,pa_1}(y, x_{pa_1})}{g_{pa_1}(x_{pa_1})}\, g_{pa_1 \cap pa_2}(x_{pa_1 \cap pa_2}).$$

Finally we have,

$$\forall y \in \mathbb{R},\ \forall x \in \mathbb{R}^p, \qquad \frac{g(y, x)}{g_P(x)} = \frac{g_{i,pa_1 \cap pa_2}(y, x_{pa_1 \cap pa_2})}{g_{pa_1 \cap pa_2}(x_{pa_1 \cap pa_2})},$$

that is, the conditional density of the probability distribution of $X_t^i$ given $X_{t-1}$ is the conditional density of the probability distribution of $X_t^i$ given $X_{t-1}^{pa_1 \cap pa_2}$. Then $P$ factorizes according to $\mathcal{G}_1 \cap \mathcal{G}_2$.
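To illustrate the lemma with a concrete instance (an example we add here, with our own numbers): take $p = 3$, $pa_1 = \{1, 2\}$ and $pa_2 = \{2, 3\}$. The two factorizations then imply that the conditional distribution of $X_t^i$ given $X_{t-1}$ depends only on $x_{pa_1 \cap pa_2} = x_2$, so $P$ also factorizes according to the graph in which $X_t^i$ has the single parent $X_{t-1}^2$.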


Proof of Lemma 3. Assume $P$ admits a BN representation according to $\mathcal{G}$, a subgraph of $\mathcal{G}^{full}$. Let $X_{t-1}^j$ and $X_t^i$ be two non-adjacent vertices of $\mathcal{G}$ (there is no edge between them in $\mathcal{G}$) and consider the moral graph $(\mathcal{G}_{An(X_t^i \cup X_{t-1}^j \cup pa(X_t^i, \mathcal{G}))})^m$ of the smallest ancestral set containing the variables $X_t^i$, $X_{t-1}^j$ and the parents $pa(X_t^i, \mathcal{G})$ of $X_t^i$ in $\mathcal{G}$. As the DAG $\mathcal{G}$ is a subgraph of $\mathcal{G}^{full}$, the set of parents $pa(X_t^i, \mathcal{G})$ blocks all paths between $X_{t-1}^j$ and $X_t^i$ in this moral graph. From Proposition 4, this establishes the conditional independence $X_t^i \perp\!\!\!\perp X_{t-1}^j \mid pa(X_t^i, \mathcal{G})$.

This result holds for the conditioning according to any subset $S \subseteq \{X_u^k;\ k \in P,\ u < t\}$.

Proof of Proposition 5.

First, we show that $P$ admits a BN representation according to $\tilde{\mathcal{G}}$. Let $i, j \in P$ be such that $X_t^i \perp\!\!\!\perp X_{t-1}^j \mid X_{t-1}^{P \setminus j}$; then we have

$$f(X_t^i \mid X_{t-1}) = f(X_t^i \mid X_{t-1}^{P \setminus j}).$$

Under Assumptions 1 and 2, from Lemma 1 and Proposition 3, $P$ admits a BN representation according to the DAG $(X, E(\mathcal{G}^{full}) \setminus (X_{t-1}^j, X_t^i))$, which has all the edges of $\mathcal{G}^{full}$ except the edge $(X_{t-1}^j, X_t^i)$. This holds for any pair of successive variables that are conditionally independent. Consequently, from Lemma 2, $P$ admits a BN representation according to the intersection of the DAGs $(X, E(\mathcal{G}^{full}) \setminus (X_{t-1}^j, X_t^i))$ over all pairs $(X_t^i, X_{t-1}^j)$ such that $X_t^i \perp\!\!\!\perp X_{t-1}^j \mid X_{t-1}^{P \setminus j}$, that is, according to the DAG $\tilde{\mathcal{G}}$.

Second, the DAG $\tilde{\mathcal{G}}$ cannot be reduced. Indeed, let $(X_{t-1}^l, X_t^k)$ be an edge of $\tilde{\mathcal{G}}$ and assume that $P$ admits a BN representation according to $\tilde{\mathcal{G}} \setminus (X_{t-1}^l, X_t^k)$, that is, $\tilde{\mathcal{G}}$ with the edge $(X_{t-1}^l, X_t^k)$ removed. From Lemma 3, we have $X_t^k \perp\!\!\!\perp X_{t-1}^l \mid X_{t-1}^{P \setminus l}$, which contradicts $(X_{t-1}^l, X_t^k) \in E(\tilde{\mathcal{G}})$ (i.e. $X_t^k \not\perp\!\!\!\perp X_{t-1}^l \mid X_{t-1}^{P \setminus l}$).

Proof of Proposition 7.

First, from Corollary 1, $\tilde{\mathcal{G}} \supseteq \mathcal{G}^{(1)}$.

Second, let $X$ be a Gaussian process and $(X_{t-1}^j, X_t^i) \in E(\tilde{\mathcal{G}})$; then, according to Proposition 5, $X_t^i \not\perp\!\!\!\perp X_{t-1}^j \mid X_{t-1}^{P \setminus j}$. Since $X$ is Gaussian, this implies $\mathrm{Cov}(X_t^i, X_{t-1}^j \mid X_{t-1}^{P \setminus j}) \neq 0$.

Now assume that there exists $k \neq j$ such that $X_t^i \perp\!\!\!\perp X_{t-1}^j \mid X_{t-1}^k$, i.e. $(X_{t-1}^j, X_t^i) \notin E(\mathcal{G}^{(1)})$. We are going to prove that this contradicts $\mathrm{Cov}(X_t^i, X_{t-1}^j \mid X_{t-1}^{P \setminus j}) \neq 0$. Let $l$ be an element of $P \setminus \{j, k\}$. The conditional covariance $\mathrm{Cov}(ij|k,l) = \mathrm{Cov}(X_t^i, X_{t-1}^j \mid X_{t-1}^k, X_{t-1}^l)$ writes

$$\mathrm{Cov}(ij|k,l) = \mathrm{Cov}(X_t^i, X_{t-1}^j \mid X_{t-1}^k) - \frac{\mathrm{Cov}(X_t^i, X_{t-1}^l \mid X_{t-1}^k)\, \mathrm{Cov}(X_{t-1}^j, X_{t-1}^l \mid X_{t-1}^k)}{\mathrm{Var}(X_{t-1}^l \mid X_{t-1}^k)},$$

$$= \mathrm{Cov}(X_t^i, X_{t-1}^j \mid X_{t-1}^k) \left[ 1 - \frac{\big(\mathrm{Cov}(X_{t-1}^j, X_{t-1}^l \mid X_{t-1}^k)\big)^2}{\mathrm{Var}(X_{t-1}^j \mid X_{t-1}^k)\, \mathrm{Var}(X_{t-1}^l \mid X_{t-1}^k)} \right] - \frac{\mathrm{Cov}(X_{t-1}^j, X_{t-1}^l \mid X_{t-1}^k)\, \mathrm{Cov}(X_t^i, X_{t-1}^l \mid X_{t-1}^k, X_{t-1}^j)}{\mathrm{Var}(X_{t-1}^l \mid X_{t-1}^k)}.$$

However, both terms in the latter expression of $\mathrm{Cov}(ij|k,l)$ are null:

• since $X_t^i \perp\!\!\!\perp X_{t-1}^j \mid X_{t-1}^k$, then $\mathrm{Cov}(X_t^i, X_{t-1}^j \mid X_{t-1}^k) = 0$;

• as $NpaMax(\tilde{\mathcal{G}}) \leq 1$, $X_{t-1}^j$ is the only parent of $X_t^i$ in $\tilde{\mathcal{G}}$. So the variable $X_{t-1}^j$, and thus also the set $(X_{t-1}^j, X_{t-1}^k)$, blocks all paths between $X_{t-1}^l$ and $X_t^i$ in the moral graph of the smallest ancestral set containing $X_t^i \cup X_{t-1}^{j,k,l}$. Then we have $X_t^i \perp\!\!\!\perp X_{t-1}^l \mid \{X_{t-1}^j, X_{t-1}^k\}$, that is, $\mathrm{Cov}(X_t^i, X_{t-1}^l \mid X_{t-1}^k, X_{t-1}^j) = 0$.

Then $\mathrm{Cov}(ij|k,l) = 0$. By induction, we obtain $\mathrm{Cov}(X_t^i, X_{t-1}^j \mid X_{t-1}^{P \setminus j}) = 0$, leading to a contradiction with $(X_{t-1}^j, X_t^i) \in E(\tilde{\mathcal{G}})$. Therefore $(X_{t-1}^j, X_t^i) \in E(\mathcal{G}^{(1)})$ and $\tilde{\mathcal{G}} \subseteq \mathcal{G}^{(1)}$.
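The two forms of the conditional covariance recursion used above can be checked numerically on any jointly Gaussian vector. The following Python snippet is a sanity check we add here (it is not part of the original proof; numpy, the random covariance matrix, and the index choices are our own):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random positive definite covariance matrix for 5 jointly Gaussian
# variables; indices 0, 1, 2, 3 play the roles of i, j, k, l.
A = rng.standard_normal((5, 5))
S = A @ A.T + 5 * np.eye(5)

def cond_cov(S, a, b, cond):
    """Cov(a, b | cond) via the Schur complement of the conditioning block."""
    keep = [a] if a == b else [a, b]
    S11 = S[np.ix_(keep, keep)]
    S12 = S[np.ix_(keep, cond)]
    S22 = S[np.ix_(cond, cond)]
    C = S11 - S12 @ np.linalg.solve(S22, S12.T)
    return C[0, 0] if a == b else C[0, 1]

i, j, k, l = 0, 1, 2, 3
lhs = cond_cov(S, i, j, [k, l])

# First form of the recursion.
rhs1 = (cond_cov(S, i, j, [k])
        - cond_cov(S, i, l, [k]) * cond_cov(S, j, l, [k])
        / cond_cov(S, l, l, [k]))

# Second form, as displayed in the proof.
rhs2 = (cond_cov(S, i, j, [k])
        * (1 - cond_cov(S, j, l, [k]) ** 2
           / (cond_cov(S, j, j, [k]) * cond_cov(S, l, l, [k])))
        - cond_cov(S, j, l, [k]) * cond_cov(S, i, l, [k, j])
        / cond_cov(S, l, l, [k]))

print(np.isclose(lhs, rhs1), np.isclose(lhs, rhs2))  # True True
```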

Chapter 4

Inferring time-dependent networks from Systems Biology time course data with reversible jump MCMC

This chapter presents a piece of work that will very soon be submitted as an article written in collaboration with Gaëlle Lelandais.

4.1 Introduction

4.1.1 Background

Most time series data analyses aim at observing a chronological process, that is, a process that can be divided into several phases. The yeast cell cycle is an excellent illustration of this phenomenon. Indeed, the available time series [SSZ+98] cover the whole cycle, which can be divided into at least 4 phases (G1, S, G2, M). Consequently, the effective regulation relationships may change along a single process, and so does the topology of the network representing these relationships across time.

The phases of the yeast cell cycle are well known, but this is not the case for most biological processes under study. Can we manage to detect phases when they are a priori unknown? Up to now, many procedures have been proposed to infer dynamic networks from time series data [KIM03, BFG+05, ORS07, Leb07], but the underlying network is assumed to be homogeneous across time. We propose here to infer a time-dependent network from temporal data, that is, a model which allows the relationships between genes to vary across the whole process. Of course, this approach requires more data than inferring a homogeneous network: we need either repeated time point measurements or more time points during the observed process.

Very recently, Rao et al. [RHISE07] also proposed to infer such a time-varying network from gene expression data. To this aim, they first detect the changepoint positions for each gene, then cluster genes having similar behavior and sharing the same changepoints, and finally infer the network topology describing the relationships between genes within each cluster. We propose here to simultaneously infer the changepoint positions and the structure of the networks within phases thanks to a reversible jump Markov chain Monte Carlo (MCMC) approach.

Indeed, we face a joint detection/estimation problem where the dimension is typically not fixed. Reversible jump MCMC methods were introduced by Green [Gre95] precisely to solve these issues jointly. Thus we develop an efficient stochastic algorithm based on two embedded reversible jump MCMC levels: one level for multiple changepoint detection and the other for model selection within the phases delimited by the changepoints. A schematic sketch of the changepoint level is given below.
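To fix ideas, here is a minimal, schematic Python sketch of the outer changepoint level only, under our own simplifying assumptions (a unit-variance Gaussian segment likelihood, uniform priors, and an acceptance ratio reduced to the likelihood ratio); it is not the algorithm developed in this chapter, which also embeds a model-selection level within each phase:

```python
import numpy as np

rng = np.random.default_rng(1)

def seg_loglik(y, xi):
    """Log-likelihood of y under piecewise-constant Gaussian means
    (unit variance); xi lists the segment bounds, ends included."""
    return sum(-0.5 * np.sum((y[a:b] - y[a:b].mean()) ** 2)
               for a, b in zip(xi[:-1], xi[1:]))

def rjmcmc_changepoints(y, k_max=5, n_iter=20000):
    n = len(y)
    xi = [0, n]  # bounds xi_0 = 0 and xi_{k+1} = n; no changepoint yet
    for _ in range(n_iter):
        k = len(xi) - 2              # current number of changepoints
        move = rng.choice(["birth", "death", "shift"])
        prop = list(xi)
        if move == "birth" and k < k_max:
            c = int(rng.integers(1, n))
            if c not in prop:
                prop = sorted(prop + [c])
        elif move == "death" and k > 0:
            prop.pop(int(rng.integers(1, k + 1)))
        elif move == "shift" and k > 0:
            h = int(rng.integers(1, k + 1))
            c = prop[h] + int(rng.integers(-2, 3))
            if prop[h - 1] < c < prop[h + 1]:
                prop[h] = c
        # Schematic acceptance: a full reversible jump sampler also
        # multiplies in the prior and the proposal/Jacobian terms.
        if prop != xi and np.log(rng.uniform()) < seg_loglik(y, prop) - seg_loglik(y, xi):
            xi = prop
    return xi

# Toy usage: a mean shift at t = 30 is typically recovered.
y = np.concatenate([rng.normal(0, 1, 30), rng.normal(3, 1, 30)])
print(rjmcmc_changepoints(y))
```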

This approach is used to analyze the response of the yeast S. cerevisiae to the addition of the antimitotic drug benomyl, from multi-knockout data. The first results notably allowed us to point out the chronological effect of the deletion of transcription factor YAP1.

4.1.2 Non-homogeneous network model

Let $p$ be the number of 'target' genes and $q$ the number of 'factor' genes. We denote by $P = \{1, \dots, p\}$ and $Q = \{1, \dots, q\}$ the sets of target and factor genes respectively. We consider two random vectors $Y_t = \{Y_t^i;\ i \in P\}$ and $X_t = \{X_t^j;\ j \in Q\}$ referring to an expression measure at time $t$ for the $p$ target genes and the $q$ factor genes respectively.



Figure 4.1: Non homogeneous network with 2 changepoints.

We propose two application frameworks for our modeling. In the first setting, the potential factors are the gene expression levels observed at the previous time point: for all $j \leq q$ and all $t > 1$, $X_t^j = Y_{t-1}^j$ and $q = p$. In that case, we use the Dynamic Bayesian Network (DBN) modeling described in Section 3.2. In the second setting, we study the effect of $q$ factor genes differing from the $p$ target genes; the effect of the factor genes is then either simultaneous or time-delayed. We illustrate the latter case in Section 4.4 with a real data analysis based on knocked-out transcription factors.

Multiple changepoints

Let $n$ be the number of time measurements and $N = \{1, \dots, n\}$ the set of time measurements. In practice, the time points are not necessarily equidistant. Nevertheless, one may consider in that case that the biologist chose the measurement times homogeneously in terms of relative reaction rate. We allow the relationships between genes to vary across the process by exploiting a multiple changepoint model. For each target gene $i$, we introduce an unknown number $k_i$ of changepoints defining $k_i + 1$ non-overlapping phases. Phase $h = 1, \dots, k_i + 1$ starts at changepoint $\xi_{h-1}^i$ and stops before $\xi_h^i$, where $\xi^i = (\xi_0^i, \dots, \xi_{h-1}^i, \xi_h^i, \dots, \xi_{k_i+1}^i)$ with $\xi_{h-1}^i < \xi_h^i$. To delimit the bounds, we set $\xi_0^i = 1$ and $\xi_{k_i+1}^i = n + 1$. Thus the vector $\xi^i$ has length $|\xi^i| = k_i + 2$. We denote by $\xi = \{\xi^i\}_{i \in P}$ the set of changepoints.
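For instance (an illustration we add here, with our own numbers): with $n = 20$ time points and $k_i = 2$ changepoints located at times 8 and 15, the changepoint vector is $\xi^i = (1, 8, 15, 21)$ and gene $i$ goes through the three phases $\{1, \dots, 7\}$, $\{8, \dots, 14\}$ and $\{15, \dots, 20\}$.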

Within any phase $h$, the way the expression of target gene $i$ depends on the $q$ factor gene expressions is defined by both:

• a set of $s_h^i$ predictors denoted by $\tau_h^i = \{j_1, \dots, j_{s_h^i}\}$, $\tau_h^i \subseteq Q$, $|\tau_h^i| = s_h^i$,

• and a set of parameters $\theta_h^i = ((a_{jh}^i)_{j=0,\dots,q}, \sigma_h^i)$, with $a_{jh}^i \in \mathbb{R}$ and $\sigma_h^i > 0$. For $j \neq 0$, we impose moreover that $a_{jh}^i = 0$ if $j \notin \tau_h^i$.

Let $k = \sum_{i=1}^p k_i$ be the total number of changepoints. We set $\overline{k}$ as the maximal total number of changepoints and $\overline{s}$ as the maximal number of predictors for each phase, that is, we impose $k \leq \overline{k}$ and, for each gene $i$ and each phase $h$, $s_h^i \leq \overline{s}$.
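These predictors and parameters read naturally as a per-phase linear Gaussian regression. The following display is a reconstruction we add for this excerpt (the explicit model equation does not appear in the lines above, so this form is our hedged reading of the parameterization):

$$\forall t \in [\xi_{h-1}^i, \xi_h^i), \qquad Y_t^i = a_{0h}^i + \sum_{j \in \tau_h^i} a_{jh}^i X_t^j + \varepsilon_t^i, \qquad \varepsilon_t^i \sim \mathcal{N}\big(0, (\sigma_h^i)^2\big).$$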
