
7.4. Proofs of the probabilistic results for Hawkes processes

7.4.1. Proof of Lemma 1

Let $K(n)$ denote the vector of the numbers of descendants of each type in the $n$'th generation from a single ancestral point of type $\ell$, define $K(0) = e_\ell$, and let $W(n) = \sum_{k=0}^{n} K(k)$ denote the total number of points in the first $n$ generations. Define, for $\theta \in \mathbb{R}^M$,
$$\varphi_\ell(\theta) = \log E_\ell\, e^{\theta^T K(1)}.$$
Thus $\varphi_\ell(\theta)$ is the log-Laplace transform of the distribution of $K(1)$ given that there is a single initial ancestral point of type $\ell$. We define the vector $\varphi(\theta)$ by $\varphi(\theta)^T = (\varphi_1(\theta), \ldots, \varphi_M(\theta))$. Note that $\varphi$ only depends on the law of the number of children per parent, i.e. it only depends on $\Gamma$. Defining $g(\theta) = \theta + \varphi(\theta)$ and its iterates by $g^{\circ 1} = g$ and $g^{\circ n}(\theta) = \theta + \varphi(g^{\circ(n-1)}(\theta))$, we arrive by recursion (conditioning on one generation at a time) at the formula
$$E_\ell\, e^{\theta^T W(n)} = E_\ell\, e^{g^{\circ(n-1)}(\theta)^T K(1) + \theta^T W(0)} = e^{\varphi_\ell(g^{\circ(n-1)}(\theta)) + \theta_\ell} = e^{g^{\circ n}(\theta)_\ell}.$$
In other words, we have the representation
$$\log E_\ell\, e^{\theta^T W(n)} = g^{\circ n}(\theta)_\ell$$
of the log-Laplace transform of $W(n)$.
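As a concrete illustration (not part of the original argument), consider the univariate case $M = 1$ with Poisson offspring of mean $\gamma < 1$. Then $\varphi(\theta) = \gamma(e^{\theta} - 1)$, the recursion reads
$$g^{\circ n}(\theta) = \theta + \gamma\big(e^{g^{\circ(n-1)}(\theta)} - 1\big),$$
and, for $\theta$ in a neighbourhood of $0$, $g^{\circ n}(\theta)$ converges to a finite solution $\psi(\theta)$ of the fixed-point equation $\psi = \theta + \gamma(e^{\psi} - 1)$, which is the log-Laplace transform of the total progeny $W$ of the cluster (a Borel distribution with parameter $\gamma$).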

Below we show that $\varphi$ is a contraction in a neighbourhood containing $0$, that is, for some $r > 0$ and a constant $C < 1$ (and a suitable norm), $\|\varphi(s)\| \le C\|s\|$ for $\|s\| \le r$. If $\theta$ is chosen such that
$$\frac{\|\theta\|}{1 - C} \le r,$$
we have $\|\theta\| \le r$, and if we assume that $g^{\circ k}(\theta) \in B(0, r)$ for $k = 1, \ldots, n-1$, then
$$\|g^{\circ n}(\theta)\| \le \|\theta\| + \|\varphi(g^{\circ(n-1)}(\theta))\| \le \|\theta\| + C\,\|g^{\circ(n-1)}(\theta)\| \le \|\theta\|\,\big(1 + C + C^2 + \cdots + C^n\big) \le \frac{\|\theta\|}{1 - C} \le r.$$

Thus, by induction, $g^{\circ n}(\theta) \in B(0, r)$ for all $n \ge 1$. Since $n \mapsto W_m(n)$ is increasing and converges to $W_m(\infty)$ as $n \to \infty$, with $W_m(\infty)$ the total number of points of type $m$ in the cluster, and since $W = \sum_m W_m(\infty) = \mathbf{1}^T W(\infty)$, we have by monotone convergence that, for $\vartheta \in \mathbb{R}$,
$$\log E_\ell\, e^{\vartheta W} = \lim_{n \to \infty} g^{\circ n}(\vartheta \mathbf{1})_\ell.$$
By the previous result, the right hand side is bounded if $|\vartheta|$ is sufficiently small. This completes the proof, up to proving that $\varphi$ is a contraction.

To this end we note that $\varphi$ is continuously differentiable (on $\mathbb{R}^M$ in fact, but a neighbourhood of $0$ suffices) with derivative $D\varphi(0) = \Gamma$ at $0$. Since the spectral radius of $\Gamma$ is strictly less than $1$, there is a $C < 1$ and, by the Householder theorem, a norm $\|\cdot\|$ on $\mathbb{R}^M$ such that the induced operator norm of $\Gamma$ satisfies
$$\|\Gamma\| = \max_{x : \|x\| \le 1} \|\Gamma x\| < C.$$
Since the norm is continuous, and $D\varphi(s)$ is likewise, there is an $r > 0$ such that
$$\|D\varphi(s)\| \le C < 1$$
for $\|s\| \le r$. This, in turn, implies that $\varphi$ is Lipschitz continuous on the ball $B(0, r)$ with Lipschitz constant $C$, and since $\varphi(0) = 0$ we get
$$\|\varphi(s)\| \le C\|s\|$$
for $\|s\| \le r$. This ends the proof of the lemma.
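As an illustration of the Householder step (not needed for the proof itself), one explicit choice of norm is available because $\Gamma$ has nonnegative entries: pick $C$ with $\rho(\Gamma) < C < 1$, set $w = (I - \Gamma/C)^{-1}\mathbf{1}$, so that $w \ge \mathbf{1} > 0$ and $\Gamma w \le C w$ componentwise, and use the weighted supremum norm $\|x\|_w = \max_i |x_i|/w_i$. Then, for $\|x\|_w \le 1$, $|(\Gamma x)_i| \le (\Gamma w)_i \le C w_i$, so $\|\Gamma\|_w \le C < 1$.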

Note that we have not used the explicit formula for $\varphi$ at all, which is obtainable and simple since the offspring distributions are Poisson. The only thing we needed is that $\varphi$ is defined in a neighbourhood of $0$, that is, that the offspring distributions are sufficiently light-tailed.
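Since the offspring counts are Poisson, $\varphi$ has the explicit form $\varphi_\ell(\theta) = \sum_m \Gamma_{\ell m}(e^{\theta_m} - 1)$, assuming the convention that $\Gamma_{\ell m}$ is the mean number of type-$m$ children of a type-$\ell$ parent. The following minimal Python sketch is purely illustrative and not part of the proof (the values of $\Gamma$ and $\theta$ are hypothetical): it iterates the recursion $g^{\circ n}(\theta) = \theta + \varphi(g^{\circ(n-1)}(\theta))$ and compares $e^{g^{\circ n}(\theta)_\ell}$ with a Monte Carlo estimate of $E_\ell e^{\theta^T W(n)}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: Gamma[l, m] is taken as the mean number of type-m
# children of a type-l parent; its spectral radius is < 1, so clusters are
# finite almost surely.
Gamma = np.array([[0.3, 0.2],
                  [0.1, 0.4]])
theta = np.array([0.05, 0.05])

def phi(s):
    # Log-Laplace transform of K(1) for independent Poisson offspring counts:
    # phi_l(s) = sum_m Gamma[l, m] * (exp(s_m) - 1).
    return Gamma @ (np.exp(s) - 1.0)

def g_iter(theta, n):
    # n-th iterate g^{(n)}(theta) = theta + phi(g^{(n-1)}(theta)), g^{(0)} = theta.
    s = theta.copy()
    for _ in range(n):
        s = theta + phi(s)
    return s

def simulate_Wn(ell, n):
    # Points per type in generations 0..n of a cluster rooted at one type-ell point.
    M = Gamma.shape[0]
    current = np.zeros(M, dtype=int)
    current[ell] = 1
    total = current.copy()
    for _ in range(n):
        nxt = np.zeros(M, dtype=int)
        for l in range(M):
            for _ in range(current[l]):
                nxt += rng.poisson(Gamma[l])
        current = nxt
        total = total + current
    return total

n, ell, reps = 6, 0, 20000
recursion = np.exp(g_iter(theta, n)[ell])
monte_carlo = np.mean([np.exp(theta @ simulate_Wn(ell, n)) for _ in range(reps)])
print("exp(g^n(theta)_ell)        :", recursion)
print("Monte Carlo E exp(th^T W(n)):", monte_carlo)
```

With these hypothetical values the spectral radius of $\Gamma$ is $0.5$, so by the lemma the iterates stay bounded for small $\theta$ and the two printed numbers agree up to Monte Carlo error.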

7.4.2. Proof of Proposition 2

We use the cluster representation, and we note that any cluster with ancestral point in $[-n-1, -n]$ must have at least $n + 1 - \lceil A\rceil$ points in the cluster if any of the points are to fall in $[-A, 0)$. This follows from the assumption that all the $h^{(m)}_\ell$-functions have support in $[0, 1]$. With $\tilde N_{A,\ell}$ the number of points in $[-A, 0)$ coming from clusters with ancestral points of type $\ell$, we thus have the bound
$$\tilde N_{A,\ell} \le \sum_n \sum_{k=1}^{A_n} \max\{W_{n,k} - n + \lceil A\rceil,\, 0\},$$
where $A_n$ is the number of ancestral points of type $\ell$ in $[-n-1, -n]$ and $W_{n,k}$ is the number of points in the corresponding cluster. Here the $A_n$'s and the $W_{n,k}$'s are all independent, the $A_n$'s are Poisson distributed with mean $\nu_\ell$ and the $W_{n,k}$'s are i.i.d. with the same distribution as $W$ in Lemma 1. Moreover,
$$H_n(\vartheta_\ell) := E_\ell\, e^{\vartheta_\ell \max\{W - n + \lceil A\rceil,\, 0\}} \le P_\ell(W \le n - \lceil A\rceil) + e^{-\vartheta_\ell (n - \lceil A\rceil)}\, E_\ell\, e^{\vartheta_\ell W},$$
which is finite for $|\vartheta_\ell|$ sufficiently small according to Lemma 1. Then we can compute an upper bound on the Laplace transform of $\tilde N_{A,\ell}$:

$$E\, e^{\vartheta_\ell \tilde N_{A,\ell}} \le \prod_n E\left[\prod_{k=1}^{A_n} E\Big(e^{\vartheta_\ell \max\{W_{n,k} - n + \lceil A\rceil,\, 0\}} \,\Big|\, A_n\Big)\right] \le \prod_n E\big[H_n(\vartheta_\ell)^{A_n}\big] = \prod_n e^{\nu_\ell (H_n(\vartheta_\ell) - 1)} = e^{\nu_\ell \sum_n (H_n(\vartheta_\ell) - 1)}.$$
Since $H_n(\vartheta_\ell) - 1 \le e^{-\vartheta_\ell (n - \lceil A\rceil)} E_\ell e^{\vartheta_\ell W}$, we have $\sum_n (H_n(\vartheta_\ell) - 1) < \infty$, which shows that the upper bound is finite. To complete the proof, observe that $N_{[-A,0)} = \sum_\ell \tilde N_{A,\ell}$, where $\tilde N_{A,\ell}$ for $\ell = 1, \ldots, M$ are independent. Since all variables are positive, it is sufficient to take $\theta = \min_\ell \vartheta_\ell$.
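For concreteness, the geometric series above can be summed explicitly (this computation is implicit in the original argument): for $\vartheta_\ell > 0$ small enough that $E_\ell e^{\vartheta_\ell W} < \infty$,
$$\sum_n \big(H_n(\vartheta_\ell) - 1\big) \le E_\ell e^{\vartheta_\ell W}\, e^{\vartheta_\ell \lceil A\rceil} \sum_{n \ge 0} e^{-\vartheta_\ell n} = \frac{E_\ell e^{\vartheta_\ell W}\, e^{\vartheta_\ell \lceil A\rceil}}{1 - e^{-\vartheta_\ell}},$$
so that $E\, e^{\vartheta_\ell \tilde N_{A,\ell}} \le \exp\big(\nu_\ell\, E_\ell e^{\vartheta_\ell W}\, e^{\vartheta_\ell \lceil A\rceil}/(1 - e^{-\vartheta_\ell})\big)$.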

7.4.3. Proof of Proposition 3

In this paragraph, the notation $\square$ denotes a generic positive absolute constant that may change from line to line, and $\square_{\theta_1, \theta_2, \ldots}$ denotes a positive constant depending on $\theta_1, \theta_2, \ldots$ that may change from line to line.

Let
$$u = C_1 \sigma \log^{3/2}(T)\,\sqrt{T} + C_2\, b\, (\log(T))^{2+\eta}, \qquad (7.13)$$
where the choices of $C_1$ and $C_2$ will be given later. For any positive integer $k$ such that

Similarly to [51], we introduce $(\tilde M^x_q)_q$, a sequence of independent Hawkes processes, each being stationary with intensities per mark given by $\psi^{(m)}_t$. For each $q$, we then introduce $M^x_q$, the truncated process associated with $\tilde M^x_q$, where truncation means that we only keep the points lying in $[2qx - A,\, 2qx + x]$. So, if we set

where $T_e$ represents the time to extinction of the process. More precisely, $T_e$ is the last point of the process when, in the cluster representation, only ancestral points before $0$ appear. For more details, see Section 3 of [51]. So, denoting by $a_l$ the ancestral points with mark $l$ and by $H^{l}_{a_l}$ the length of the corresponding cluster whose origin is $a_l$, we have:
$$T_e = \max$$
where $\tilde N^{(l)}$ denotes the process associated with the ancestral points with mark $l$. So, is the number of points in the cluster. But if all the interaction functions have support in $[0,1]$, one always has that $H^l_0 < W$. Hence

Now, let us focus on the first term $B$ of (7.14), where $B = P\big(\sum_q F_q \ge u/2\big)$.

Let us consider some $\tilde N$, to be fixed later, and let us define the measurable events

We have $P(\Omega^c) \le \sum_q P(\Omega_q^c)$. Each $\Omega_q^c$ can also be controlled easily. Indeed, it is sufficient to split $[2qx - A,\, 2qx + x]$ into intervals of size $A$ (there are about $\square_{\alpha, A, f_0} \log(T)$ of them) and to require that the number of points in each subinterval is smaller than $\tilde N/2$. By stationarity, we obtain that
$$P(\Omega_q^c) \le \square_{\alpha, A, f_0}\, \log(T)\, P\big(N_{[-A,0)} > \tilde N/2\big).$$
Using Proposition 2 with $u = \lceil \tilde N/2\rceil + 1/2$, we obtain:
$$P(\Omega_q^c) \le \square_{\alpha, A, f_0}\, \log(T)\, \exp(-\square_{\alpha, A, f_0}\, \tilde N) \quad\text{and}\quad P(\Omega^c) \le \square_{\alpha, A, f_0}\, T\, \exp(-\square_{\alpha, A, f_0}\, \tilde N). \qquad (7.15)$$
Note that this control holds for any positive choice of $\tilde N$. Hence it also gives the following lemma, which will be used later.

Lemma 3. For any $R > 0$,
$$P\big(\text{there exists } t \in [0, T] \text{ such that } M^x_q|_{[t-A,t)} > R\big) \le \square_{\alpha, A, f_0}\, T\, \exp(-\square_{\alpha, A, f_0}\, R).$$

Hence, by taking $\tilde N = C_3 \log(T)$ for $C_3$ large enough, the bound in (7.15) is smaller than $\square_{\alpha, A, f_0}\, T^{-\alpha_0}$, where $\alpha_0 = \max(\alpha, 2)$.

It remains to obtain the rate of $D := P\big(\sum_q F_q \ge u/2 \text{ and } \Omega\big)$. For any positive constant $\theta$ that will be chosen later, we have:
$$D \le e^{-\theta u/2}\, E\Big(e^{\theta \sum_q F_q} \prod_q \mathbf{1}_{\Omega_q}\Big) \le e^{-\theta u/2} \prod_q E\big(e^{\theta F_q}\, \mathbf{1}_{\Omega_q}\big), \qquad (7.16)$$
since the variables $(M^x_q)_q$ are independent. But
$$E\big(e^{\theta F_q}\, \mathbf{1}_{\Omega_q}\big) \le 1 + \theta\, E(F_q \mathbf{1}_{\Omega_q}) + \sum_{j \ge 2} \frac{\theta^j}{j!}\, E(F_q^j \mathbf{1}_{\Omega_q})$$
and $E(F_q \mathbf{1}_{\Omega_q}) = E(F_q) - E(F_q \mathbf{1}_{\Omega_q^c}) = -E(F_q \mathbf{1}_{\Omega_q^c})$.

Next, note that if, for some integer $l$, $l\tilde N < \sup_t M^x_q|_{[t-A,t)} \le (l+1)\tilde N$, then
$$|F_q| \le x b\big[((l+1)\tilde N)^{\eta} + 1\big] + x E(Z).$$
Hence, cutting $\Omega_q^c$ into slices of the type $\{l\tilde N < \sup_t M^x_q|_{[t-A,t)} \le (l+1)\tilde N\}$ and using Lemma 3, we obtain, by taking $C_3$ large enough,
$$|E(F_q \mathbf{1}_{\Omega_q})| = |E(F_q \mathbf{1}_{\Omega_q^c})| \le \ldots$$
In the same way, one can bound
$$E(F_q^j \mathbf{1}_{\Omega_q}) \le E(F_q^2 \mathbf{1}_{\Omega_q})\, z\, b^{j-2},$$

using that $\ln(1 + u) \le u$. It is now sufficient to recognise a step of the proof of the Bernstein inequality (weak version, see [41, p. 25]). Since $k z_1 = \square_{\alpha, \eta, A}\, b\, T^{1-\alpha_0}/\log(T)$ and $\alpha_0 > 1$, one can choose $C_1$ and $C_2$ in the definition (7.13) of $u$ (not depending on $b$) such that
$$u/2 - k z_1 \ge \sqrt{2 k z v_z} + \tfrac{1}{3} z b_z$$
for some $z = C_4 \log(T)$, where $C_4$ is a constant. Hence
$$D \le \exp\Big(-\theta\big(\sqrt{2 k z v_z} + \tfrac{1}{3} z b_z\big) + k \sum_{j \ge 2} \frac{z\, v_z\, b^{j-2}\, \theta^j}{j!}\Big).$$
One can choose $\theta$ accordingly (as for the proof of the Bernstein inequality) to obtain a bound in $e^{-z}$. It remains to choose $C_4$ large enough, depending only on $\alpha$, $\eta$, $A$ and $f_0$, to guarantee that $D \le e^{-z} \le \square_{\alpha, \eta, A, f_0}\, T^{-\alpha}$. This concludes the proof of the proposition.
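For reference, the weak form of Bernstein's inequality referred to above states that if $X_1, \ldots, X_k$ are independent and centred with $|X_i| \le b$ and $\sum_i \mathrm{Var}(X_i) \le v$, then for every $z > 0$,
$$P\Big(\sum_{i=1}^k X_i \ge \sqrt{2 v z} + \frac{b z}{3}\Big) \le e^{-z};$$
the exponential bound on $D$ displayed above is the corresponding Chernoff bound before optimising over $\theta$.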

7.4.4. Proof of Proposition 4

Let $Q$ denote a measure such that under $Q$ the distribution of the full point process restricted to $(-\infty, 0]$ is identical to its distribution under $P$, and such that on $(0, \infty)$ the process consists of independent components, each being a homogeneous Poisson process with rate $1$. Furthermore, the Poisson processes should be independent of the process on $(-\infty, 0]$. From Corollary 5.1.2 in [36] the likelihood process is given by
$$L_t = \exp\left(M t - \sum_m \int_0^t \lambda^{(m)}_u\, du + \sum_m \int_0^t \log \lambda^{(m)}_u\, dN^{(m)}_u\right)$$
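To make this formula concrete, here is a minimal Python sketch, not from the paper: the function name, the `intensity` callable and the midpoint quadrature are illustrative assumptions. It evaluates $\log L_t$ from the expression above, given the event times of each component and a way to evaluate the intensities.

```python
import numpy as np

def log_likelihood_ratio(t, event_times, intensity, n_grid=10_000):
    """Evaluate log L_t = M*t - sum_m integral_0^t lambda^(m)_u du
                        + sum_m sum_{points u of N^(m) in (0, t]} log lambda^(m)_u.

    event_times : list of M arrays; event_times[m] holds the points of N^(m) in (0, t].
    intensity   : callable intensity(m, u) returning lambda^(m)_u (user-supplied; for a
                  Hawkes process it would be built from nu^(m) and the h_l^(m) kernels).
    The du-integral is approximated by a midpoint rule on a regular grid.
    """
    M = len(event_times)
    step = t / n_grid
    grid = (np.arange(n_grid) + 0.5) * step
    log_L = M * t
    for m in range(M):
        lam_grid = np.array([intensity(m, u) for u in grid])
        log_L -= lam_grid.sum() * step                                   # - integral of lambda^(m)
        log_L += np.sum(np.log([intensity(m, u) for u in event_times[m]]))  # + sum of log lambda at points
    return log_L

# Example (hypothetical): one mark, constant intensity 2 on (0, 1], three events.
print(log_likelihood_ratio(1.0, [np.array([0.2, 0.5, 0.9])], lambda m, u: 2.0))
```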

Moreover, for $t \ge 0$ we have the relation
$$E_P \kappa_t(f)^2 = E_Q \kappa_t(f)^2 L_t, \qquad (7.17)$$
where $E_P$ and $E_Q$ denote expectation with respect to $P$ and $Q$, respectively. Let, furthermore, $\tilde N_1 = N_{[-1,0)}$ denote the total number of points in $[-1, 0)$. Proposition 4 will be an easy consequence of the following lemma.

Lemma 4. If the point process is stationary under $P$, if $e^{d} \le \lambda^{(m)}_t \le a(N_1 + \tilde N_1) + b$ for $t \in [0, 1]$ and for constants $d \in \mathbb{R}$ and $a, b > 0$, and if $E_P (1 + \varepsilon)^{\tilde N_1} < \infty$ for some $\varepsilon > 0$, then for any $f$,
$$Q(f, f) \ge \zeta \|f\|^2 \qquad (7.18)$$
for some constant $\zeta > 0$.

Proof. We use Hölder's inequality on $\kappa_1(f)^{2/p} L_1^{1/p}$ and $\kappa_1(f)^{2/q} L_1^{-1/p}$ to get
$$E_Q \kappa_1(f)^2 \le \big(E_Q \kappa_1(f)^2 L_1\big)^{1/p}\big(E_Q \kappa_1(f)^2 L_1^{-q/p}\big)^{1/q} = Q(f, f)^{1/p}\big(E_Q \kappa_1(f)^2 L_1^{1-q}\big)^{1/q}, \qquad (7.19)$$
where $1/p + 1/q = 1$, using (7.17) for the first factor. We choose $q \ge 1$ (and thus $p$) below so as to make $q - 1$ sufficiently small.

For the left hand side we have, by independence of the homogeneous Poisson processes, that if $f = (\mu, (g_\ell)_{\ell=1,\ldots,M})$,
$$E_Q \kappa_1(f)^2 = (E_Q \kappa_1(f))^2 + V_Q \kappa_1(f) = \left(\mu + \sum_\ell \int_0^1 g_\ell(u)\, du\right)^2 + \sum_\ell \int_0^1 g_\ell(u)^2\, du,$$
using that for a unit-rate Poisson process $E\int g\, dN = \int g\, du$ and $V\int g\, dN = \int g^2\, du$. Exactly as on page 32 in [52], there exists $c_0 > 0$ such that
$$E_Q \kappa_1(f)^2 \ge c_0 \left(\mu^2 + \sum_\ell \int_0^1 g_\ell^2(u)\, du\right) = c_0 \|f\|^2. \qquad (7.20)$$
To bound the second factor on the right hand side of (7.19) we observe that, by assumption, we have the lower bound
$$L_1 \ge e^{M(1-b)}\, e^{(d - aM) N_1}\, e^{-aM \tilde N_1}$$

on the likelihood process. Under $Q$ we have that $(\kappa_1(f), N_1)$ and $\tilde N_1$ are independent, and with $\rho = e^{(q-1)(aM - d)}$ and $\tilde\rho = e^{(q-1) aM}$ we get that
$$E_Q \kappa_1(f)^2 L_1^{1-q} \le e^{(q-1)M(b-1)}\, E_Q\, \tilde\rho^{\tilde N_1}\, E_Q\, \kappa_1(f)^2 \rho^{N_1}.$$
Here we choose $q$ such that $\tilde\rho$ is sufficiently close to $1$ to ensure that $E_Q\, \tilde\rho^{\tilde N_1} = E_P\, \tilde\rho^{\tilde N_1} < \infty$ (see Proposition 2). Moreover, by the Cauchy–Schwarz inequality,
$$\kappa_1^2(f) \le \left(\mu^2 + \sum_\ell \int_0^{1-} g_\ell^2(1 - u)\, dN^{(\ell)}_u\right)(1 + N_1). \qquad (7.21)$$
Under $Q$ the point processes on $(0, \infty)$ are homogeneous Poisson processes with rate $1$ and $N_1$, the total number of points, is Poisson distributed. This implies that, conditionally on $(N_1^{(1)}, \ldots, N_1^{(M)}) = (n^{(1)}, \ldots, n^{(M)})$, the $n^{(m)}$ points of the $m$'th process are uniformly distributed on $[0, 1]$, hence

$$E_Q \kappa_1(f)^2 L_1^{1-q} \le \left(\mu^2 + \sum_\ell \int_0^1 g_\ell^2(u)\, du\right) \underbrace{e^{(q-1)M(b-1)}\, E_Q\, \tilde\rho^{\tilde N_1}\, E_Q (1 + N_1)^2 \rho^{N_1}}_{c_{00}} = c_{00} \|f\|^2. \qquad (7.22)$$
Combining (7.20) and (7.22) with (7.19) we get that
$$c_0 \|f\|^2 \le (c_{00})^{1/q}\, \|f\|^{2/q}\, Q(f, f)^{1/p},$$
or, by rearranging, that
$$Q(f, f) \ge \zeta \|f\|^2 \quad\text{with } \zeta = (c_0)^p / (c_{00})^{p-1}.$$

For the Hawkes process it follows that if $\nu^{(m)} > 0$ and if
$$\sup_{t \in [0,1]} h^{(m)}_\ell(t) < \infty$$
for $\ell, m = 1, \ldots, M$, then for $t \in [0, 1]$ we have $e^{d} \le \lambda^{(m)}_t \le a(N_1 + \tilde N_1) + b$ with
$$d = \log \nu^{(m)}, \qquad a = \max_\ell \sup_{t \in [0,1]} h^{(m)}_\ell(t), \qquad b = \nu^{(m)}.$$
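For completeness, these constants come directly from the linear Hawkes intensity, assuming as elsewhere in the paper that $\lambda^{(m)}_t = \nu^{(m)} + \sum_\ell \int h^{(m)}_\ell(t - s)\, dN^{(\ell)}_s$: since the $h^{(m)}_\ell$ have support in $[0, 1]$, only the at most $N_1 + \tilde N_1$ points in $[t-1, t) \subseteq [-1, 1)$ contribute, so
$$\nu^{(m)} \le \lambda^{(m)}_t \le \nu^{(m)} + \Big(\max_\ell \sup_{t \in [0,1]} h^{(m)}_\ell(t)\Big)(N_1 + \tilde N_1),$$
which is precisely $e^{d} \le \lambda^{(m)}_t \le a(N_1 + \tilde N_1) + b$ with the constants above.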

Proposition 2 proves that there exists $\varepsilon > 0$ such that $E_P(1 + \varepsilon)^{\tilde N_1} < \infty$. This completes the proof of Proposition 4.
