Asymptotic properties of the residuals process for stationary marked Gibbs point processes

(1)

HAL Id: inria-00386660

https://hal.inria.fr/inria-00386660

Submitted on 22 May 2009

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de

Asymptotic properties of the residuals process for stationary marked Gibbs point processes

Jean-François Coeurjolly

To cite this version:

Jean-François Coeurjolly. Asymptotic properties of the residuals process for stationary marked Gibbs

point processes. 41èmes Journées de Statistique, Société Française de Statistique, May 2009, Bordeaux,

France. �inria-00386660�

(2)

Asymptotic properties of the residuals process for stationary marked Gibbs point processes

J.-F. Coeurjolly

Sagag Team, Laboratory Jean Kuntzmann/ CNRS, BSHM, 1251 Av. Centrale, Grenoble Cedex 09 (France)

R´esum´e

Dans le contexte des processus ponctuels spatiaux, la notion de résidus a été proposée très récemment par Baddeley et al. (2005). Elle fournit un outil de diagnostic très riche pour juger de la qualité d’ajustement d’un modèle paramétrique de processus ponctuel. Ces résidus sont une extension de ceux définis en dimension un pour des processus de comptage et sont basés sur des versions empiriques des deux termes apparaissant dans la célèbre Formule de Georgii-Nguyen-Zeissin. Cet exposé présentera les propriétés asymptotiques du processus des résidus pour des processus ponctuels marqués de Gibbs stationaires. En particulier, nous obtenons la consistance et la normalité asymptotique pour une large classe de résidus, incluant notamment ceux définis dans Baddeley et al. (2005) et partiellement étudiés dans Baddeley et al. (2008) (résidus bruts, résidus inversés, résidus de Pearson). Nous généralisons ce résultat pour proposer un test d’adéquation basé sur des estimations fonctionnelles de la fonction d’espace vide qui caractérise mieux la distribution d’un processus ponctuel spatial qu’un seul type de résidus.

Abstract

In the context of spatial point processes, the notion of residuals has been proposed very recently by Baddeley et al. (2005). It provides a very rich diagnostic tool to investigate the quality of adjustment of a parametric spatial point process. Residuals are an extension of the ones available in one dimension for counting processes and are based on empirical versions of both terms appearing in the well-known Georgii-Nguyen-Zeissin Formula. This talk focuses on asymptotic properties of the residuals process for stationary marked Gibbs point processes. In particular, the consistency and the asymptotic normality are obtained for a wide class of residuals including the classical ones proposed by Baddeley et al. (2005) and partially studied by Baddeley et al. (2008) (raw residuals, inverse residuals, Pearson residuals). We also generalize this result for defining a goodness-of-fit test for these processes based on functionals estimations of the empty space function which better characterizes the distribution of a spatial point process rather than one type of residuals.

1 Background, examples and definitions

Let us start with some general notation on point processes. The data consist in a countable subset ϕ = {x

^m₁¹

, . . . , x

^m_nⁿ

} (where n is not fixed), called a configuration of marked points (x

i

∈ R

²

(for the sake of simplicity) are location and m

i

∈ M are marks). The state space will de denoted by S = R

²

× ▼ . Let λ

²

denote the Lebesgue measure on R

²

, λ

^♠

the probability mark measure and let µ := λ

²

⊗ λ

^♠

. Finally denote by ϕ

Λ

:= ϕ ∩ (Λ ∩ M ), Λ ⊂ R

²

realization of the variable Φ

Λ

.

The (marked) Poisson point process constitutes the reference model for spatial point processes and is related to the notion of independence. The realization of a Gibbs point process is the realization of some variable with conditional density (conditionnally on some fixed outside configuration ϕ

^o

) with respect to the Poisson process expressed as

f

^Φ^Λ^c^=ϕ

o Λc

ΦΛ

(ϕ

Λ

) ∝ exp (−V (ϕ

Λ

|ϕ

^o_Λ^c

))

(3)

where V (ϕ

Λ

|ϕ

^o_Λ^c

) := V (ϕ

Λ

∪ ϕ

^o_Λ^c

) − V (ϕ

^o_Λ^c

) measures the energy to insert the configuration ϕ

Λ

into ϕ

^o_Λ^c

. In the rest of the paper, the framework is restricted to stationary marked Gibbs point processes based on energy function V (ϕ; θ), invariant by translation and parametrized in an exponential form by some θ ∈ Θ where Θ is some compact set of R

^p

(p ≥ 1), i.e.

V (ϕ; θ) := θ

^T

v(ϕ), where v(ϕ) = (v

₁

(ϕ), . . . , v

_p

(ϕ)).

Models may be defined by defining through the local energy to insert a marked point x

^m

into the configuration ϕ

V (x

^m

|ϕ; θ) := V (ϕ ∪ {x

^m

}; θ) − V (ϕ; θ) = θ

^T

v (x

^m

|ϕ) . Let us cite some well-known examples :

• Multi-type Poisson point process : M = {1, . . . , M } V (ϕ; θ) =

X

M

m=1

θ

^m

|ϕ

^m

|, V (x

^m

|ϕ; θ) = θ

^m

.

• Multi-type Strauss point process : M = {1, . . . , M } V (ϕ; θ) =

X

M

m=1

θ

^m₁

|ϕ

^m

| + X

M

m=1

θ

^m₂

X

{x^m,y^m}∈P2(ϕ)

1

[D1^m,D2^m]

(||y − x||).

Alternatively

V (x

^m

|ϕ; θ) = X

M

m=1

θ

₁^m

+ X

M

m=1

θ

₂^m

X

y^m∈ϕ

1

[D^m1 ,D^m2]

(||y − x||).

This process is defined for θ

^m₂

≥ 0 and D

^m₁

= 0 (inhibition assumption) or when θ

^m₂

∈ R and D

^m₁

= δ > 0 (hard-core assumption).

• Area interaction point process : ▼ = {0}, R > 0

V (ϕ; θ) = θ

1

|ϕ| + θ

2

|∪

_x∈ϕ

B(x, R)| , where B(x, R) is the ball centered at x with radius R.

Concerning such models, the first statistical challenge is to estimate the true value of

the parameter vector (θ

^⋆

assumed to be in ˚ Θ). This may be done via the maximisation

of the likelihood (MLE) or the pseudo-likelihood (MPLE). The last one is an excellent

alternative to the MLE since it avoids the computation of the normalizing constant which

(4)

is particularly costly (see the next section for an accurate definition and Billiot, Coeurjolly and Drouilhet (2008) for asymptotic results). Once you have estimated the parameter vector of interest, the second challenge is to quantify the quality of adjusment of the model to data. Baddeley et al. (2005) defined a large class of such measures based on the well-known Georgii-Nguyen-Zeissin Formula, expressed here for stationary marked Gibbs point processes

Theorem 1 (Georgii-Nguyen-Zeissin Formula) For any function h(·, θ) : Ω

f

→ R such that the following quantities are finite

E

h 0

^M

|Φ; θ

^⋆

e

^−V

(

⁰^M^|Φ;θ^⋆

)

= E h 0

^M

f |Φ \ 0

^M

; θ

^⋆

(1) where for any θ ∈ Θ, (m, ϕ) ∈ ▼ × Ω

f

, h(x

^m

|ϕ; θ) := h(ϕ ∪ x

^m

; θ) − h(ϕ; θ) and where M ∼ λ

^♠

Now define the innovations and the residuals process based on empirical versions of both terms appearing in (1).

Definition 1 For any bounded domain Λ, let us define

• Innovations process:

I

Λ

(ϕ; h, θ

^⋆

) := 1

|Λ|

Z

Λ×▼

h(x

^m

|ϕ; θ

^⋆

)e

^−V^(x^m^|ϕ;θ^⋆⁾

µ(dx

^m

) − 1

|Λ|

X

x^m∈ϕΛ

h(x

^m

|ϕ \ x

^m

; θ

^⋆

),

• Residuals process: let θ b

_n

(ϕ) be an estimate of θ

^⋆

based on ϕ

_Λ

R

Λ

(ϕ; h) := 1

|Λ|

Z

Λ×▼

h(x

^m

|ϕ; θ b

n

(ϕ))e

^−V

(

^x^m^|ϕ;^b^θⁿ^(ϕ)

) µ(dx

^m

)− 1

|Λ|

X

x^m∈ϕΛ

h(x

^m

|ϕ\ x

^m

; θ b

n

(ϕ))

Clearly, the last notion is the most interesting as a practical point of view. Let us describe the main examples considered by Baddeley et al. (2005) (in the context of stationary point processes).

1. h(x

^m

|ϕ, θ) = 1: raw innovations and residuals

2. h(x

^m

|ϕ, θ) = e

^V^(x^m^|ϕ;θ)

: inverse innovations and residuals.

3. h(x

^m

|ϕ, θ) = e

^V^(x^m^|ϕ;θ)/2

: Pearson innovations and residuals.

In particular, one may note that the raw residuals is a difference of two estimates of the

intensity of the point process : the first one is a parametric one and depends on the model

while the second one is a non parametric one (since it equals to |ϕ

Λ

|/|Λ|).

(5)

2 Asymptotic results of the residuals process

In order to present asymptotic results, let us first consider general assumptions, denoted by [Mod], on our models:

1. [Mod:S] Stability of the local energy: ∃K ≥ 0, ∀(m, ϕ) ∈ M × Ω

_f

V (0

^m

|ϕ; θ) ≥ −K.

2. [Mod:L] Locality of the local energy : ∃D ≥ 0, ∀(m, ϕ) ∈ M × Ω

f

V (0

^m

|ϕ; θ) = V 0

^m

|ϕ

_B(0,D)

; θ . 3. [Mod:I] Integrability: ∃κ

^(sup)_i

≥ 0, k

i

∈ N , ∀(m, ϕ) ∈ M × Ω

f

v

_i

(0

^m

|ϕ) ≤ κ

^(sup)_i

|ϕ

_B(0,D)

|

^kⁱ

, i = 1, . . . , p.

4. [Mod:Id] Identifiability condition :: for all θ ∈ Θ \ θ

^⋆

, P V 0

^M

|Φ; θ

6= V 0

^M

|Φ; θ

^⋆

> 0

The assumptions [Mod:S] and [Mod:L] are quite well-known in the context of Gibbs point processes. In particular, they ensure the existence of stationary measures. These assumptions include a very large class of known parametric models of marked point processes : Poisson process, multi-type Poisson process, multi-strauss marked point process, Strauss-disc type process, Geyer’s triplet point process, Area interaction point process,. . . . It also include many models where the interaction is no more measured on the complete graph (P

₂

(ϕ)) but on some more structured one such as the k-nearest neighbour graph or the Delaunay graph (e.g. [3]). Some of these models are known under the names of Ord’s model where the energy depends the area of Vorono¨ı cells.

In the rest of the paper, the data is assumed to be observed in the domain Λ

n

⊕ D

^∨

(where D

^∨

≥ D), where the domain Λ

n

= [−n, n]

²

(for the sake of simplicity). Moreover, the estimate appearing in the definition of the residuals will be the maximum pseudolikelihood estimate, i.e.

θ b

n

(ϕ) = θ b

_n^{M P LE}

(ϕ) := argmax

_θ∈Θ

LP L

Λⁿ

(ϕ; θ)

where the log-pseudo likelihood on the domain Λ

n

, LP L

Λⁿ

(ϕ; θ) is defined by LP L

Λ

(ϕ; θ) = −

Z

Λ×▼

e

^−V^(x^m^|ϕ;θ)

µ(dx

^m

) − X

x^m∈ϕΛ

V (x

^m

|ϕ \ x

^m

; θ) ,

see Billiot, Coeurjolly and Drouilhet (2008) for more details and asymptotic results (con-

sistency and asymptotic normality). Finally, consider the following assumptions denoted

by [H] and related to the analysing function h:

(6)

1. h(·; θ) looks like V (·; θ) (invariance by translation, locality).

2. Assume that for all (m, ϕ) ∈ ▼ × Ω

f

, h(0

^m

|ϕ; ·) is differentiable for θ ∈ V (θ

^⋆

) and assume that for j = 1, . . . , p and for all θ ∈ V(θ

^⋆

)

E

|h(0

^M

|Φ; θ)|

³

e

^−V

(

⁰^M^|Φ;θ

)

<+∞ and E

∂h

∂θ

j

(0

^M

|Φ; θ)

e

^−V

(

⁰^M^|Φ;θ

)

< +∞.

Now we are ready to present our main result concerning the innovations and the residuals processes

Theorem 2 Under the assumptions [Mod] and [H] and under the true model, (i) for P

θ^⋆

−a.e. ϕ, I

Λⁿ

(ϕ; h, θ

^⋆

) and R

Λⁿ

(ϕ; h) converge towards 0, as n → +∞.

(ii) the following convergences hold in distribution, as n → +∞

|Λ

n

|

^1/2

I

Λn

(Φ; h, θ

^⋆

) → N



0, X

i∈B(0,⌈D⌉)

E (I

∆0

(Φ; h, θ

^⋆

)I

∆i

(Φ; h, θ

^⋆

))





|Λ

_n

|

^1/2

R

Λⁿ

(Φ; h) → N



 0, X

i∈B(0,⌈D⌉)

E (W

∆0

(Φ; h, θ

^⋆

)W

∆i

(Φ; h, θ

^⋆

))





where

W

∆i

(ϕ; h, θ

^⋆

) := I

∆i

(ϕ; h, θ

^⋆

) + U

⁽²⁾

(θ

^⋆

)

⁻¹

LPL

⁽¹⁾_∆_i

(ϕ; θ

^⋆

)

^T

Z(h; θ

^⋆

).

∆

i

is the unit square centered at i, LPL

⁽¹⁾_∆_i

(ϕ; θ

^⋆

) is the gradient vector of LP L

∆i

(ϕ; θ) (evaluated in θ

^⋆

), U

⁽²⁾

(θ

^⋆

) and Z(h; θ

^⋆

) are defined for j, k = 1, . . . , p by

(U

⁽²⁾

(θ

^⋆

))

_j,k

= E(v

_j

(0

^M

|Φ)v

_k

(0

^M

|Φ)e

^−V

(

⁰^M^|Φ;θ^⋆

)) Z

j

(h; θ

^⋆

) := E(v

j

(0

^M

|Φ)h(0

^M

|Φ)e

^−V

(

⁰^M^|Φ;θ^⋆

)).

3 Applications to the empty space function

Definition 2 (Empty space function) The empty space function F is the distribution of the distance from the origin (or another fixed point) to the nearest point in Φ, i.e.

F (r) = P (Φ ∩ B(0, r) 6= ∅)

(7)

It is known that the F function characterizes better the distribution of the spatial point process rather than one type of residuals. What is interesting in the approach developped in the previous section is that by taking the function

h

r

(x

^m

|ϕ, θ) := 1

[0,r]

(d(x

^m

, ϕ)e

^V^(x^m^|ϕ;θ)

= ˜ h(r)

then the residuals R

Λn

(ϕ; ˜ h(r)) corresponds to a different of two estimations of the empty space function at distance r fixed (the first one is non parametric (since it equals to

|Λ

_n

|

⁻¹

R

Λ×▼

1

_[0,r]

(d(x

^m

, ϕ)µ(dx

^m

))) while the second one is parametric).

Corollary 1 Under the assumptions of Theorem 2 then for all (r

₁

, . . . , r

_d

)

^T

∈ (]0, +∞[)

^d

, we have the following convergence in distribution as n → +∞

|Λ

n

|

^1/2

R

Λn

(Φ; ˜ h(r

1

)), . . . , R

Λn

(Φ; ˜ h(r

d

))

T

→ ( G (r

1

; θ

^⋆

), . . . , G (r

d

; θ

^⋆

))

^T

, where G (·; θ

^⋆

) is a Gaussian process with covariance structure given, for r, s > 0, by

Cov ( G (r; θ

^⋆

), G (s; θ

^⋆

)) = X

i∈B(0,⌈D⌉)

E (W

_∆0

(Φ; h(r), θ

^⋆

)W

_∆i

(Φ; h(s), θ

^⋆

))

Hence, one can construct an asymptotic test based (for example) on the following result:

choose r

1

, . . . , r

d

> 0 then the following convergence in distribution holds as n → +∞

|Λ

_n

| X

d

i=1

R

_Λn

(Φ; ˜ h(r

_i

))

²

→ X

d

i=1

G (r

_i

; θ

^⋆

)

²

.

Note that in the different previous results, it is possible to tabulate the different distri- butions via Monte-Carlo experiments. A perspective of this work is to investigate the previous results in a simulation study and in practice. As a theoretical point of view, a perspective could be to obtain a weak convergence instead a finite-dimensional one.

References _:

[1] A. Baddeley, R. Turner, J. Mller and M. Hazelton (2005) Residual analysis for spatial point processes, Journal of the Royal Statistical Society, 67(3), 617–666.

[2] A. Baddeley, J. Mller and A.G. Pakes (2008) Properties of residuals for spatial point processes, Annals of the Institute of Statistical Mathematics, 60(7), 627–649

[3] E. Bertin, J.-M. Billiot and R. Drouilhet (1999) k-nearest neighbour Gibbs point processes, Markov processes and related fields, 5(2), 219–234

[4] J.-M. Billiot, J.-F. Coeurjolly and R. Drouilhet (2008) Maximum pseudolikelihood esti-

mator for exponential family models of marked Gibbs point processes, Electronic Journal

of Statistics, 2, 234–264.