HAL Id: inria-00386660
https://hal.inria.fr/inria-00386660
Submitted on 22 May 2009
Asymptotic properties of the residuals process for stationary marked Gibbs point processes
Jean-François Coeurjolly
To cite this version:
Jean-François Coeurjolly. Asymptotic properties of the residuals process for stationary marked Gibbs
point processes. 41èmes Journées de Statistique, Société Française de Statistique, May 2009, Bordeaux,
France. ⟨inria-00386660⟩
Asymptotic properties of the residuals process for stationary marked Gibbs point processes
J.-F. Coeurjolly
Sagag Team, Laboratory Jean Kuntzmann/ CNRS, BSHM, 1251 Av. Centrale, Grenoble Cedex 09 (France)
Résumé
Dans le contexte des processus ponctuels spatiaux, la notion de résidus a été proposée très récemment par Baddeley et al. (2005). Elle fournit un outil de diagnostic très riche pour juger de la qualité d’ajustement d’un modèle paramétrique de processus ponctuel. Ces résidus sont une extension de ceux définis en dimension un pour des processus de comptage et sont basés sur des versions empiriques des deux termes apparaissant dans la célèbre formule de Georgii-Nguyen-Zessin. Cet exposé présentera les propriétés asymptotiques du processus des résidus pour des processus ponctuels marqués de Gibbs stationnaires. En particulier, nous obtenons la consistance et la normalité asymptotique pour une large classe de résidus, incluant notamment ceux définis dans Baddeley et al. (2005) et partiellement étudiés dans Baddeley et al. (2008) (résidus bruts, résidus inversés, résidus de Pearson). Nous généralisons ce résultat pour proposer un test d’adéquation basé sur des estimations fonctionnelles de la fonction d’espace vide qui caractérise mieux la distribution d’un processus ponctuel spatial qu’un seul type de résidus.
Abstract
In the context of spatial point processes, the notion of residuals was proposed recently by Baddeley et al. (2005). It provides a rich diagnostic tool for assessing the goodness of fit of a parametric spatial point process model. These residuals extend the ones available in one dimension for counting processes and are based on empirical versions of both terms appearing in the well-known Georgii-Nguyen-Zessin formula. This talk focuses on asymptotic properties of the residuals process for stationary marked Gibbs point processes. In particular, consistency and asymptotic normality are obtained for a wide class of residuals, including the classical ones proposed by Baddeley et al. (2005) and partially studied by Baddeley et al. (2008) (raw residuals, inverse residuals, Pearson residuals). We also generalize this result to define a goodness-of-fit test for these processes based on functional estimates of the empty space function, which characterizes the distribution of a spatial point process better than a single type of residuals.
1 Background, examples and definitions
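Let us start with some general notation on point processes. The data consist of a countable subset $\varphi = \{x_1^{m_1}, \ldots, x_n^{m_n}\}$ (where $n$ is not fixed), called a configuration of marked points (the $x_i \in \mathbb{R}^2$ (for the sake of simplicity) are locations and the $m_i \in \mathbb{M}$ are marks). The state space will be denoted by $\mathbb{S} = \mathbb{R}^2 \times \mathbb{M}$. Let $\lambda^2$ denote the Lebesgue measure on $\mathbb{R}^2$, $\lambda^{\mathbf{m}}$ the probability mark measure, and let $\mu := \lambda^2 \otimes \lambda^{\mathbf{m}}$. Finally, for $\Lambda \subset \mathbb{R}^2$, denote by $\varphi_\Lambda := \varphi \cap (\Lambda \times \mathbb{M})$ the realization of the variable $\Phi_\Lambda$.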
The (marked) Poisson point process constitutes the reference model for spatial point processes and is related to the notion of independence. The realization of a Gibbs point process is the realization of some variable with conditional density (conditionally on some fixed outside configuration $\varphi^o$) with respect to the Poisson process expressed as
$$f_{\Phi_\Lambda}^{\Phi_{\Lambda^c} = \varphi^o_{\Lambda^c}}(\varphi_\Lambda) \propto \exp\left(-V(\varphi_\Lambda \,|\, \varphi^o_{\Lambda^c})\right)$$
where $V(\varphi_\Lambda \,|\, \varphi^o_{\Lambda^c}) := V(\varphi_\Lambda \cup \varphi^o_{\Lambda^c}) - V(\varphi^o_{\Lambda^c})$ measures the energy needed to insert the configuration $\varphi_\Lambda$ into $\varphi^o_{\Lambda^c}$. In the rest of the paper, the framework is restricted to stationary marked Gibbs point processes based on an energy function $V(\varphi; \theta)$, invariant by translation and parametrized in an exponential form by some $\theta \in \Theta$, where $\Theta$ is some compact set of $\mathbb{R}^p$ ($p \geq 1$), i.e.
$$V(\varphi; \theta) := \theta^T v(\varphi), \quad \text{where } v(\varphi) = (v_1(\varphi), \ldots, v_p(\varphi)).$$
Models may also be defined through the local energy needed to insert a marked point $x^m$ into the configuration $\varphi$:
$$V(x^m \,|\, \varphi; \theta) := V(\varphi \cup \{x^m\}; \theta) - V(\varphi; \theta) = \theta^T v(x^m \,|\, \varphi).$$
Let us cite some well-known examples:
• Multi-type Poisson point process: $\mathbb{M} = \{1, \ldots, M\}$,
$$V(\varphi; \theta) = \sum_{m=1}^{M} \theta_m |\varphi^m|, \qquad V(x^m \,|\, \varphi; \theta) = \theta_m.$$
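• Multi-type Strauss point process: $\mathbb{M} = \{1, \ldots, M\}$,
$$V(\varphi; \theta) = \sum_{m=1}^{M} \theta_m^1 |\varphi^m| + \sum_{m=1}^{M} \theta_m^2 \sum_{\{x^m, y^m\} \in \mathcal{P}_2(\varphi)} 1_{[D_m^1, D_m^2]}(\|y - x\|).$$
Alternatively, in terms of the local energy of a point with mark $m$,
$$V(x^m \,|\, \varphi; \theta) = \theta_m^1 + \theta_m^2 \sum_{y^m \in \varphi} 1_{[D_m^1, D_m^2]}(\|y - x\|).$$
This process is defined for $\theta_m^2 \geq 0$ and $D_m^1 = 0$ (inhibition assumption) or when $\theta_m^2 \in \mathbb{R}$ and $D_m^1 = \delta > 0$ (hard-core assumption).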
• Area interaction point process: $\mathbb{M} = \{0\}$, $R > 0$,
$$V(\varphi; \theta) = \theta_1 |\varphi| + \theta_2 \left| \cup_{x \in \varphi} B(x, R) \right|,$$
where $B(x, R)$ is the ball centered at $x$ with radius $R$.
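To make the link between the total energy and the local energy concrete, here is a minimal Python sketch for a single-type Strauss model, assuming $D^1 = 0$ and writing `d_max` for $D^2$. The function names and the toy configuration are illustrative assumptions, not part of the paper. The sketch checks numerically that the local energy computed directly agrees with the difference $V(\varphi \cup \{x\}; \theta) - V(\varphi; \theta)$.

```python
import numpy as np

def strauss_energy(points, theta1, theta2, d_max):
    """Total energy: theta1 * |phi| + theta2 * #{unordered pairs at distance <= d_max}."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    n = len(pts)
    if n < 2:
        return theta1 * n
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    upper = np.triu_indices(n, k=1)  # each unordered pair counted once
    return theta1 * n + theta2 * np.count_nonzero(dist[upper] <= d_max)

def strauss_local_energy(x, points, theta1, theta2, d_max):
    """Local energy: theta1 + theta2 * #{y in phi : ||y - x|| <= d_max}."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    if len(pts) == 0:
        return theta1
    dist = np.linalg.norm(pts - np.asarray(x, dtype=float), axis=1)
    return theta1 + theta2 * np.count_nonzero(dist <= d_max)

# Toy configuration (hypothetical values chosen for illustration only).
phi = np.array([[0.0, 0.0], [0.3, 0.0], [2.0, 2.0]])
x = np.array([0.1, 0.1])
theta1, theta2, d_max = -1.0, 0.5, 0.5

by_difference = (strauss_energy(np.vstack([phi, x]), theta1, theta2, d_max)
                 - strauss_energy(phi, theta1, theta2, d_max))
direct = strauss_local_energy(x, phi, theta1, theta2, d_max)
print(by_difference, direct)
```

Working with the local energy avoids recomputing all pairwise interactions each time a point is inserted, which is why the residuals and the pseudo-likelihood are expressed in terms of it.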
Concerning such models, the first statistical challenge is to estimate the true value of the parameter vector, $\theta^\star$ (assumed to be in the interior $\mathring{\Theta}$ of $\Theta$). This may be done by maximizing the likelihood (MLE) or the pseudo-likelihood (MPLE). The latter is an excellent alternative to the MLE since it avoids the computation of the normalizing constant, which is particularly costly (see the next section for a precise definition and Billiot, Coeurjolly and Drouilhet (2008) for asymptotic results). Once the parameter vector of interest has been estimated, the second challenge is to quantify the goodness of fit of the model to the data. Baddeley et al. (2005) defined a large class of such measures based on the well-known Georgii-Nguyen-Zessin formula, expressed here for stationary marked Gibbs point processes.
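Theorem 1 (Georgii-Nguyen-Zessin Formula) For any function $h(\cdot, \theta) : \Omega_f \to \mathbb{R}$ such that the following quantities are finite,
$$E\left( h(0^M \,|\, \Phi; \theta^\star)\, e^{-V(0^M | \Phi; \theta^\star)} \right) = E\left( h(0^M \,|\, \Phi \setminus 0^M; \theta^\star) \right) \qquad (1)$$
where for any $\theta \in \Theta$ and $(m, \varphi) \in \mathbb{M} \times \Omega_f$, $h(x^m \,|\, \varphi; \theta) := h(\varphi \cup x^m; \theta) - h(\varphi; \theta)$, and where $M \sim \lambda^{\mathbf{m}}$.
Now define the innovations and the residuals process based on empirical versions of both terms appearing in (1).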
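Definition 1 For any bounded domain $\Lambda$, let us define
• Innovations process:
$$I_\Lambda(\varphi; h, \theta^\star) := \frac{1}{|\Lambda|} \int_{\Lambda \times \mathbb{M}} h(x^m \,|\, \varphi; \theta^\star)\, e^{-V(x^m | \varphi; \theta^\star)} \mu(dx^m) - \frac{1}{|\Lambda|} \sum_{x^m \in \varphi_\Lambda} h(x^m \,|\, \varphi \setminus x^m; \theta^\star),$$
• Residuals process: let $\hat{\theta}_n(\varphi)$ be an estimate of $\theta^\star$ based on $\varphi_\Lambda$,
$$R_\Lambda(\varphi; h) := \frac{1}{|\Lambda|} \int_{\Lambda \times \mathbb{M}} h(x^m \,|\, \varphi; \hat{\theta}_n(\varphi))\, e^{-V(x^m | \varphi; \hat{\theta}_n(\varphi))} \mu(dx^m) - \frac{1}{|\Lambda|} \sum_{x^m \in \varphi_\Lambda} h(x^m \,|\, \varphi \setminus x^m; \hat{\theta}_n(\varphi)).$$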
Clearly, the latter notion is the more interesting one from a practical point of view. Let us describe the main examples considered by Baddeley et al. (2005) (in the context of stationary point processes).
1. $h(x^m \,|\, \varphi, \theta) = 1$: raw innovations and residuals.
2. $h(x^m \,|\, \varphi, \theta) = e^{V(x^m | \varphi; \theta)}$: inverse innovations and residuals.
3. $h(x^m \,|\, \varphi, \theta) = e^{V(x^m | \varphi; \theta)/2}$: Pearson innovations and residuals.
In particular, one may note that the raw residual is a difference of two estimates of the intensity of the point process: the first one is parametric and depends on the model, while the second one is nonparametric (since it equals $|\varphi_\Lambda| / |\Lambda|$).
2 Asymptotic results of the residuals process
In order to present asymptotic results, let us first consider general assumptions, denoted by [Mod], on our models:
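1. [Mod:S] Stability of the local energy: $\exists K \geq 0$, $\forall (m, \varphi) \in \mathbb{M} \times \Omega_f$, $V(0^m \,|\, \varphi; \theta) \geq -K$.
2. [Mod:L] Locality of the local energy: $\exists D \geq 0$, $\forall (m, \varphi) \in \mathbb{M} \times \Omega_f$, $V(0^m \,|\, \varphi; \theta) = V(0^m \,|\, \varphi_{B(0,D)}; \theta)$.
3. [Mod:I] Integrability: $\exists \kappa_i^{(\sup)} \geq 0$, $k_i \in \mathbb{N}$, $\forall (m, \varphi) \in \mathbb{M} \times \Omega_f$, $v_i(0^m \,|\, \varphi) \leq \kappa_i^{(\sup)} |\varphi_{B(0,D)}|^{k_i}$, $i = 1, \ldots, p$.
4. [Mod:Id] Identifiability condition: for all $\theta \in \Theta \setminus \theta^\star$, $P\left( V(0^M \,|\, \Phi; \theta) \neq V(0^M \,|\, \Phi; \theta^\star) \right) > 0$.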
The assumptions [Mod:S] and [Mod:L] are quite well known in the context of Gibbs point processes. In particular, they ensure the existence of stationary measures. These assumptions cover a very large class of known parametric models of marked point processes: the Poisson process, the multi-type Poisson process, the multi-type Strauss marked point process, the Strauss-disc type process, Geyer's triplet point process, the area interaction point process, etc. They also cover many models where the interaction is no longer measured on the complete graph ($\mathcal{P}_2(\varphi)$) but on a more structured one, such as the k-nearest neighbour graph or the Delaunay graph (e.g. [3]). Some of these models are known as Ord's models, where the energy depends on the area of Voronoï cells.
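In the rest of the paper, the data are assumed to be observed in the domain $\Lambda_n \oplus D^{\vee}$ (where $D^{\vee} \geq D$), with $\Lambda_n = [-n, n]^2$ (for the sake of simplicity). Moreover, the estimate appearing in the definition of the residuals will be the maximum pseudo-likelihood estimate, i.e.
$$\hat{\theta}_n(\varphi) = \hat{\theta}_n^{MPLE}(\varphi) := \operatorname{argmax}_{\theta \in \Theta} LPL_{\Lambda_n}(\varphi; \theta)$$
where the log-pseudo-likelihood on the domain $\Lambda_n$, $LPL_{\Lambda_n}(\varphi; \theta)$, is defined by
$$LPL_\Lambda(\varphi; \theta) = - \int_{\Lambda \times \mathbb{M}} e^{-V(x^m | \varphi; \theta)} \mu(dx^m) - \sum_{x^m \in \varphi_\Lambda} V(x^m \,|\, \varphi \setminus x^m; \theta),$$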
see Billiot, Coeurjolly and Drouilhet (2008) for more details and asymptotic results (consistency and asymptotic normality). Finally, consider the following assumptions, denoted by [H], on the analysing function $h$:
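1. $h(\cdot; \theta)$ shares the structural properties of $V(\cdot; \theta)$ (invariance by translation, locality).
2. Assume that for all $(m, \varphi) \in \mathbb{M} \times \Omega_f$, $h(0^m \,|\, \varphi; \cdot)$ is differentiable for $\theta$ in a neighbourhood $\mathcal{V}(\theta^\star)$ of $\theta^\star$, and assume that for $j = 1, \ldots, p$ and for all $\theta \in \mathcal{V}(\theta^\star)$
$$E\left( |h(0^M \,|\, \Phi; \theta)|^3\, e^{-V(0^M | \Phi; \theta)} \right) < +\infty \quad \text{and} \quad E\left( \left| \frac{\partial h}{\partial \theta_j}(0^M \,|\, \Phi; \theta) \right| e^{-V(0^M | \Phi; \theta)} \right) < +\infty.$$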
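Before turning to the main result, the definition of the maximum pseudo-likelihood estimate can be checked by hand in the simplest case. The following worked example is added for illustration and is not a result stated in the paper. It assumes the unmarked homogeneous Poisson model, $V(x \,|\, \varphi; \theta) = \theta$ with $p = 1$, a non-empty configuration $\varphi_\Lambda$, and a maximizer lying in the interior of $\Theta$. The log-pseudo-likelihood is then explicit and strictly concave in $\theta$:
$$LPL_\Lambda(\varphi; \theta) = -|\Lambda|\, e^{-\theta} - |\varphi_\Lambda|\, \theta, \qquad \frac{\partial}{\partial \theta} LPL_\Lambda(\varphi; \theta) = |\Lambda|\, e^{-\theta} - |\varphi_\Lambda|.$$
Setting the derivative to zero gives
$$\hat{\theta} = -\log\left( \frac{|\varphi_\Lambda|}{|\Lambda|} \right),$$
so that the raw residual ($h = 1$) computed on the same domain is
$$R_\Lambda(\varphi; 1) = e^{-\hat{\theta}} - \frac{|\varphi_\Lambda|}{|\Lambda|} = 0.$$
In this special case the parametric intensity estimate $e^{-\hat{\theta}}$ coincides exactly with the nonparametric one, so the raw residual carries no information; it becomes informative once the model includes an interaction term.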
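Now we are ready to present our main result concerning the innovations and the residuals processes.
Theorem 2 Under the assumptions [Mod] and [H] and under the true model,
(i) for $P_{\theta^\star}$-a.e. $\varphi$, $I_{\Lambda_n}(\varphi; h, \theta^\star)$ and $R_{\Lambda_n}(\varphi; h)$ converge towards 0, as $n \to +\infty$.
(ii) the following convergences hold in distribution, as $n \to +\infty$:
$$|\Lambda_n|^{1/2}\, I_{\Lambda_n}(\Phi; h, \theta^\star) \to \mathcal{N}\left( 0, \sum_{i \in B(0, \lceil D \rceil)} E\left( I_{\Delta_0}(\Phi; h, \theta^\star)\, I_{\Delta_i}(\Phi; h, \theta^\star) \right) \right)$$
$$|\Lambda_n|^{1/2}\, R_{\Lambda_n}(\Phi; h) \to \mathcal{N}\left( 0, \sum_{i \in B(0, \lceil D \rceil)} E\left( W_{\Delta_0}(\Phi; h, \theta^\star)\, W_{\Delta_i}(\Phi; h, \theta^\star) \right) \right)$$
where
$$W_{\Delta_i}(\varphi; h, \theta^\star) := I_{\Delta_i}(\varphi; h, \theta^\star) + \left( U^{(2)}(\theta^\star)^{-1}\, LPL^{(1)}_{\Delta_i}(\varphi; \theta^\star) \right)^T Z(h; \theta^\star).$$
Here $\Delta_i$ is the unit square centered at $i$, $LPL^{(1)}_{\Delta_i}(\varphi; \theta^\star)$ is the gradient vector of $LPL_{\Delta_i}(\varphi; \theta)$ (evaluated at $\theta^\star$), and $U^{(2)}(\theta^\star)$ and $Z(h; \theta^\star)$ are defined for $j, k = 1, \ldots, p$ by
$$\left( U^{(2)}(\theta^\star) \right)_{j,k} = E\left( v_j(0^M \,|\, \Phi)\, v_k(0^M \,|\, \Phi)\, e^{-V(0^M | \Phi; \theta^\star)} \right), \qquad Z_j(h; \theta^\star) := E\left( v_j(0^M \,|\, \Phi)\, h(0^M \,|\, \Phi)\, e^{-V(0^M | \Phi; \theta^\star)} \right).$$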
3 Applications to the empty space function
Definition 2 (Empty space function) The empty space function $F$ is the distribution function of the distance from the origin (or another fixed point) to the nearest point in $\Phi$, i.e.
$$F(r) = P\left( \Phi \cap B(0, r) \neq \emptyset \right).$$
It is known that the $F$ function characterizes the distribution of a spatial point process better than a single type of residuals. What is interesting in the approach developed in the previous section is that, by taking the function
$$h_r(x^m \,|\, \varphi, \theta) := 1_{[0,r]}\left( d(x^m, \varphi) \right) e^{V(x^m | \varphi; \theta)} = \tilde{h}(r),$$
the residual $R_{\Lambda_n}(\varphi; \tilde{h}(r))$ corresponds to a difference of two estimates of the empty space function at a fixed distance $r$: the first one is nonparametric (since it equals $|\Lambda_n|^{-1} \int_{\Lambda_n \times \mathbb{M}} 1_{[0,r]}(d(x^m, \varphi))\, \mu(dx^m)$), while the second one is parametric.
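The two estimates of $F(r)$ can be sketched numerically as follows. This is an illustrative sketch under stated assumptions, not the author's code: the process is unmarked, the window is the square $[-w, w]^2$, the integral is approximated on a midpoint grid, edge effects are ignored, and the fitted local energies $V(x \,|\, \varphi \setminus x; \hat{\theta})$ at the data points are passed in as a hypothetical array `energies` computed beforehand from whichever model was fitted.

```python
import numpy as np

def empty_space_estimates(points, half_width, r, energies, grid=100):
    """Return (nonparametric, parametric) estimates of F(r) on [-half_width, half_width]^2."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    area = (2.0 * half_width) ** 2
    if len(pts) == 0:
        return 0.0, 0.0

    # Nonparametric term: fraction of the window within distance r of the configuration.
    step = 2.0 * half_width / grid
    centers = -half_width + step * (np.arange(grid) + 0.5)
    gx, gy = np.meshgrid(centers, centers)
    u = np.column_stack([gx.ravel(), gy.ravel()])
    d_grid = np.linalg.norm(u[:, None, :] - pts[None, :, :], axis=-1).min(axis=1)
    f_nonparam = float(np.mean(d_grid <= r))

    # Parametric term: sum over data points of 1[d(x, phi without x) <= r] * exp(V).
    d_pts = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(d_pts, np.inf)          # exclude each point from its own neighbours
    nearest = d_pts.min(axis=1)
    weights = np.exp(np.asarray(energies, dtype=float))
    f_param = float(np.sum((nearest <= r) * weights) / area)
    return f_nonparam, f_param

# Toy check with a single point at the origin: the covered region is a disc of radius r.
f_np, f_p = empty_space_estimates([[0.0, 0.0]], half_width=1.0, r=0.5, energies=[0.0])
print(f_np, f_p)
```

The residual $R_{\Lambda_n}(\varphi; \tilde{h}(r))$ is then the first value minus the second; evaluating it at several distances $r_1, \ldots, r_d$ gives the vector to which the corollary below applies.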
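Corollary 1 Under the assumptions of Theorem 2, for all $(r_1, \ldots, r_d)^T \in (]0, +\infty[)^d$, we have the following convergence in distribution as $n \to +\infty$:
$$|\Lambda_n|^{1/2} \left( R_{\Lambda_n}(\Phi; \tilde{h}(r_1)), \ldots, R_{\Lambda_n}(\Phi; \tilde{h}(r_d)) \right)^T \to \left( \mathcal{G}(r_1; \theta^\star), \ldots, \mathcal{G}(r_d; \theta^\star) \right)^T,$$
where $\mathcal{G}(\cdot; \theta^\star)$ is a Gaussian process with covariance structure given, for $r, s > 0$, by
$$\mathrm{Cov}\left( \mathcal{G}(r; \theta^\star), \mathcal{G}(s; \theta^\star) \right) = \sum_{i \in B(0, \lceil D \rceil)} E\left( W_{\Delta_0}(\Phi; \tilde{h}(r), \theta^\star)\, W_{\Delta_i}(\Phi; \tilde{h}(s), \theta^\star) \right).$$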
Hence, one can construct an asymptotic test based (for example) on the following result: choose $r_1, \ldots, r_d > 0$; then the following convergence in distribution holds as $n \to +\infty$
|Λ
n| X
di=1
R
Λn(Φ; ˜ h(r
i))
2→ X
di=1