HAL Id: hal-01655366
https://hal.inria.fr/hal-01655366
Submitted on 4 Dec 2017
HAL is a multi-disciplinary open access archive for the deposit and dissemination of
sci-L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents
Topologically constrained template estimation via
Morse-Smale complexes controls its statistical
consistency
Nina Miolane, Susan Holmes, Xavier Pennec
To cite this version:
Nina Miolane, Susan Holmes, Xavier Pennec. Topologically constrained template estimation via Morse-Smale complexes controls its statistical consistency. SIAM Journal on Applied Al-gebra and Geometry, Society for Industrial and Applied Mathematics 2018, 2 (2), pp.348-375. �10.1137/17M1129222�. �hal-01655366�
controls its statistical consistency 2 ***** 3 4 Abstract. 5
In most neuroimaging studies, one builds a brain template that serves as a reference anatomy for normalizing
6
the measurements of each individual subject into a common space. Such a template should be representative
7
of the population under study, in order to avoid biasing the subsequent statistical analyses. The template is
8
often computed by iteratively registering all images to the current template and then averaging the intensities
9
of the registered images. Geometrically, the procedure can be summarized as follows: it is the computation of
10
the template as the “Fr´echet mean” of the images projected in a quotient space. It has been argued recently
11
that this type of algorithm could actually be asymptotically biased, therefore inconsistent. In other words,
12
even with an infinite number of brain images in the database, the template estimate may not converge to the
13
brain anatomy it is meant to estimate. Our paper investigates this phenomenon. We present a methodology
14
that quantifies spatially the brain template’s asymptotic bias. We identify the main variables controlling the
15
inconsistency. This leads us to investigate the topology of the template’s intensity levels sets, represented by
16
its Morse-Smale complex. We propose a topologically constrained adaptation of the template computation,
17
that constructs a hierarchical template with bounded bias. We apply our method to the analysis of a brain
18
template of 136 T1 weighted MR images from the Open Access Series of Imaging Studies (OASIS) database.
19
Introduction. In neuroimaging, as well as in many other medical image analysis domains, 20
a template is an image representing a reference anatomy. A template is computed from a 21
database of brain images to serve as the brain image “prototype” for further analyses. 22
Computation of a brain template. Various methods exist to compute a brain template [15]. 23
A first practice selects one brain image from the database as the template. If the selected 24
subject’s anatomy is far from the population mean anatomy, the template is necessarily bi-25
ased towards this specific individual. Thus, the template fails at being a prototype of the 26
population. This is why researchers consider the computation of an “unbiased template” that 27
represents the mean anatomy better. 28
Such an “unbiased” template is often constructed by performing an iterative averaging of 29
intensities and deformations [17, 24,20]. One initializes with a template chosen among the 30
subject images. Then, during each iteration, one registers the subjects to the current template, 31
and computes the mean deformation. The new template is computed as the mean intensity 32
of the subjects’ images, deformed with the mean deformation. This procedure does not favor 33
any subject’s image if it does not end in a local minimum. In this sense, the procedure is 34
called “unbiased”. 35
The computed brain template may look blurred or sharp depending on the design chosen 36
for the registration in the above iterative procedure. If the algorithm is designed using linear 37
registration, the template may appear blurred. In contrast, if one uses diffeomorphic regis-38
tration, the template is more likely to look sharp and the sharpness depends on the amount 39
of regularization used [15]. 40
Purpose and desirable properties of the brain template. Computing a template is often the 41
first step in medical image processing because of its many applications. The template is used as 42
a standardized 3D coordinate frame where the subject brains can be compared. The subjects 43
are then characterized by their spatial diffeomorphic deformations from the template. These 44
deformations may serve for a statistical analysis of the subject shapes [6]. One studies the 45
normal and pathological variations of the subjects with respect to the template. The deforma-46
tions also facilitate automated segmentation, by mapping the template’s already segmented 47
regions into each subject space. 48
What are the desirable properties of the brain template, with respect to the applications 49
mentioned above? First, the template should be representative of the population, removing 50
any bias toward a specific subject during the analysis [10,5,7]. Second, the template should 51
be sharply defined, so that subtle anatomical structures can be easily observed or segmented. 52
The brain template, an inconsistent estimator of the unique brain anatomy. However, the two 53
desirable properties cannot be fulfilled simultaneously: there exists a trade-off, akin to the 54
standard bias-variance trade-off in statistical learning. 55
First, this trade-off can be understood intuitively. Consider a database of brain images 56
divided into two groups that have different topologies. The first group has subjects with three 57
sulci - i.e. depressions or grooves in the cerebral cortex - in a specified brain region. The 58
second group has subjects with only two sulci in the same region. A sharply defined template 59
has to decide on a specific topology in this brain region, i.e. whether it shows two or three 60
sulci. Therefore, it might not estimate correctly the brain anatomy of this population, which 61
might be problematic for the following applications in neuroimaging. For example during 62
a statistical analysis, registering the subjects with the three sulci topology to a template 63
that has chosen a two sulci topology might not be reasonable [7]. A sharp template is only 64
meaningful if the anatomical structures shown are representative of the whole population, i.e. 65
if the population is homegeneous. 66
Second, the trade-off has been emphasized in recent studies investigating the bias of the 67
template as an estimator of the population’s shared anatomy. In the classical approach [2], an 68
initial assumption states that there is a unique (brain) anatomy shared by the population. The 69
subjects are then modeled through a generative model as random deformations of the unique 70
brain anatomy, observed with additional noise. The unique brain anatomy is a parameter of 71
this model. The template computation is interpreted as its estimation. One can ask about 72
its asymptotic bias: does the template converge to the unique brain anatomy for a database 73
with an infinite number of images? 74
This question has been investigated for signals, i.e. 1D images. Some authors prove the 75
asymptotic unbiasedness of the template under the simplifying assumption of no measurement 76
error on the observed signals [27]. Other authors have already provided examples of asymptotic 77
bias, and therefore inconsistency, when there is measurement error [2]. Their experiments 78
show that the template may converge to pure noise when the measurement error on simulated 79
signals increases. A bias is shown to occur in [8] for curves estimated from a finite number of 80
points in the presence of noise. 81
Recently, an asymptotic bias has been shown in the setting of Lie group actions [35,34]. 82
Our argument, shown in an abstract geometric context in [35,34] but adapted here to brain 83
images, is as follows. We look at the subspace defined by all brains with the same shape 84
as the unique brain anatomy. We show that the curvature of this space, at the scale of the 85
measurement noise, introduces a bias on the brain template. 86
Using topology to investigate the brain template’s asymptotic bias. We want to link: (i) 87
the fact that a population with two groups of different brain topologies cannot be accurately 88
represented by a sharp template, with (ii) the mathematical results on the template’s bias 89
as an estimate of the anatomy shared by the population. The framework of [35] is based on 90
the quotient of the space of the data space by the action of a Lie group. The data in our 91
case are the brain images, and the Lie group action is the action of diffeomorphisms on these 92
images. Quotienting the images by the action of diffeomorphisms amounts to filtering out 93
any information that is invariant by diffeomorphic deformations. Thus, the quotient gives the 94
topology of the images’ level sets. Using the intuition of [35], we could quantify the brain 95
template’s asymptotic bias using a representation of its topology. 96
Quantifying the bias could enable us to decide when and where a sharply defined template 97
makes sense. We could want a sharp template in brain regions where the intersubject anatom-98
ical variability is low and a fuzzier template when this variability is higher. Alternatively, we 99
could consider computing several sharp brain templates using mixtures. This discussion boils 100
down to the question: when is it reasonable to assume that a unique brain anatomy represents 101
the whole subject population? 102
Furthermore, we could think about controlling the brain template’s asymptotic bias by 103
constraining its topology. Topological representations of images have been used with various 104
objectives in the literature. For example, [12] uses a topological representation of brain images 105
for classification of autism versus normals. Topological constraints have also been implemented 106
for segmentation where the reconstruction of the cortical surface needs to match the brain 107
anatomy [30,31,21]. However, topological representation of images or topological constraints 108
on images have not been used to study and enforce a statistical property, like asymptotic 109
unbiasedness. 110
Contributions and Outline. We use a topological representation of images - the Morse-111
Smale complex - to investigate and control the asymptotic bias of the brain template. We 112
make three main contributions in this paper. First, we show how to combine geometry and 113
topology to tackle a statistical problem in neuroimaging. We provide conjectures at the 114
boundaries of the fields with sketches of their proofs. Second, we analyze the template as an 115
estimator of the brain anatomy and quantify the asymptotic bias. This leads us to discuss 116
the initial assumption of a unique anatomy. Third, we present an adaptation of the template 117
computation algorithm that bounds the bias, through topological constraints, at the price of 118
constructing a “smoother” template. 119
Section 1 presents the geometry and the topology of the template computation. We 120
emphasize the variables that describe the bias of the brain template. Section 2 presents 121
the chosen computational representation of these variables through Morse-Smale complices. 122
Section3 leverages the previous computational model to spatially identify the biased regions 123
of the template. We thus propose an adaptation of the template computation with topological 124
constraints bounding the bias. In Section4our methodology is used on the Open Access Series 125
of Imaging Studies (OASIS) database of T1-weighted MR brain images. 126
1. Geometry and topology for template estimation . This paper will not present mathe-127
matical results, rather we show how geometry and topology combine to formalize the template 128
computation algorithm and highlight required directions for further mathematical develop-129
ments. 130
1.1. Geometrization of the action of diffeomorphisms on images.
131
Brain images. We consider two- and three-dimensional images, whose domain Ω ⊂ Rd, with 132
d = 2, 3, is supposed compact. We adopt the point of view of images as square-integrable 133
functions I over the compact domain Ω, i.e. we write I ∈ L2(Ω), where L2(Ω) is a Hilbert
134
space. The corresponding L2 distance is invariant by volume preserving diffeomorphisms
135
(volumorphisms). Additionally, we assume that the images are in C∞(Ω), which is C∞defined 136
on the compact support Ω. We denote Img(Ω) the set of images. 137
To illustrate the following concepts, we use the toy Hilbert space R2 where one point
138
schematically represents one image, see Figure1. 139
Diffeomorphisms. A diffeomorphism of Ω is a differentiable map φ : Ω → Ω which is a 140
bijection whose inverse φ(−1) is also differentiable. We consider two sets of diffeomorphisms.
141
On the one hand, we consider C∞(Ω), i.e. the smooth diffeomorphisms that are the 142
identity outside a compact support. C∞(Ω) can be seen as an infinite dimensional manifold 143
[33] and forms an infinite dimensional Lie group [26]. Its Lie algebra V is the set of smooth 144
vector fields with compact support [26]. We use this set of diffeomorphisms to present algebraic 145
concepts. 146
On the other hand, we consider the set CId(Rd, Rd) defined as CId(Rd, Rd) = {φ = Id +
147
u for u ∈ Cb1(Rd, Rd)}, where the subscript “b” refers to functions that are bounded with 148
bounded derivatives. These diffeomorphisms are “small”, i.e. not too different from the 149
identity. We use this set to formulate mathematical conjectures which need metric properties. 150
If the specification between the two sets is not needed, we will refer to Diff(Ω) to denote 151
diffeomorphisms. 152
Action of diffeomorphisms on (brain) images. The Lie group of diffeomorphisms Diff(Ω) acts 153
on the space of images L2(Ω) [39]: ρ : Diff(Ω) × Img(Ω) → Img(Ω), (φ, I) 7→ φ · I = I ◦ φ−1.
154
This action is represented on Figure 1 (a), which shows an image I and its diffeomorphic 155
deformation. 156
Intuition and schematic representation on R2. The statistical analysis of this paper relies on 157
geometric considerations in the Hilbert space of images L2(Ω) endowed with the action of an
158
Lie group of diffeomorphisms Diff(Ω). The intuition on the geometry in these abstract spaces 159
will be given by a serie of figures, where: 160
L2(Ω): Hilbert space of images is represented as R2: 2D surface of the paper, 161
Diff(Ω): Lie group of diffeomorphisms is represented as SO(2): Lie group of 2D rotations. 162
163
and we consider the fundamental action of SO(2) on R2. 164
Figure1(b) shows this representation. The action of a diffeomorphism φ on I is represented 165
by the blue curved arrow, i.e. by the action of a 2D rotation. The action transforms the image 166
I into another image φ · I, i.e. into a different point in the Hilbert space. 167
We note that L2(Ω) with the action of Diff(Ω) and R2 with the action of SO(2) have
168
different properties. For example, R2 and SO(2) are finite dimensional and the action of
169
SO(2) is isometric with respect to the Euclidean distance on R2. In comparison, L2(Ω) and
170
Diff(Ω) are infinite dimensional and the action of Diff(Ω) is not isometric with respect to the 171
L2 distance between images of L2(Ω).
172
Nevertheless, we argue that our schematic representation makes sense for the following 173
reasons. First, representing infinite dimensional spaces by finite dimensional ones is standard 174
as it is difficult to draw infinite dimensions on a piece of paper. Second, we consider diffeo-175
morphisms that transform a brain image into another brain image, i.e. an image into one that 176
looks similar. Therefore, we restrict ourselves to ”small” diffeomorphisms. In this context, we 177
will see that we can consider their action as being isometric. 178 Action of the diffeomorphism φ (a) (b) I φ · I I φ · I = I ◦ φ(−1) Action of the diffeomorphism φ
Figure 1. Action of a diffeomorphism φ on a brain image I. (a) the brain image before and after the action of φ, (b) schematic representation of the action of φ on the brain image I, represented as a dot in R2.
Orbit OI of a (brain) image I. Here, we consider Diff(Ω) = C∞(Ω). The orbit OI of a
179
brain image I is defined as all images reachable through the action of diffeomorphisms on I: 180
OI = {I0 ∈ Img(Ω)|∃φ ∈ Diff(Ω) s.t. I0◦ φ−1= I}.
181
On Figure 2 (a), images I1 and I2 belong to the same orbit but I3 belongs to a different
182
orbit. We use R2 representing the space of images, with the action of SO(2) representing the 183
action of diffeomorphisms on Figure2(b). The blue dotted circle on Figure 2(b) represents O 184
the orbit of I1 and I2. This orbit defines a submanifold of images: in this toy illustration, the
185
submanifold is the blue dotted circle. The red point on Figure 2(b) represents OI3, the orbit
186
of the image I3. This orbit contains only one point and is a submanifold of dimension 0.
187
We note that the orbits may be infinite dimensional in the case of diffeomorphisms acting 188
on L2(Ω). Furthermore the orbits are not necessarily high-dimensional spheres as the action of
189
the diffeomorphisms is not isometric. However, by considering only “small” diffeomorphisms 190
acting on a given image I, we move locally on the orbit of image I. We could consider writing 191
a Taylor expansion of the orbit around I [34], where the first order gives its tangent space 192
and the second order is a high-dimensional sphere. Therefore, “small” diffeomorphisms are 193
consistent with a representation of the images’ orbits as spheres. 194
Isotropy group GI of a (brain) image I. Here, we consider Diff(Ω) = C∞(Ω). The isotropy
195
group GI of a brain image I is defined as the subgroup of Diff(Ω) formed by the
diffeomor-196
phisms that leave I unchanged: GI = {φ ∈ Diff(Ω)|I ◦ φ−1 = I}. GI describes the intrinsic
197
symmetry of the image I: the more symmetric is I, the larger its isotropy group. All im-198
ages on the same orbit have conjugate isotropy groups. Moreover, the isotropy group (also 199
called the stabilizer) and the orbit of an image are linked by the orbit-stabilizer theorem: 200
OI ∼ Diff(Ω)/GIin finite dimensions. The intuition is that the larger the isotropy group (and
201
thus, the more symmetry the image has), the smaller the orbit. Figure2(a) shows two brain 202
images: the isotropy group of I1 and I2 is larger than the isotropy group of I3, in the sense of
203
inclusion. As a consequence, the orbit of I3 is “smaller” than the orbit of I2.
204
We use again the representation of R2 to illustrate this notion. The notion of “isotropy 205
group” dictates the dimension of a given orbit, i.e. whether we have a 1-dimensional subman-206
ifold like the blue dotted circle or a 0-dimensional submanifold like the red point on Figure2
207
(b). The isotropy group of the image at the blue point on Figure2(b) is the identity. Only the 208
identity leaves this image at the same place. The isotropy group of the image represented by 209
the red point is the whole Lie group of 2D rotations. Any rotation leaves this point invariant. 210
Going back to diffeomorphisms, we note that the isotropy group may be of infinite dimen-211
sion: the isotropy group of a uniform image, i.e. I constant map over Ω, is the whole group 212
Diff(Ω). Nevertheless, the orbit-stabilizer theorem holds in infinite dimensions therefore we 213
can rely on the intuition of “smaller” isotropy groups and “larger” orbits. 214 I1 (a) I2 I3 I2 (b) I1 I3
Figure 2. Orbit and isotropy group of a brain image. (a) I1 and I2 belong to the same orbit: I2 is a diffeomorphic deformations of I1. They have conjugate isotropy groups. In contrast, I3 belong to a different orbit and has a different isotropy group. I3shows more symmetry than I1 and I2, thus a larger isotropy group, whereas the more asymmetric details of I1, I2 are the sign of a smaller isotropy group. (b) The orbit of I1, I2 is represented as the blue dotted circle and the two images I1, I2 are points on this circle. The isotropy group is linked to the dimension of the image’s orbit. I1, I2 have a smaller isotropy group, they have a circle orbit. I3 has a larger isotropy group: its orbit is itself, i.e. the red point at (0, 0).
1.2. From geometry to topology.
215
Topology of (brain) images. The topology of a brain image I is defined as the topology of 216
its level sets, these surfaces of Ω with constant intensity. topology refers to properties that are 217
preserved under smooth deformations [16], i.e. conserved by the action of diffeomorphisms on 218
I: for example, the number of holes, or the number of connected parts, see Figure3(a). 219
Geometry and topology combine as follows. Two images I and I0 that are diffeomorphic 220
deformations of each other, i.e. that are on the same orbit, have the same topology. The orbit 221
OI itself represents the topology of image I (and I0). The set of orbits Q = {OI|I ∈ Img(Ω)},
222
which is the quotient space of Img(Ω) by the action of Diff(Ω) is the set of the topologies. 223
Figure 3(b) shows how the space of images R2 is partitioned into orbits: blue circles and 224
one red “singular circle”, the red point at (0, 0). Figure 3(b) also shows R+, the quotient
225
space of R2 by the group of 2D rotations, which schematically represents the quotient space 226
of the space of brain images Img(Ω) by the Lie group of diffeomorphisms Diff(Ω). Each of the 227
four blue circles in R2 becomes a blue point in the quotient space R +.
228
(b) (a)
Figure 3. (a) Images in the same colomn have same topology, images in the line have different topologies: one cannot be diffeormorphically deformed to match the other. (b) Top: schematic representation of the space of images partitioned into orbits. The two different orbit types are in blue and red respectively. Bottom: schematic representation of the quotient space of brain images Img(Ω) by the Lie group of diffeomorphisms Diff(Ω).
Gathering brain images with similar topologies: orbit types. By definition, two brain images 229
are of the same orbit type if their isotropy groups are conjugate subgroups in the Lie group of 230
diffeomorphisms. In particular, brain images that belong to the same orbit have same orbit 231
type. The type corresponding to the smallest isotropy group - in the sense of the inclusion 232
among the subgroups of the diffeomorphism group - is sometimes called the principal type [1]. 233
We use this appellation in this paper. For example, if an orbit has type the identity of the 234
group of diffeomorphisms, then it is necessarily of principal type. Equivalently, the orbits of 235
principal type are called principal orbits. Other orbits are called singular orbits. 236
The blue circles on Figure3(b) have same orbit type: the images on these orbits have the 237
identity {Id} of the Lie group as isotropy group. They are of principal type since the identity 238
{Id} is the smallest subgroup - in the sense of the inclusion - of the group of rotations. 239
Stratification of the space of topologies. In the space of brain images, we can gather orbits 240
of same orbit type: we gather the blue circles of Figure 3(b) into R2 \ {(0, 0)} on the one 241
hand, and keep the red dot (0, 0) on the other hand. The orbit type itself is a submanifold 242
of the space of brain images: R2\ {(0, 0)} or (0, 0) in the schematic brain images space R2.
243
Furthermore, these orbit type submanifolds form a stratification, meaning they fit together in 244
a particularly nice way [38]. 245
The quotient space Q is also naturally partitioned into manifolds, and this partitioning is 246
also a stratification. All in all, Q is not a manifold, but Q composed of manifold pieces, and 247
those pieces are called strata. There is a partial ordering of the strata in the quotient space, 248
using the inclusion [22]. 249
Figure 3(b) uses the analogy of R2 as the space of images and SO(2) as the Lie group 250
acting on it. It shows the orbits grouped by orbit type: the color blue denotes one orbit type 251
and the color red another orbit type. We see for example that Q = R+ is stratified into one
252
stratum being R∗+ - corresponding to the stratum R2\ {(0, 0)} in the space of brain images
-253
and one stratum being {0} - corresponding to the stratum (0, 0) in the space of brain images. 254
1.3. Geometry of generative model and estimation procedure. In this subsection, we 255
consider: Diff(Ω) = Cb1(Rd, Rd). 256
Generative model. The n brain images I1, ..., In are interpreted with a generative
de-257
formable model: Ii = φi· T + i, i = 1...n, where each image Ii ∈ Img(Ω) is a diffeomorphic
258
deformation φi ∈ Diff(Ω) of a unique brain anatomy T , to which noise i is added. The
pa-259
rameter T represents the brain anatomy shared by the population. The transformations φi’s
260
and the noises i’s are i.i.d. realizations of random variables. The transformations φi’s follow
261
a general law, which could be for example a Gaussian law in a finite-dimensional subspace of 262
the Lie group and the i’s represent Gaussian noise on the space of images. We denote σ2 its
263
variance. Definitions of distributions on finite dimensional Riemannian manifolds are taken 264
from [36] and the Gaussian distributions for infinite dimensional spaces from [28]. 265
The model can be interpreted by a three step generative procedure illustrated schemat-266
ically in Figure 4. First, there is only the shared anatomy T . Second, the template T is 267
deformed with the diffeomorphism φi and gives a brain image φi· T . Third, we represent the
268
measurement noise with i, which gives the observed brain image Ii.
269 T T T σ φi φi· T i φi· T + i (a) (b) (c)
Figure 4. Schematic illustration of the generative model of the brain images data. As before, the space of brain images is represented by the plane R2. (a) First step of the generative model: generate a brain anatomy. One usually assumes that there is a unique brain anatomy: T , in green. (b) Second step of the generative model: generate a deformation φi ∈ Diff(Ω) which is used to deform the template. The brain image φi· I belongs to the orbit of T , represented by the green circle. (c) Third step of the generative model: generate noise iin the space of images. The brain image φi· T + i does not belong to the orbit of T anymore.
Computing the template: an estimation procedure. Computing the brain template amounts 270
to invert the generative model: given the data, we want to estimate the parameter T . The 271
transformations φ’s are hidden variables of the model. The natural statistical procedure to 272
estimate T in this context is the Expectation-Maximization (EM) algorithm [13]. The EM is 273
an iterative procedure that maximizes the log-likelihood of the generative model with hidden 274
variables. As such, the EM gives an asymptotically unbiased and consistent estimation of the 275
brain anatomy T . 276
In practice, one does not use the EM algorithm because it is computationally expensive, 277
especially when dealing with tridimensional images. Most neuroimaging pipelines rely on an 278
approximation of the Expectation-Maximization algorithm to estimate the brain anatomy, 279
called “Fast Approximation with Modes” in [2]. It runs as follows. Initialize the estimate 280
with ˆT = I1, i.e. one of the brain images from the database. Then, iterate the following two
281
steps until convergence [24]: 282
(1) φˆi = argmin φ∈Diff(Ω)
dImg(Ω)( ˆT , φ · Ii) + λReg(φ), ∀i ∈ {1, ..., n},
283 (2) T = argminˆ T ∈Img(Ω) n X i=1 dImg(Ω)(T, ˆφi· Ii)2. 284 285
Step (1) is an estimation ˆφi of the diffeomorphisms φi, and an approximation of the E-step
286
of the EM algorithm. In practice, each brain image Ii is registered to the current template
287
estimate and the ˆφi is the result of this registration. The term Reg is a regularization that
288
ensures that the optimization has a solution. Showing that the estimates ˆφi’s and ˆT exist is
289
technical and beyond the scope of this paper. 290
Step (2) is the M-step of the EM algorithm: the maximization of the surrogate in the 291
M-step amounts to the maximization of the variance of the projected data. This computes 292
the updated template estimate, as the mean intensity of the subjects images Ii, deformed with
293
the mean deformation of the ˆφi’s.
294
The registration step (1) amounts to aligning the n subject images by transporting them 295
onto their orbit (see Figure 5(b)), i.e. projecting them in the quotient space (see Figure 5
296
(c)). Step (2) averages the n registered images (see Figure5 (d)). 297
(a) (b) (c) (d)
Bn?
Figure 5. Geometrization of an iteration of the template’s computation: (a) n subjects images (black squares); (b) the n images are registered, they travel on their orbit (the blue circles) to get aligned; (c) registered images; (d) the empirical brain template ˆT (in yellow) is computed as the Fr´echet mean of the n registered images. How far is it from the unique anatomy T of the generative model (in green): can we quantify Bn?
Evaluation of the procedure: definition of asymptotic bias B∞. We evaluate the template
298 ˆ
T as an estimator of the unique brain anatomy T (see Figure 5(d)) given n observations Ii,
299
i = 1...n. We note that in other papers, T may be called the template directly and ˆT the 300
template’s estimate. 301
We consider two measures of the accuracy of this estimator: its variance Vn2 and bias Bn,
302
which are defined as: 303
Vn2= E(( ˆT − E( ˆT ))2) and Bn= E( ˆT − T ).
304 305
The bias for n images Bn is illustrated on Figure 5. We are interested in the asymptotic
306
behavior n → +∞, i.e. when the number of brain images goes to infinity. One would expect 307
that the procedure converges to the brain anatomy T it is designed to estimate. When the 308
estimator converges, its variance is asymptotically zero: V∞2 = 0.
309
1.4. Geometry of the template estimator’s evaluation.
310
Estimation procedure interpreted as the Fr´echet mean in the quotient space. We consider 311
the estimation procedure of Equation1.3. First, we note that the regularization term in step 312
(1) forces the diffeomorphisms to be small, i.e. to be close to the identity and close to have 313
an isometric action on the images. Second, we could consider the group of volumorphisms, 314
the subgroup of diffeomorphisms that leave a volume form V invariant and is used to model 315
incompressible fluids in continuum mechanics. In this case, using y = φ−1(x) and the fact 316
that the volume form is conserved: 317 (1) d(φ·I1, φ·I2)2 = Z Ω (I1◦φ−1(x)−I2◦φ−1(x))2dV (x) = Z Ω
(I1(y)−I2(y))2dV (y) = d(I1, I2)2
318
and the action is thus isometric. 319
In both of the above two frameworks, it is thus reasonable to trade the regularization term 320
for the assumption that the action is isometric in our modelization. 321
First, we model the estimation procedure as follows. 322 (1) φˆi= argmin φ∈G dM( ˆT , φ · Ii), ∀i ∈ {1, ..., n}, 323 (2) T = argminˆ T ∈M n X i=1 dM(T, ˆφi· Ii)2. 324 325
where M is a generic Riemannian manifold and G a Lie group acting on M isometrically. 326
This converges to a local minimum because it decreases at each step a cost bounded below 327
by zero. The estimator computed with this procedure is: 328 (2) T = argminˆ T ∈M n X i=1 min φ∈Gd 2 M(T, φ · Ii). 329
The term min
φ∈Gd 2
M(T, φ · Ii) is the distance in the quotient space between T and Ii. Thus
330
Equation 2defines the Fr´echet mean on the quotient space [36]. 331
The study of the set of solutions of algorithms in Equation 1.3 and Equation 1.4 is an 332
interesting direction of research but beyond the scope of this paper. However, we point out 333
that the existence and uniqueness of the solution of algorithm 1.4 is linked to the question 334
of whether the Fr´echet mean exists and is unique in the quotient space, which is studied for 335
example in [25,4]. 336
Asymptotic bias B∞ and curvature. We show in [35] that the asymptotic bias is non zero:
337
B∞ 6= 0. For an infinite number of brain images n → +∞, the estimate converges, but not
338
to the brain anatomy T it was designed for. We compute in [35] a Taylor expansion of the 339
asymptotic bias B∞ around the noise σ = 0, in the case of a finite dimensional manifold and
340
isometric Lie group action [34]: 341 (3) B∞= σ2 2 H(T ) + O(σ 3) + (σ) 342
where H(T ) denotes the mean curvature vector of the template’s orbit. There is no bias 343
when there is no measurement error σ = 0. It was observed experimentally that the bias was 344
dependent on the measurement error [2]. 345
The coefficient H(T ) depends on the template T that is being estimated. We investigate 346
this dependency in the geometric framework of Subsection 1.2. Assume that there exists a 347
fixed point o of the Lie group action, i.e. a point that is invariant by the whole Lie group. 348
Consider the orbit OT of T . As the action is isometric, the orbit belongs to a geodesic
349
sphere Sd with center o and radius d. A geodesic sphere of radius d in a manifold - like a
350
hypersphere of radius d in Rm - has mean curvature vector of norm: ||H(T )|| = (m−1)d . If 351
we write the template’s bias in the units of d, then the asymptotic bias depends on σd2. 352
In other words, the distance of the template to the singularity o, at the scale of the noise σ 353
governs the asymptotic bias B∞. The approach of this paper is to compute the asymptotic
354
bias by estimating the geometric parameters. Conversely, one could compute the geometric 355
parameters, like the external curvature of orbits for different Lie groups, from the asymptotic 356
bias known on simulated examples. 357
Figure6shows the intuition behind this. On Figure 6(a), R2 schematically represents the 358
space of brain images - the black squares represent brain images from the database, the green 359
square is T - and the green circle is the orbit of T . The dotted circles, that have their centers 360
on the template’s orbit, represent the probability distribution of the (2D isotropic) Gaussian 361
noise in the generative model. More precisely, they represent the level set at σ of the noise 362
distribution. The curvature H(T ) controls the area in grey on Figure 6, which is the area 363
inside the Gaussian level set that is outside T ’s orbit. This area is greater that the area inside 364
T ’s orbit. As a consequence, the probability that the brain images are generated “outside” 365
T ’s orbit is higher than the probability that they are generated inside T ’s orbit. 366
Figure 6(b) shows the registration step of the template estimation: there is a higher 367
probability that the registered images are away from T , as if repulsed from the singularity 368
around which the orbits warp. When one averages the registered images, one sees that the 369
template’s estimate becomes biased as it will systematically give an image that is further away 370
than T from the quotient space’s singularity, i.e. from the red dot. 371
Quantifying the asymptotic bias B∞of the brain template. In neuroimaging, the manifold is
372
the space of brain images Img(Ω) and the Lie group is the group of diffeomorphisms Diff(Ω), 373
both infinite dimensional. This paper assumes that we can apply the geometry of [34], because 374
there are indications that B∞ appears in the same fashion in infinite dimension, see Figure7.
375
First, [34] studies the bias when the dimension of the manifold increases. They consider 376
the finite dimensional manifold M = Rm with the action of SO(m), i.e. a generalization of 377
d σ
(a) (b)
Figure 6. Schematic illustration of the asymptotic bias in the template computation algorithm of neu-roimaging. (a) The images generated with the model described above have a higher probability to be“outside” (with respect to the curvature) of the template’s orbit. (b) The registration during template’s estimation aligns the images. The distribution of images is unbalanced with respect to the real template.
Figure 7. Image from [34]. Take M = Rm with the action of SO(m). The template’s bias increases with σ and is more important as m increases: the blue curve shows the asymptotic bias for m = 2, the pink curve for m = 10 and the yellow curve for m = 20.
the toy example R2 with the action of SO(2) from our illustrations. We show in [34] that B∞
378
increases when m increases, see Figure7. 379
Then, the work of [3] shows that there exists an asymptotic bias in an infinite dimensional 380
Hilbert space. The work of [14] provided an asymptotic behavior of the bias when the noise 381
level σ tends to infinity. This bias is exemplified on examples of template of signals in [14], 382
Figure 8. Image from [14]. The real signal is shown in blue. The template, computed with n = 105 observations simulated with Gaussian noise with σ = 10, is shown in red. There is an asymptotic bias in the estimation of signal.
see Figure 8. Although the 1D signals are discretized for the numeric implementations, they 383
represent 1D functions that are elements of an infinite dimensional Hilbert space. 384
Ultimately, [9] gives a lower bound of the asymptotic bias for shapes of curves in 2D, 385
where terms depend on derivatives of the functions representing the curve. These derivatives 386
can be interpreted as the derivative of the action of translations, which leads to the curvature 387
of the orbit of the given function under the translations’ action. As a consequence of these 388
examples, we assume that the intuition provided by [34] applies to neuroimaging and that the 389
bias depends on the ratio σd2 . 390
2. Computational representation of geometry and topology . Now that we have seen 391
that the geometrization of template estimation leads us to investigate the topology of images, 392
we show in this section how the Morse-Smale complices can be used to encode the topology 393
and help the estimation of the geometric parameters d and σ previously introduced. 394
2.1. Definition of Morse-Smale complexes for (brain) images.
395
Morse-Smale (intensity) functions. A real-valued smooth map I : Ω → R is a Morse function 396
if all its critical points are non-degenerate (the Hessian matrix is non-singular) and no two 397
critical points have the same function value. The intensity function I representing a bi- or 398
tri-dimensional brain image is a Morse function, at least after a convolution with a smoothing 399
Imin Imax (i) (ii) p1 p2 (a) (b) (c)
Figure 9. (a) Intensity I on a 2D brain image visualized as height: maxima are in red, minima in blue. This is a Morse-Smale function I : Ω = R2
→ R. (b) Persistences p1< p2 of two pairs min-max. A threshold p1< p < p2 divides the domain into 3 regions (pink, violet, green), while p < p1 divides it in 5 regions (pink, violet, brown, turquoise, green). (c) Computational representation of the geometry: Regions on a 2D domain, induced by a MSC with threshold p = 0.1: (i) The MS graph represents the isotropy group’s class of the image, (ii) the labeled MS graph represents the image’s orbit under the diffeomorphisms
Gaussian [11]. In the following, I represents a brain image. Morse theory traditionally 400
analyzes the topology of a manifold by studying the Morse functions on that manifold. Here, 401
the manifold is known: it is the image domain Ω. We are not interested in the topology of Ω 402
but rather in the topology of the functions I themselves, that is: we would like to know the 403
distribution of their critical points. Figure9(a) shows a 2D slice of a 3D brain image I, where 404
the intensity is represented as the height, to better emphasize its maxima and minima: the 405
maxima are in red and the minima in blue. 406
We introduce the notions of integral lines, ascending and descending manifolds that are 407
needed to define Morse-Smale (intensity) functions. An integral line is a maximal path in the 408
image domain Ω whose tangent vector correspond to the intensity gradient ∇I, the gradient 409
of I, at every point. This notion comes from autonomous ordinary differential equation, where 410
it represents the trajectory of a system verifying: 411
(4) dx
dt(x) = ∇I(x) 412
Each integral line starts and ends at critical points of I, where the gradient ∇I is zero. 413
Ascending A(xi) and descending D(xj) manifolds of respective extrema xi and xj are defined
414 as: 415
A(xi) = {x ∈ Ω | The integral line going through x ends at xi}
416
D(xj) = {x ∈ Ω | The integral line going through x starts at xj}
417 418
Take the two manifolds A(xi) and D(xj) in Ω and assume they intersect at a point p ∈ Ω.
419
Let TA(resp. TD) denotes the set of all vectors tangent to A(xi) (resp. D(xj)) at p. If every
420
vector in Ω is the sum of a vector in TA and a vector in TD, then A(xi) and D(xj) are said
421
to intersect transversely at the point p. The intensity function I defining the brain image 422
is Morse-Smale if the ascending and descending manifolds only intersect transversely. We 423
assume in the following that all brain images I are Morse-Smale. 424
Morse-Smale complex and persistence. The Morse-Smale complex of a Morse-Smale func-425
tion is the set of intersections A(xi) ∩ D(xj), over all combinations of extrema (xi, xj) [19].
426
The Morse-Smale complex includes regions (i.e., sub-manifolds of Ω) of dimensions 0 through 427
D, where D is the dimension of the domain Ω, i.e. D = 2 or D = 3 for our purposes. The 428
Morse-Smale (MS) complex of I is a partition of the domain Ω into regions defined by the set 429
of integral lines that share common starting and ending points. The interior of each region 430
is monotonic with respect to the intensity I: a region contains no critical points and has a 431
single local minimum and maximum on its boundary, see for example Figure 9(c)(ii) where 432
the maximum Imax and minimum Imin are shown on the boundary of the grey region. The
433
MS complex can also be seen as a graph on the brain image domain Ω whose nodes are the 434
critical points of the brain image intensity. 435
The persistence of a critical point xi of I is the amount of change in intensity I required
436
to remove this critical point: 437
(5) p(xi) = |I(xi) − I(n(xi))|
438
where n(xi) is the critical point closest to xi in intensity, among the critical points connected
439
to xi by an integral line [16]. The persistence of xi is a measure of its significance as a critical
440
point, i.e. importance of the topological feature. Figure 9(b) illustrates the definition of 441
persistence on a 1D example. The function represented has 4 critical points: two minima and 442
two maxima. The figure shows how they pair, as well as their persistence. On the x-axis, 443
colors show the regions of the corresponding 1D Morse-Smale complex. 444
Beside this usual definition of persistence of a critical point, we define here the persistence 445
of a region of the Morse-Smale complex as the amount of change in intensity required to 446
remove this region from the MS complex, and more precisely: 447
(6) p(region) = |Imax− Imin|
448
where Imax and Imin are respectively the maximum and the minimum in intensity of this
449
region. In contrast to the definition of the persistence of a critical point, we do not rely on 450
the saddle points, but only on the extrema i.e. the minima and maxima. 451
Hierarchy of Morse-Smale complexes. The notion of persistence of a region enables the 452
definition of a hierarchy of MS complexes of one brain image I [19,16]. One uses the ordering 453
given by persistence to successively remove topological features from the image I. One starts 454
with the MS complex of the brain image I defined above and one recursively removes the 455
critical points with minimal persistence. This leads to a nested series of successively simplified 456
Morse-Smale complexes. At each level, some of the MS regions are merged into a single region. 457
Ultimately the Morse-Smale complex consists of only one region which is the entire domain 458
Ω. 459
The persistence introduces a notion of scale at which the Morse-Smale complex of I is 460
considered. One keeps only the nodes whose persistence is above the threshold. Figure 9(b) 461
shows that the one dimensional domain is partitioned differently if one takes a threshold p 462
below p1 or between p1 and p2. We say that a Morse-Smale complex is represented at a
463
given persistence level. At the scale of the persistence threshold p, the intensity is considered 464
monotonic on each region of the MS. 465
We note that this MS hierarchy is different from a Gaussian scale space (GSS) hierarchy of 466
images [37]. The latter takes critical points across smoothing scales and not across persistence 467
levels. 468
2.2. Computing Morse-Smale complexes of (brain) images in practice. The previous 469
definitions are relevant to (continuous) Morse-Smale theory and apply strictly for a continuous 470
intensity function I. Nevertheless, the MS complex, introduced in terms of ascending and 471
descending manifolds, can be computed for discrete brain images as follows [16]. We choose 472
the approach of line integrals to compute the Morse-Smale complex. Other approaches may 473
also be considered like the Delaunay triangulation [12]. 474
Computing the Morse-Smale of a brain image. We compute the Morse-Smale complex of a 475
brain image, which will later be the brain template image. Our input are {xi, Ii}, i.e. the
in-476
tensity values {Ii} on a grid {xi} of Ω. We compute the integral lines of the intensity gradient,
477
which we then gather to get the regions of the Morse-Smale complex. For each element of the 478
grid xi, following the gradient ∇I leads to computing the integral line going through xi and
479
in particular its starting and ending points [16]. The domain Ω can be approximated via a 480
k nearest-neighbor graph and one computes the integral lines by considering the connectivity 481
of the graph. Then, elements xi’s with same starting and ending points belong to the same
482
Morse-Smale region. This gives the partition of the domain Ω and therefore the Morse-Smale 483
complex. We remark that xi’s necessarily belong to a 3-dimensional (for a tridimensional
im-484
age) component of the Morse-Smale complex because the 0-, 1- and 2-dimensional components 485
have measure zero. 486
Figure 9(c) shows the Morse-Smale complex of the 2D slice of a 3D brain image for level 487
of persistence of p = 0.1. The image’s 2D domain is divided in different regions, represented 488
by the different colors. The quadrant shows part of the underlying Morse-Smale graph. The 489
red dot represents a maximum in intensity, and the blue dot a minimum of intensity. They 490
are nodes of the underlying graph on the domain Ω. 491
Morse-Smale (MS) graph and labeled Morse-Smale (MS) graph. There are two ways of 492
representing the Morse-Smale graph corresponding to the computed Morse-Smale complex. 493
Both will be useful for analyzing the template’s asymptotic bias. One can consider the graph 494
as the set of nodes and edges, without any intensity information at the nodes. We simply 495
call this graph the Morse-Smale (MS) graph: this is the graph illustrated on Figure 9(c)(i). 496
Alternatively, one can label the nodes with the intensity information. We call this graph the 497
labeled Morse-Smale (MS) graph: this is the graph illustrated on Figure9(c)(ii). Both of these 498
graphs are oriented, the edges being directed in the direction of the intensity gradient from a 499
node to the next. 500
2.3. Template’s computation and Morse-Smale complexes. We show how the MS com-501
plex of an image can represent its geometry and in particular isotropy group. 502
Lie algebra of the isotropy group and intensity gradient of the brain template. The template 503
is an image I ∈ Img(Ω). We consider the Lie group of diffeomorphisms Diff(Ω) = C∞(Ω) and 504
its Lie algebra V , see Subsection1.1. 505
Lemma 1. Take > 0 and consider the Lie algebra: 506
(7) VI = {v ∈ V s.t.: I ◦ Exp(tv) = I, ∀|t| < }
By construction, the exponential of the elements of VI are in the isotropy group of I. For 508 v ∈ VI, we have: 509 (8) ∀x ∈ Ω, ∇I(x)T.v(x) = 0 510
Proof. Take v ∈ VI and |t| < . Its group exponential is a diffeomorphisms in the isotropy
511
group GI, which can be written:
512
φ = exp(tv) = Id + tv + O(t2) 513
514
Then, the equation above leads to: 515
I(x) = I(x − tv(x) + O(t2)) 516
I(x) = I(x) − DI(x). (tv(x)) + O(t2) 517
0 = DI(x). (tv(x)) + O(t2) 518
519
The identification of the coefficients in this Taylor expansion leads to: 520
∇I(x).v(x) = 0 521
522
A vector field of VI, which is in the Lie algebra of the isotropy group of the image I, is
523
perpendicular the image’s gradient at any point x of the image’s domain Ω. 524
We note that this lemma does not give a characterization of the vector fields in VI. It gives
525
the inclusion: VI ⊂ {v|∀x ∈ Ω, ∇I(x).v(x) = 0}. Thus, it allows to control the complexity of
526
VI, and thus of some of the isotropy group’s Lie algebra. It represents a first step towards the
527
following conjecture. 528
Conjecture 1. The Lie algebra of the isotropy group GI of the brain image I is constituted
529
of vector fields that are everywhere perpendicular to the image’s gradient. 530
Sketch of Proof 1. The proof will study the relations between the following three sets of 531
vector fields: 532
- the Lie algebra gI of GI,
533
- the set VI= {v ∈ V s.t.: I ◦ Exp(tv) = I, ∀|t| < },
534
- the set V⊥= {v|∀x ∈ Ω, ∇I(x).v(x) = 0}.
535 536
• Proving gI = VI by proving the double inclusion: (a) gI ⊂ VI and (b) VI⊂ gI.
537
The inclusion (a) may be difficult to prove directly because of the lack of characterization 538
of the infinite dimensional Lie algebra gI. The inclusion (b) could make use of the above
539
Lemma which shows that Exp(VI) ⊂ GI. We need to use the group logarithm in order to
540
transform an inclusion on Lie groups into an inclusion on Lie algebra. The group logarithm is 541
defined as the reciprocal of the group exponential, on its domain of bijectivity. As the domain 542
of bijectivity is not well characterized in the general case, it is difficult to make use of this. 543
• Proving VI = V⊥ by proving the double inclusion: (a) VI ⊂ V⊥ and (b) V⊥⊂ VI.
544
The inclusion (a) is proven in Lemma 1. The inclusion (b) may be proven in two steps. 545
First, we would write the full Taylor expansion of I ◦ Exp(tv) where v is an element of V⊥and
show that the orders above the 1st give 0. At this point, the expression of the Taylor expansion 547
is unclear, as well as if the condition v ⊥ ∇I is sufficient to make the orders above the 1st 548
make 0. In a second step, we would need to translate the inclusion written in terms of Lie 549
groups as an inclusion written in terms of Lie algebra, which is not trivial according to the 550
paragraph above. 551
To understand the intuition behind this conjecture, consider a uniform image, i.e. with 552
a constant intensity. In this case, there is no restrictions a priori on the vector fields of the 553
Lie algebra of the image’s isotropy group. Thus, the isotropy group is as large as it can be. 554
We get that the isotropy group of an image with constant intensity is the whole group of 555
diffeomorphisms. 556
Intensity gradient and MS graph. We now present a lemma showing that the MS graph can 557
be used to computationally represent the isotropy group of an image. Take two images I1 and
558
I2. Assume that their MS graphs are the same, regardless of the nodes’ positions.
559
Lemma 2. Assume we can map the level sets of I1 to the level sets of I2. This implies that
560
we can map the partition of the domain I1 induced by its Morse-Smale to the partition of the
561
domain of I2.
562
Take a part w ⊂ Ω of this partition. There exists a diffeomorphism ψ and a function κ 563
such that: ∇I2(x) = κ(x).d∗ψ(x).∇I1◦ ψ(x), ∀x ∈ w.
564
Proof. We consider a part w ⊂ Ω of the partition of the domain of I1 induced by its
565
Morse-Smale complex. On this part, there exists a function f such that: 566
(9) I1 = f ◦ I2◦ ψ
567
The function f gives the mapping of intensity levels on each level set. Differentiating the 568
previous equation with the chain rule gives: 569
(10) dI1= df (I2◦ ψ).dI2◦ ψ.dψ
570
where df (I2◦ ψ) is a scalar which we note κ. Taking the adjoint gives:
571
(11) ∇I1 = κ.∇I2◦ ψ.d∗ψ
572
This lemma is the first step towards the following conjecture. 573
Conjecture 2. Two images with same MS graphs have same isotropy group. 574
Sketch of Proof 2. The images I1 and I2 have the same MS graph. The graph of I1, taken
575
with the nodes and edges positions on Ω, can be diffeomorphically deformed on the graph of 576
I2. We take ψ1 a diffeomorphism that realizes the graphs’ matching.
577
Now, I1◦ ψ(−1)1 and I2 share the same MS graph, taken with the nodes and edges’ positions
578
on Ω. We consider one cell of this graph. 579
We consider the integral lines of the respective gradients ∇ψ1.∇I1◦ ψ1 and ∇I2 on the
580
cell. Both define a “parallel” partition of the cell. As a consequence, the set of integral lines 581
of the gradient of I1◦ ψ1(−1) can be mapped diffeomorphically to the set of integral lines of the
582
gradient of I2. We take ψ2 a diffeormorphism that realizes the matching of the integral lines.
At this step, the notion of parallel partition would need to be written as a definition, together 584
with the proof of the existence of ψ2.
585
Here, we would need to combine the ψ2’s obtained on the different cells of the MS complex
586
into one transformation on Ω that we also write ψ2. A study of the behavior of ψ2 on the
587
edges of the MS complex should show that ψ2 is itself a diffeomorphism on Ω. Then, we write
588
ψ = ψ2◦ ψ1. At the end of this step, I1 is transformed to I1◦ ψ(−1).
589
The gradients of I1◦ ψ(−1) and I2 share the same integral lines. Therefore ψ also maps
590
the level sets of I1 to the level sets of I2. Using Lemma 2, we conclude.
591
In others words: if two images I1, I2 have the same MS graph, then I1 can be
diffeo-592
morphically deformed so that its intensity gradient is parallel at every point to the intensity 593
gradient of I2.
594
From Lemma1, the sets {v1|∀x ∈ Ω, ∇I1(x).v1(x) = 0} and {v2|∀x ∈ Ω, ∇I2(x).v2(x) = 0}
595
control the isotropy groups of I1 and I2. If I1 and I2 have same MS graph, any vector field
596
in the first set can be diffeomorphically deformed to get a vector field in the second set, and 597
conversely. As a consequence, the MS graph represents the image’s isotropy group. From 598
Section1, we know that the isotropy group controls, in turn, the orbit’s type of the image i.e. 599
to which stratum the image belongs. 600
Furthermore, we note that we could have considered the labeled MS graph of the image, 601
i.e. the MS graph with intensities at the nodes, see Figure 9(c)(ii). The labeled MS graph 602
controls the orbit of the image: images in the same orbit have the same topology but also 603
same intensities. 604
3. Topology quantifies and controls the template’s asymptotic bias. This Section gath-605
ers the elements of Sections 1 and 2 to quantify the asymptotic bias in the brain template 606
computation. We use Morse-Smale complexes to quantify, and then control, the bias. 607
3.1. Quantify the template inconsistency.
608
Understand and estimate the geometric parameter d. The distance d is the distance of 609
the current image to a brain image with larger isotropy group, measured in sum of squared 610
differences of intensities, see Figure 6. How can we measure this distance d locally on the 611
template’s image? From Section1, we know that the isotropy group becomes larger when the 612
image is “more symmetric”. From Section2, we know that the isotropy group becomes larger 613
when the image topology becomes simpler. Thus, the distance d is a distance in intensity 614
from the template image to a similar image with simpler topology. 615
We want to express this distance locally on the template image. Modifying the intensity 616
locally on the template image modifies the image itself and may simplify its topology. For 617
example, modifying the intensity locally in a region of the image can suppress a min-max pair 618
and the image becomes “more symmetric”. Thus we describe the distance d locally on the 619
image by the amount of intensity needed to be changed in this region, so that the topology is 620
simplified. 621
We quantify the local intensity needed to simplify the template image’s topology using 622
the Morse-Smale complex representation of Section2. Let be given the Morse-Smale complex 623
of the template image. The intensity needed to simplify the image’s topology is, by definition, 624
the intensity needed to simplify the Morse-Smale graph. We consider the partition of the 625
image’s domain Ω induced by the Morse-Smale complex. For each region of the partition, 626
the intensity needed to simplify the topology can be represented by the amount of intensity 627
needed to remove the min-max pair of the region: 628
(12) d(region) = ˆˆ Tmax− ˆTmin = pTˆ(region)
629
This quantifies the importance of the region as a representative of the brain anatomy: if the 630
intensity difference between the region’s min and max is low, then one can assume that this 631
min-max pair has been created by chance because of the noise on the images. We see that 632
the notion of persistence defined in Section 2estimates the first geometric parameter d. 633
Understand and estimate the geometric parameter σ. Now we turn to the second geometric 634
parameter that causes the asymptotic bias: the standard deviation σ of the noise, see again 635
Equation3and Figure6. The standard deviation σ of the noise is a parameter of the generative 636
model that we assume has produced the observed images of the subjects brain anatomies. The 637
parameter σ is unknown but it can be estimated from the observed images. Since we want to 638
compute the asymptotic bias locally, we are interested in estimating the parameter σ locally, 639
and for example on a region of the Morse-Smale complex of the template image. We estimate 640
it as the average of the variability in intensity of the registered images in the region: 641 (13) σ(region) =ˆ 1 #x X x∈region ˆ σ(x), 642
where ˆσ(x) is the variability in intensity of the registered images at the voxel x, and serves as 643
an estimate of the noise at this voxel. This quantifies the amount of noise in this region. The 644
larger the standard deviation σ of the noise, the more chances for the template to show min-645
max pairs that appeared by chance. One could use other estimator of the standard deviation, 646
for example the sample standard deviation. Future work may investigate these estimators and 647
their impact on the estimator of the template’s bias. 648
Compute the asymptotic bias using the persistence of the whitened brain template. The local 649
estimates of the geometric parameters d and σ enable us to estimate the asymptotic bias 650
locally on a brain region: 651 (14) Bˆ∞(region) = ˆd(region) ˆ σ(region) !−2 , 652
We emphasize here that ˆB∞is an estimate of the asymptotic bias B∞(of the brain template
653
estimation), and not an exact computation. 654
We link the estimate ˆB∞ to the definition of persistence in the Morse-Smale complex
655
framework. First, we define the whitened brain template estimate ˆt of ˆT as: 656
(15) ∀x ∈ Ω, ˆt(x) = T (x)ˆ
ˆ σ(x). 657
In other words, we divide the brain template intensity of each voxel x by the estimation of the 658
standard deviation of the noise at this voxel ˆσ(x). This whitens the noise all over the brain 659
template. 660
We assume that the critical points of ˆt are close to the critical points of ˆT and consider the 661
Morse-Smale complex of the whitened template. We further assume that: ˆσregion ' ˆσ(max) '
662 ˆ
σ(min), where ˆσ(max), ˆσ(min) are the variabilities at the respective min and max of the 663
Morse-Smale complex. We can write: 664 (16) Bˆ∞(region) ' ˆ Tmax ˆ σ(max)− ˆ Tmin ˆ σ(min) !−2 = ˆt(max) − ˆt(min)−2 = pˆt(region) −2, 665
where we recognize the persistence pˆt(region) of the corresponding region of the whitened
666
template ˆt. This links the estimation of the asymptotic bias to the persistence of the whitened 667
template’s Morse-Smale complex. This shows how a topological property of the image in fact 668
represents a statistical property of this image as the estimate of the brain template. 669
Hierarchy of the whitened template. The persistence of the whitened template quantifies 670
locally the asymptotic bias, i.e. how far the brain template is from the unique brain anatomy 671
of the generative model. Is there a statistical interpretation of the hierarchies of Morse-Smale 672
complexes, introduced in Section 2? Let us consider another Morse-Smale of the whitened 673
template’s hierarchy, i.e. a Morse-Smale computed at a given persistence threshold pthreshold.
674
There is an asymptotic bias threshold that corresponds, which we can write: p−1/2threshold. The 675
regions kept in the new Morse-Smale are those having a persistence higher than the persistence 676
threshold pthreshold, i.e. those having an asymptotic bias lower than the asymptotic bias
677
threshold p−1/2threshold. 678
Therefore, if we can impose the topology of the brain template to match the new Morse-679
Smale of threshold p−1/2threshold, we control its asymptotic bias. This means that we preserve 680
only the min-max pairs shown on the Morse-Smale graph chosen. It eliminates the min-max 681
pairs that have been created by chance, because the noise on the images was at a similar level 682
than the intensity signal on these regions. The next subsection explains how to impose the 683
topology of a given Morse-Smale on the template’s image. 684
3.2. Controlling the template’s asymptotic bias by constraining its topology. We are 685
given the template’s image and we want to force its asymptotic bias to be below a threshold, 686
so that it is closer to estimating the anatomy of the database, i.e. the anatomy shared by 687
the subject brains. The development above suggests to compute the Morse-Smale complex 688
with a persistence threshold corresponding to the desired bias threshold. Then, enforcing 689
template’s topology to match the Morse-Smale complex will control its asymptotic bias. This 690
enforcement procedure is called “Topological denoising”. 691
3.2.1. Topological denoising. Topological denoising is a procedure for smoothing an im-692
age, like our template image, while preserving topological features [23,18]. The input of the 693
procedure is the intensity function defining the template ˆT : Ω → R and a MS complex with 694
intensity values at its nodes. Enforcing the template’s topology to match the MS complex 695
means that we compute ˆT0 : Ω → R which is a smoothed version of original template estimate 696
ˆ
T containing only the intensity min-max pairs specified by the MS complex chosen. ˆT0 should 697
be otherwise as close as possible to the original template estimate ˆT in terms of intensity. The 698
values and positions of the MS extrema are preserved, while all other extrema are removed 699
from the brain template estimate. Such procedure provides control over the topology of the 700
brain image ˆT . 701
Formally, the original Topological denoising problem is written as the minimization [23]: 702 argminT0 X xi∈Ω || ˆT (xi) − T0(xi)||2+ Z Ω ||∆T0||2 703
s.t. T0(xi) = ˆT (xi) for xi a node of the MS complex
704
T0(xj) > T0(xi) for xj neighbor of xi and xi minimum
705
T0(xj) < T0(xi) for xj neighbor of xi and xi maximum
706
T0(xi) > min neighbor xj
T0(xj) for xi not an extremum
707
T0(xi) < max neighbor xj
T0(xj) for xi not an extremum.
708 709
The non linear inequality constraints make this optimization problem hard to solve. The 710
solution suggested by [23] is to compute a representative function u that verifies the last four 711
inequality constraints. Given this function u, the topological denoising problem becomes: 712 argmin T0 X xi∈Ω || ˆT (xi) − T0(xi)||2+ Z Ω ||∆T0||2 713
s.t. T0(xi) = ˆT (xi) for xi a node of the MS complex
714
(T0(xi) − T0(xj))(u(xi) − u(xj)) > 0, for (xi, xj) a pair of neighbors,
715 716
where the last constraint means that the direction of T0 shall be aligned with the direction of 717
u. This alternative optimization problem is easily solved [23]. 718
The representative function u can be computed by solving the Dirichlet problem: 719 argmin u Z Ω ||∇u||2 720
s.t. u(xi) = 0 for xi a minimum
721
u(xi) = 1 for xi a maximum.
722 723
Minimizers of the Dirichlet energy are harmonic functions, and their properties guarantee 724
that xi and xj are minima and maxima and that u contains no other extrema inside the MS
725
regions. We refer to [23] for further details. 726
Figure 10shows examples of topological denoising. The topology to be enforced is repre-727
sented by the red and blue dots, which are nodes of the MS complex: red for intensity maxima 728
and blue for intensity minima. On the left example, the circle motifs that were inducing un-729
desirable minima and maxima are removed. On the right example, two of the initial four 730
maxima in the center of the image are removed too. Only the topology dictated by the input 731
Morse-Smale complex is preserved. 732
3.2.2. Integrating the topological denoising in the template computation pipeline.
733
The original template’s computation is performed with the algorithm of [24] and use the LCC 734
Figure 10. Topological denoising on two toy examples. We impose topological constraints on the initial images, on the left in both cases: minima in blue and maxima in red. The arrows denote the action of the topological denoising and point to the output image.
Log-demons for the registrations [29]. We adapt it by adding a Topological denoising step, in 735
order to control the template’s asymptotic bias. 736
Algorithm 1 shows the adapted procedure. One initiates with the template being one of 737
the subject images: ˆT1 = I1. At each iteration k of the template’s computation, one registers
738
the subject images to the current template ˆTk and performs the average of the registered
739
images’ intensities to get a first version of the updated template ˆTk+1. So far, this matches
740
the usual template estimation procedure. Our adaptation is what follows. The MS complex 741
of the updated template ˆTk+1 is computed, using the R package msr [16]. Then, the updated
742
template ˆTk+1is smoothed using Topological denoising, see Figure12. These steps are iterated
743
until convergence. 744
Algorithm 1 Controlled brain template estimation
Input: Images {Ii}ni=1, noise variance σ2, persistence threshold pthreshold
Initialization: ˆ
T1= I1 (one of the subjects images)
k = 1 Repeat:
Non-linearly register the images to ˆTk, i.e. compute φik: Jki ' Ii◦ φik
Compute the mean deformation: ¯φk
Register subject image: Lik= Ii◦ φi k◦ ¯φ
−1 k
Compute the mean intensity image for template iteration: ˆTk+1 = n1
Pn
i=1Lik
Compute the MS complex of ˆTk+1 at persistence level p
Topological denoising of Tk+1 using the MS complex
k ← k + 1
until convergence: || ˆTk− ˆTk+1|| <
Output: ˆTk
The main parameter controlling this adapted procedure is the asymptotic bias threshold, 745
i.e. the persistence threshold pthreshold for the MS complex computation. The next section
746
discusses the choice of this parameter pthreshold. Varying the threshold pthreshold leads to the
747
construction of a hierarchy of templates. The other parameter is σ, which is the noise on the 748