Topologically constrained template estimation via Morse-Smale complexes controls its statistical consistency


HAL Id: hal-01655366

https://hal.inria.fr/hal-01655366

Submitted on 4 Dec 2017


Nina Miolane, Susan Holmes, Xavier Pennec

To cite this version:

Nina Miolane, Susan Holmes, Xavier Pennec. Topologically constrained template estimation via Morse-Smale complexes controls its statistical consistency. SIAM Journal on Applied Algebra and Geometry, Society for Industrial and Applied Mathematics, 2018, 2 (2), pp. 348-375. 10.1137/17M1129222. hal-01655366.

Abstract. In most neuroimaging studies, one builds a brain template that serves as a reference anatomy for normalizing the measurements of each individual subject into a common space. Such a template should be representative of the population under study, in order to avoid biasing the subsequent statistical analyses. The template is often computed by iteratively registering all images to the current template and then averaging the intensities of the registered images. Geometrically, the procedure can be summarized as follows: it is the computation of the template as the “Fréchet mean” of the images projected in a quotient space. It has been argued recently that this type of algorithm could actually be asymptotically biased, therefore inconsistent. In other words, even with an infinite number of brain images in the database, the template estimate may not converge to the brain anatomy it is meant to estimate. Our paper investigates this phenomenon. We present a methodology that quantifies spatially the brain template’s asymptotic bias. We identify the main variables controlling the inconsistency. This leads us to investigate the topology of the template’s intensity level sets, represented by its Morse-Smale complex. We propose a topologically constrained adaptation of the template computation that constructs a hierarchical template with bounded bias. We apply our method to the analysis of a brain template of 136 T1 weighted MR images from the Open Access Series of Imaging Studies (OASIS) database.

Introduction. In neuroimaging, as well as in many other medical image analysis domains, a template is an image representing a reference anatomy. A template is computed from a database of brain images to serve as the brain image “prototype” for further analyses.

Computation of a brain template. Various methods exist to compute a brain template [15]. A first practice selects one brain image from the database as the template. If the selected subject’s anatomy is far from the population mean anatomy, the template is necessarily biased towards this specific individual. Thus, the template fails at being a prototype of the population. This is why researchers consider the computation of an “unbiased template” that represents the mean anatomy better.

Such an “unbiased” template is often constructed by performing an iterative averaging of intensities and deformations [17, 24, 20]. One initializes with a template chosen among the subject images. Then, during each iteration, one registers the subjects to the current template, and computes the mean deformation. The new template is computed as the mean intensity of the subjects’ images, deformed with the mean deformation. This procedure does not favor any subject’s image if it does not end in a local minimum. In this sense, the procedure is called “unbiased”.
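The iteration described above can be sketched on a toy problem. The following is only an illustration under strong simplifying assumptions that are not in the paper: the "images" are short 1D signals, the only allowed deformations are circular shifts, registration is an exhaustive search, and the mean-deformation correction is omitted. All function names are hypothetical.

```python
def shift(signal, k):
    """Circularly shift a 1D signal by k samples (a toy 'deformation')."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def sq_dist(a, b):
    """Squared Euclidean distance between two signals of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def register(signal, template):
    """Return the shifted copy of `signal` that is closest to `template`."""
    candidates = (shift(signal, k) for k in range(len(signal)))
    return min(candidates, key=lambda s: sq_dist(s, template))


def compute_template(signals, n_iter=10):
    """Alternate registration and intensity averaging (toy version)."""
    template = list(signals[0])  # initialize with one of the subjects
    for _ in range(n_iter):
        # Register every subject to the current template estimate.
        registered = [register(s, template) for s in signals]
        # New template: pointwise mean intensity of the registered signals.
        template = [sum(col) / len(col) for col in zip(*registered)]
    return template


# Three noiseless subjects: the same bump observed at different positions.
base = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
signals = [shift(base, k) for k in (0, 2, 5)]
template = compute_template(signals)
print(template)
```

In this noiseless setting the three subjects align exactly, so the averaged template is as sharp as each subject. Adding noise to the signals before registration is a simple way to start observing the effects discussed in the rest of the paper.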

The computed brain template may look blurred or sharp depending on the design chosen for the registration in the above iterative procedure. If the algorithm is designed using linear registration, the template may appear blurred. In contrast, if one uses diffeomorphic registration, the template is more likely to look sharp and the sharpness depends on the amount of regularization used [15].

Purpose and desirable properties of the brain template. Computing a template is often the first step in medical image processing because of its many applications. The template is used as a standardized 3D coordinate frame where the subject brains can be compared. The subjects are then characterized by their spatial diffeomorphic deformations from the template. These deformations may serve for a statistical analysis of the subject shapes [6]. One studies the normal and pathological variations of the subjects with respect to the template. The deformations also facilitate automated segmentation, by mapping the template’s already segmented regions into each subject space.

What are the desirable properties of the brain template, with respect to the applications mentioned above? First, the template should be representative of the population, removing any bias toward a specific subject during the analysis [10, 5, 7]. Second, the template should be sharply defined, so that subtle anatomical structures can be easily observed or segmented.

The brain template, an inconsistent estimator of the unique brain anatomy. However, the two desirable properties cannot be fulfilled simultaneously: there exists a trade-off, akin to the standard bias-variance trade-off in statistical learning.

First, this trade-off can be understood intuitively. Consider a database of brain images divided into two groups that have different topologies. The first group has subjects with three sulci (i.e. depressions or grooves in the cerebral cortex) in a specified brain region. The second group has subjects with only two sulci in the same region. A sharply defined template has to decide on a specific topology in this brain region, i.e. whether it shows two or three sulci. Therefore, it might not estimate correctly the brain anatomy of this population, which might be problematic for the subsequent applications in neuroimaging. For example, during a statistical analysis, registering the subjects with the three-sulci topology to a template that has chosen a two-sulci topology might not be reasonable [7]. A sharp template is only meaningful if the anatomical structures shown are representative of the whole population, i.e. if the population is homogeneous.

Second, the trade-off has been emphasized in recent studies investigating the bias of the template as an estimator of the population’s shared anatomy. In the classical approach [2], an initial assumption states that there is a unique (brain) anatomy shared by the population. The subjects are then modeled through a generative model as random deformations of the unique brain anatomy, observed with additional noise. The unique brain anatomy is a parameter of this model. The template computation is interpreted as its estimation. One can ask about its asymptotic bias: does the template converge to the unique brain anatomy for a database with an infinite number of images?

This question has been investigated for signals, i.e. 1D images. Some authors prove the asymptotic unbiasedness of the template under the simplifying assumption of no measurement error on the observed signals [27]. Other authors have already provided examples of asymptotic bias, and therefore inconsistency, when there is measurement error [2]. Their experiments show that the template may converge to pure noise when the measurement error on simulated signals increases. A bias is shown to occur in [8] for curves estimated from a finite number of points in the presence of noise.

Recently, an asymptotic bias has been shown in the setting of Lie group actions [35, 34]. Our argument, shown in an abstract geometric context in [35, 34] but adapted here to brain images, is as follows. We look at the subspace defined by all brains with the same shape as the unique brain anatomy. We show that the curvature of this space, at the scale of the measurement noise, introduces a bias on the brain template.

Using topology to investigate the brain template’s asymptotic bias. We want to link: (i) the fact that a population with two groups of different brain topologies cannot be accurately represented by a sharp template, with (ii) the mathematical results on the template’s bias as an estimate of the anatomy shared by the population. The framework of [35] is based on the quotient of the data space by the action of a Lie group. The data in our case are the brain images, and the Lie group action is the action of diffeomorphisms on these images. Quotienting the images by the action of diffeomorphisms amounts to keeping only the information that is invariant under diffeomorphic deformations. Thus, the quotient gives the topology of the images’ level sets. Using the intuition of [35], we could quantify the brain template’s asymptotic bias using a representation of its topology.

Quantifying the bias could enable us to decide when and where a sharply defined template makes sense. We could want a sharp template in brain regions where the intersubject anatomical variability is low and a fuzzier template when this variability is higher. Alternatively, we could consider computing several sharp brain templates using mixtures. This discussion boils down to the question: when is it reasonable to assume that a unique brain anatomy represents the whole subject population?

Furthermore, we could think about controlling the brain template’s asymptotic bias by constraining its topology. Topological representations of images have been used with various objectives in the literature. For example, [12] uses a topological representation of brain images for classification of autism versus normals. Topological constraints have also been implemented for segmentation where the reconstruction of the cortical surface needs to match the brain anatomy [30, 31, 21]. However, topological representation of images or topological constraints on images have not been used to study and enforce a statistical property, like asymptotic unbiasedness.

Contributions and Outline. We use a topological representation of images, the Morse-Smale complex, to investigate and control the asymptotic bias of the brain template. We make three main contributions in this paper. First, we show how to combine geometry and topology to tackle a statistical problem in neuroimaging. We provide conjectures at the boundaries of the fields with sketches of their proofs. Second, we analyze the template as an estimator of the brain anatomy and quantify the asymptotic bias. This leads us to discuss the initial assumption of a unique anatomy. Third, we present an adaptation of the template computation algorithm that bounds the bias, through topological constraints, at the price of constructing a “smoother” template.

Section 1 presents the geometry and the topology of the template computation. We emphasize the variables that describe the bias of the brain template. Section 2 presents the chosen computational representation of these variables through Morse-Smale complexes. Section 3 leverages the previous computational model to spatially identify the biased regions of the template. We thus propose an adaptation of the template computation with topological constraints bounding the bias. In Section 4, our methodology is used on the Open Access Series of Imaging Studies (OASIS) database of T1-weighted MR brain images.

1. Geometry and topology for template estimation. This paper will not present mathematical results; rather, we show how geometry and topology combine to formalize the template computation algorithm and highlight required directions for further mathematical developments.

1.1. Geometrization of the action of diffeomorphisms on images.

Brain images. We consider two- and three-dimensional images, whose domain $\Omega \subset \mathbb{R}^d$, with $d = 2, 3$, is supposed compact. We adopt the point of view of images as square-integrable functions $I$ over the compact domain $\Omega$, i.e. we write $I \in L^2(\Omega)$, where $L^2(\Omega)$ is a Hilbert space. The corresponding $L^2$ distance is invariant by volume preserving diffeomorphisms (volumorphisms). Additionally, we assume that the images are in $C^\infty(\Omega)$, i.e. that they are smooth functions defined on the compact support $\Omega$. We denote by $\mathrm{Img}(\Omega)$ the set of images.

To illustrate the following concepts, we use the toy Hilbert space $\mathbb{R}^2$, where one point schematically represents one image, see Figure 1.

Diffeomorphisms. A diffeomorphism of $\Omega$ is a differentiable map $\phi : \Omega \to \Omega$ which is a bijection whose inverse $\phi^{-1}$ is also differentiable. We consider two sets of diffeomorphisms.

On the one hand, we consider $C^\infty(\Omega)$, i.e. the smooth diffeomorphisms that are the identity outside a compact support. $C^\infty(\Omega)$ can be seen as an infinite dimensional manifold [33] and forms an infinite dimensional Lie group [26]. Its Lie algebra $V$ is the set of smooth vector fields with compact support [26]. We use this set of diffeomorphisms to present algebraic concepts.

On the other hand, we consider the set $C_{\mathrm{Id}}(\mathbb{R}^d, \mathbb{R}^d)$ defined as $C_{\mathrm{Id}}(\mathbb{R}^d, \mathbb{R}^d) = \{\phi = \mathrm{Id} + u \text{ for } u \in C_b^1(\mathbb{R}^d, \mathbb{R}^d)\}$, where the subscript “b” refers to functions that are bounded with bounded derivatives. These diffeomorphisms are “small”, i.e. not too different from the identity. We use this set to formulate mathematical conjectures which need metric properties.

If the distinction between the two sets is not needed, we write $\mathrm{Diff}(\Omega)$ to denote the diffeomorphisms.

Action of diffeomorphisms on (brain) images. The Lie group of diffeomorphisms $\mathrm{Diff}(\Omega)$ acts on the space of images $L^2(\Omega)$ [39]: $\rho : \mathrm{Diff}(\Omega) \times \mathrm{Img}(\Omega) \to \mathrm{Img}(\Omega)$, $(\phi, I) \mapsto \phi \cdot I = I \circ \phi^{-1}$. This action is represented on Figure 1(a), which shows an image $I$ and its diffeomorphic deformation.

Intuition and schematic representation on $\mathbb{R}^2$. The statistical analysis of this paper relies on geometric considerations in the Hilbert space of images $L^2(\Omega)$ endowed with the action of a Lie group of diffeomorphisms $\mathrm{Diff}(\Omega)$. The intuition on the geometry in these abstract spaces will be given by a series of figures, where:

$L^2(\Omega)$, the Hilbert space of images, is represented as $\mathbb{R}^2$, the 2D surface of the paper,

$\mathrm{Diff}(\Omega)$, the Lie group of diffeomorphisms, is represented as $SO(2)$, the Lie group of 2D rotations,

and we consider the fundamental action of $SO(2)$ on $\mathbb{R}^2$.

Figure 1(b) shows this representation. The action of a diffeomorphism $\phi$ on $I$ is represented by the blue curved arrow, i.e. by the action of a 2D rotation. The action transforms the image $I$ into another image $\phi \cdot I$, i.e. into a different point in the Hilbert space.

We note that $L^2(\Omega)$ with the action of $\mathrm{Diff}(\Omega)$ and $\mathbb{R}^2$ with the action of $SO(2)$ have different properties. For example, $\mathbb{R}^2$ and $SO(2)$ are finite dimensional and the action of $SO(2)$ is isometric with respect to the Euclidean distance on $\mathbb{R}^2$. In comparison, $L^2(\Omega)$ and $\mathrm{Diff}(\Omega)$ are infinite dimensional and the action of $\mathrm{Diff}(\Omega)$ is not isometric with respect to the $L^2$ distance between images of $L^2(\Omega)$.
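The difference between isometric and non-isometric actions can be checked numerically on 1D "images" over $[0, 1]$. The sketch below is only an illustration: the two images, the translation of the circle (volume preserving) and the warp $\phi^{-1}(x) = (x + x^2)/2$ (not volume preserving) are hypothetical choices, and the $L^2$ distance is approximated by a midpoint rule.

```python
import math

N = 20000  # number of midpoint-rule nodes on [0, 1]


def l2_sq_dist(f, g):
    """Midpoint-rule approximation of the squared L2 distance on [0, 1]."""
    h = 1.0 / N
    return sum((f((k + 0.5) * h) - g((k + 0.5) * h)) ** 2 for k in range(N)) * h


def act(phi_inv, image):
    """Action phi . I = I o phi^{-1}, given the inverse map phi^{-1}."""
    return lambda x: image(phi_inv(x))


def image_1(x):
    return math.sin(2.0 * math.pi * x)


def image_2(x):
    return 0.0


def translate_inv(x):
    """Inverse of a translation of the circle [0, 1): volume preserving."""
    return (x - 0.3) % 1.0


def warp_inv(x):
    """A diffeomorphism of [0, 1] with derivative in [0.5, 1.5]: not volume preserving."""
    return 0.5 * (x + x * x)


d0 = l2_sq_dist(image_1, image_2)
d_translate = l2_sq_dist(act(translate_inv, image_1), act(translate_inv, image_2))
d_warp = l2_sq_dist(act(warp_inv, image_1), act(warp_inv, image_2))
print(d0, d_translate, d_warp)
```

The translation leaves the distance between the two images unchanged, whereas the warp changes it, which is the reason why the isometry assumption has to be argued for separately in the case of general diffeomorphisms.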

Nevertheless, we argue that our schematic representation makes sense for the following reasons. First, representing infinite dimensional spaces by finite dimensional ones is standard as it is difficult to draw infinite dimensions on a piece of paper. Second, we consider diffeomorphisms that transform a brain image into another brain image, i.e. an image into one that looks similar. Therefore, we restrict ourselves to “small” diffeomorphisms. In this context, we will see that we can consider their action as being isometric.

Figure 1. Action of a diffeomorphism $\phi$ on a brain image $I$. (a) The brain image before and after the action of $\phi$; (b) schematic representation of the action of $\phi$ on the brain image $I$, represented as a dot in $\mathbb{R}^2$.

Orbit $O_I$ of a (brain) image $I$. Here, we consider $\mathrm{Diff}(\Omega) = C^\infty(\Omega)$. The orbit $O_I$ of a brain image $I$ is defined as all images reachable through the action of diffeomorphisms on $I$: $O_I = \{I' \in \mathrm{Img}(\Omega) \mid \exists \phi \in \mathrm{Diff}(\Omega) \text{ s.t. } I' \circ \phi^{-1} = I\}$.

On Figure 2(a), images $I_1$ and $I_2$ belong to the same orbit but $I_3$ belongs to a different orbit. On Figure 2(b), we use $\mathbb{R}^2$ to represent the space of images, with the action of $SO(2)$ representing the action of diffeomorphisms. The blue dotted circle on Figure 2(b) represents the orbit of $I_1$ and $I_2$. This orbit defines a submanifold of images: in this toy illustration, the submanifold is the blue dotted circle. The red point on Figure 2(b) represents $O_{I_3}$, the orbit of the image $I_3$. This orbit contains only one point and is a submanifold of dimension 0.

We note that the orbits may be infinite dimensional in the case of diffeomorphisms acting on $L^2(\Omega)$. Furthermore, the orbits are not necessarily high-dimensional spheres as the action of the diffeomorphisms is not isometric. However, by considering only “small” diffeomorphisms acting on a given image $I$, we move locally on the orbit of image $I$. We could consider writing a Taylor expansion of the orbit around $I$ [34], where the first order gives its tangent space and the second order is a high-dimensional sphere. Therefore, “small” diffeomorphisms are consistent with a representation of the images’ orbits as spheres.

Isotropy group $G_I$ of a (brain) image $I$. Here, we consider $\mathrm{Diff}(\Omega) = C^\infty(\Omega)$. The isotropy group $G_I$ of a brain image $I$ is defined as the subgroup of $\mathrm{Diff}(\Omega)$ formed by the diffeomorphisms that leave $I$ unchanged: $G_I = \{\phi \in \mathrm{Diff}(\Omega) \mid I \circ \phi^{-1} = I\}$. $G_I$ describes the intrinsic symmetry of the image $I$: the more symmetric $I$ is, the larger its isotropy group. All images on the same orbit have conjugate isotropy groups. Moreover, the isotropy group (also called the stabilizer) and the orbit of an image are linked by the orbit-stabilizer theorem: $O_I \sim \mathrm{Diff}(\Omega)/G_I$ in finite dimensions. The intuition is that the larger the isotropy group (and thus, the more symmetry the image has), the smaller the orbit. Figure 2(a) illustrates this with brain images: the isotropy groups of $I_1$ and $I_2$ are smaller than the isotropy group of $I_3$, in the sense of inclusion. As a consequence, the orbit of $I_3$ is “smaller” than the orbit of $I_2$.

We use again the representation of $\mathbb{R}^2$ to illustrate this notion. The notion of “isotropy group” dictates the dimension of a given orbit, i.e. whether we have a 1-dimensional submanifold like the blue dotted circle or a 0-dimensional submanifold like the red point on Figure 2(b). The isotropy group of the image at the blue point on Figure 2(b) is reduced to the identity: only the identity leaves this image at the same place. The isotropy group of the image represented by the red point is the whole Lie group of 2D rotations: any rotation leaves this point invariant.
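The toy picture of orbits and isotropy groups can be reproduced in a few lines. The sketch below assumes the schematic setting of the figures ($SO(2)$ acting on $\mathbb{R}^2$) and probes the isotropy group on a finite grid of angles only; the function names are hypothetical.

```python
import math


def rotate(theta, p):
    """Action of the rotation of angle theta on a point p of R^2."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


def same_orbit(p, q, tol=1e-9):
    """Two points of R^2 lie on the same SO(2)-orbit iff they have equal norm."""
    return abs(math.hypot(*p) - math.hypot(*q)) < tol


def isotropy_angles(p, n_angles=360, tol=1e-9):
    """Angles, on a regular grid of [0, 2*pi), whose rotation leaves p fixed."""
    angles = [2.0 * math.pi * k / n_angles for k in range(n_angles)]
    return [t for t in angles if math.dist(rotate(t, p), p) < tol]


blue_point = (1.0, 0.0)   # a point on a circular (principal) orbit
red_point = (0.0, 0.0)    # the fixed point of the action

print(len(isotropy_angles(blue_point)))  # only the identity fixes the blue point
print(len(isotropy_angles(red_point)))   # every sampled rotation fixes the red point
```

The larger isotropy group of the red point goes together with its smaller orbit, which is reduced to the point itself.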

Going back to diffeomorphisms, we note that the isotropy group may be of infinite dimension: the isotropy group of a uniform image, i.e. $I$ constant over $\Omega$, is the whole group $\mathrm{Diff}(\Omega)$. Nevertheless, the orbit-stabilizer theorem holds in infinite dimensions, therefore we can rely on the intuition of “smaller” isotropy groups and “larger” orbits.

Figure 2. Orbit and isotropy group of a brain image. (a) $I_1$ and $I_2$ belong to the same orbit: $I_2$ is a diffeomorphic deformation of $I_1$. They have conjugate isotropy groups. In contrast, $I_3$ belongs to a different orbit and has a different isotropy group. $I_3$ shows more symmetry than $I_1$ and $I_2$, thus a larger isotropy group, whereas the more asymmetric details of $I_1$, $I_2$ are the sign of a smaller isotropy group. (b) The orbit of $I_1$, $I_2$ is represented as the blue dotted circle and the two images $I_1$, $I_2$ are points on this circle. The isotropy group is linked to the dimension of the image’s orbit. $I_1$, $I_2$ have a smaller isotropy group: they have a circle orbit. $I_3$ has a larger isotropy group: its orbit is itself, i.e. the red point at $(0, 0)$.

1.2. From geometry to topology.

Topology of (brain) images. The topology of a brain image $I$ is defined as the topology of its level sets, i.e. the surfaces of $\Omega$ with constant intensity. Topology refers to properties that are preserved under smooth deformations [16], i.e. conserved by the action of diffeomorphisms on $I$: for example, the number of holes, or the number of connected parts, see Figure 3(a).

Geometry and topology combine as follows. Two images $I$ and $I'$ that are diffeomorphic deformations of each other, i.e. that are on the same orbit, have the same topology. The orbit $O_I$ itself represents the topology of image $I$ (and $I'$). The set of orbits $Q = \{O_I \mid I \in \mathrm{Img}(\Omega)\}$, which is the quotient space of $\mathrm{Img}(\Omega)$ by the action of $\mathrm{Diff}(\Omega)$, is the set of the topologies.

Figure 3(b) shows how the space of images $\mathbb{R}^2$ is partitioned into orbits: blue circles and one red “singular circle”, the red point at $(0, 0)$. Figure 3(b) also shows $\mathbb{R}_+$, the quotient space of $\mathbb{R}^2$ by the group of 2D rotations, which schematically represents the quotient space of the space of brain images $\mathrm{Img}(\Omega)$ by the Lie group of diffeomorphisms $\mathrm{Diff}(\Omega)$. Each of the four blue circles in $\mathbb{R}^2$ becomes a blue point in the quotient space $\mathbb{R}_+$.

Figure 3. (a) Images in the same column have the same topology, images in the same row have different topologies: one cannot be diffeomorphically deformed to match the other. (b) Top: schematic representation of the space of images partitioned into orbits. The two different orbit types are in blue and red respectively. Bottom: schematic representation of the quotient space of brain images $\mathrm{Img}(\Omega)$ by the Lie group of diffeomorphisms $\mathrm{Diff}(\Omega)$.

Gathering brain images with similar topologies: orbit types. By definition, two brain images are of the same orbit type if their isotropy groups are conjugate subgroups in the Lie group of diffeomorphisms. In particular, brain images that belong to the same orbit have the same orbit type. The type corresponding to the smallest isotropy group, in the sense of the inclusion among the subgroups of the diffeomorphism group, is sometimes called the principal type [1]. We use this appellation in this paper. For example, if the isotropy group of an orbit is reduced to the identity of the group of diffeomorphisms, then the orbit is necessarily of principal type. The orbits of principal type are called principal orbits. Other orbits are called singular orbits.

The blue circles on Figure 3(b) have the same orbit type: the images on these orbits have the identity $\{\mathrm{Id}\}$ of the Lie group as isotropy group. They are of principal type since $\{\mathrm{Id}\}$ is the smallest subgroup, in the sense of the inclusion, of the group of rotations.

Stratification of the space of topologies. In the space of brain images, we can gather orbits of the same orbit type: we gather the blue circles of Figure 3(b) into $\mathbb{R}^2 \setminus \{(0, 0)\}$ on the one hand, and keep the red dot $(0, 0)$ on the other hand. The orbit type itself is a submanifold of the space of brain images: $\mathbb{R}^2 \setminus \{(0, 0)\}$ or $\{(0, 0)\}$ in the schematic brain images space $\mathbb{R}^2$. Furthermore, these orbit type submanifolds form a stratification, meaning they fit together in a particularly nice way [38].

The quotient space $Q$ is also naturally partitioned into manifolds, and this partitioning is also a stratification. All in all, $Q$ is not a manifold, but it is composed of manifold pieces, and those pieces are called strata. There is a partial ordering of the strata in the quotient space, using the inclusion [22].

Figure 3(b) uses the analogy of $\mathbb{R}^2$ as the space of images and $SO(2)$ as the Lie group acting on it. It shows the orbits grouped by orbit type: the color blue denotes one orbit type and the color red another orbit type. We see for example that $Q = \mathbb{R}_+$ is stratified into one stratum being $\mathbb{R}_+^*$, corresponding to the stratum $\mathbb{R}^2 \setminus \{(0, 0)\}$ in the space of brain images, and one stratum being $\{0\}$, corresponding to the stratum $\{(0, 0)\}$ in the space of brain images.
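In the schematic setting, the quotient map and the two strata have an explicit form, which the following sketch spells out. It assumes the toy model only ($\mathbb{R}^2$ quotiented by $SO(2)$), and the names `quotient_map` and `stratum` are hypothetical.

```python
import math


def quotient_map(p):
    """Projection R^2 -> R^2 / SO(2) = R_+: an orbit is identified with its radius."""
    return math.hypot(p[0], p[1])


def stratum(p, tol=1e-12):
    """Orbit type of p: 'singular' for the fixed point, 'principal' otherwise."""
    return "singular" if quotient_map(p) < tol else "principal"


points = [(3.0, 4.0), (0.0, 5.0), (0.0, 0.0)]
for p in points:
    # The first two points share an orbit, hence the same point of the quotient.
    print(p, quotient_map(p), stratum(p))
```

Points on the same circle are sent to the same point of $\mathbb{R}_+$, and the partition of $\mathbb{R}_+$ into $\mathbb{R}_+^*$ and $\{0\}$ mirrors the partition of the plane into principal and singular orbits.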

1.3. Geometry of generative model and estimation procedure. In this subsection, we consider: $\mathrm{Diff}(\Omega) = C_b^1(\mathbb{R}^d, \mathbb{R}^d)$.

Generative model. The $n$ brain images $I_1, ..., I_n$ are interpreted with a generative deformable model: $I_i = \phi_i \cdot T + \epsilon_i$, $i = 1...n$, where each image $I_i \in \mathrm{Img}(\Omega)$ is a diffeomorphic deformation, by $\phi_i \in \mathrm{Diff}(\Omega)$, of a unique brain anatomy $T$, to which noise $\epsilon_i$ is added. The parameter $T$ represents the brain anatomy shared by the population. The transformations $\phi_i$ and the noises $\epsilon_i$ are i.i.d. realizations of random variables. The transformations $\phi_i$ follow a general law, which could be for example a Gaussian law in a finite-dimensional subspace of the Lie group, and the $\epsilon_i$ represent Gaussian noise on the space of images. We denote by $\sigma^2$ its variance. Definitions of distributions on finite dimensional Riemannian manifolds are taken from [36] and the Gaussian distributions for infinite dimensional spaces from [28].

The model can be interpreted by a three step generative procedure illustrated schematically in Figure 4. First, there is only the shared anatomy $T$. Second, the template $T$ is deformed with the diffeomorphism $\phi_i$ and gives a brain image $\phi_i \cdot T$. Third, we represent the measurement noise with $\epsilon_i$, which gives the observed brain image $I_i$.

Figure 4. Schematic illustration of the generative model of the brain images data. As before, the space of brain images is represented by the plane $\mathbb{R}^2$. (a) First step of the generative model: generate a brain anatomy. One usually assumes that there is a unique brain anatomy: $T$, in green. (b) Second step of the generative model: generate a deformation $\phi_i \in \mathrm{Diff}(\Omega)$ which is used to deform the template. The brain image $\phi_i \cdot T$ belongs to the orbit of $T$, represented by the green circle. (c) Third step of the generative model: generate noise $\epsilon_i$ in the space of images. The brain image $\phi_i \cdot T + \epsilon_i$ does not belong to the orbit of $T$ anymore.
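The three steps of Figure 4 can be simulated in the schematic setting. The sketch below is an illustration only, with assumptions that are not in the paper: the deformation $\phi_i$ is a rotation with a uniformly distributed angle, the noise $\epsilon_i$ is an isotropic Gaussian with variance $\sigma^2$ per coordinate, and `sample_image` is a hypothetical name.

```python
import math
import random


def sample_image(template, sigma, rng):
    """Draw I_i = phi_i . T + eps_i in the toy model (SO(2) acting on R^2)."""
    # Step (b): random deformation phi_i (assumed law: uniform rotation angle).
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    deformed = (c * template[0] - s * template[1],
                s * template[0] + c * template[1])  # stays on the orbit of T
    # Step (c): additive noise eps_i (assumed isotropic Gaussian).
    noise = (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    return (deformed[0] + noise[0], deformed[1] + noise[1])


rng = random.Random(0)
template = (1.0, 0.0)  # step (a): the unique anatomy T

noiseless = [sample_image(template, 0.0, rng) for _ in range(5)]
noisy = [sample_image(template, 0.3, rng) for _ in range(5)]

print([round(math.hypot(*p), 3) for p in noiseless])  # radii on the orbit of T
print([round(math.hypot(*p), 3) for p in noisy])      # radii off the orbit of T
```

Without noise, every sample stays on the circle through $T$; with noise, the samples leave this orbit, which is the situation of panel (c).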

Computing the template: an estimation procedure. Computing the brain template amounts to inverting the generative model: given the data, we want to estimate the parameter $T$. The transformations $\phi_i$ are hidden variables of the model. The natural statistical procedure to estimate $T$ in this context is the Expectation-Maximization (EM) algorithm [13]. The EM is an iterative procedure that maximizes the log-likelihood of the generative model with hidden variables. As such, the EM gives an asymptotically unbiased and consistent estimation of the brain anatomy $T$.

In practice, one does not use the EM algorithm because it is computationally expensive, especially when dealing with tridimensional images. Most neuroimaging pipelines rely on an approximation of the Expectation-Maximization algorithm to estimate the brain anatomy, called “Fast Approximation with Modes” in [2]. It runs as follows. Initialize the estimate with $\hat{T} = I_1$, i.e. one of the brain images from the database. Then, iterate the following two steps until convergence [24]:

\[ \text{(1)} \quad \hat{\phi}_i = \operatorname*{argmin}_{\phi \in \mathrm{Diff}(\Omega)} \; d_{\mathrm{Img}(\Omega)}(\hat{T}, \phi \cdot I_i) + \lambda \, \mathrm{Reg}(\phi), \quad \forall i \in \{1, ..., n\}, \]

\[ \text{(2)} \quad \hat{T} = \operatorname*{argmin}_{T \in \mathrm{Img}(\Omega)} \; \sum_{i=1}^{n} d_{\mathrm{Img}(\Omega)}(T, \hat{\phi}_i \cdot I_i)^2. \]

Step (1) is an estimation $\hat{\phi}_i$ of the diffeomorphisms $\phi_i$, and an approximation of the E-step of the EM algorithm. In practice, each brain image $I_i$ is registered to the current template estimate and $\hat{\phi}_i$ is the result of this registration. The term Reg is a regularization that ensures that the optimization has a solution. Showing that the estimates $\hat{\phi}_i$ and $\hat{T}$ exist is technical and beyond the scope of this paper.

Step (2) is the M-step of the EM algorithm: the maximization of the surrogate in the M-step amounts to the minimization of the variance of the projected data, i.e. of the sum of squared distances in step (2). This computes the updated template estimate as the mean intensity of the subjects’ images $I_i$, deformed with the mean deformation of the $\hat{\phi}_i$.

The registration step (1) amounts to aligning the $n$ subject images by transporting them along their orbits (see Figure 5(b)), i.e. projecting them in the quotient space (see Figure 5(c)). Step (2) averages the $n$ registered images (see Figure 5(d)).

Figure 5. Geometrization of an iteration of the template’s computation: (a) $n$ subjects’ images (black squares); (b) the $n$ images are registered, they travel on their orbit (the blue circles) to get aligned; (c) registered images; (d) the empirical brain template $\hat{T}$ (in yellow) is computed as the Fréchet mean of the $n$ registered images. How far is it from the unique anatomy $T$ of the generative model (in green): can we quantify $B_n$?
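The two steps pictured in Figure 5 can be written down explicitly in the schematic setting. The sketch below is a toy version under stated assumptions: the group is $SO(2)$ acting on $\mathbb{R}^2$, there is no regularization term, and registration has a closed form (rotate each image onto the ray through the current template). The names `register`, `update` and `estimate_template` are hypothetical.

```python
import math


def register(image, template):
    """Step (1), toy version: rotate `image` onto the ray through `template`."""
    norm_t = math.hypot(*template)
    norm_i = math.hypot(*image)
    if norm_t == 0.0:
        return image  # every rotation is optimal; keep the image as is
    # The closest point of the orbit of `image` to `template` has the radius
    # of `image` and the direction of `template`.
    return (norm_i * template[0] / norm_t, norm_i * template[1] / norm_t)


def update(registered):
    """Step (2): Euclidean mean of the registered images."""
    n = len(registered)
    return (sum(p[0] for p in registered) / n, sum(p[1] for p in registered) / n)


def estimate_template(images, n_iter=5):
    """Alternate steps (1) and (2), starting from the first image."""
    template = images[0]
    for _ in range(n_iter):
        template = update([register(im, template) for im in images])
    return template


# Noiseless subjects: the same anatomy of radius 2 seen under four rotations.
angles = [0.1, 1.0, 2.5, 4.0]
images = [(2.0 * math.cos(a), 2.0 * math.sin(a)) for a in angles]
estimate = estimate_template(images)
print(math.hypot(*estimate))
```

In this toy model the estimate keeps the direction of the initial image and takes the mean radius of the data as its norm, so that only its orbit is meaningful. Without noise, this orbit is the orbit of the anatomy.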

Evaluation of the procedure: definition of asymptotic bias $B_\infty$. We evaluate the template $\hat{T}$ as an estimator of the unique brain anatomy $T$ (see Figure 5(d)) given $n$ observations $I_i$, $i = 1...n$. We note that in other papers, $T$ may be called the template directly and $\hat{T}$ the template’s estimate.

We consider two measures of the accuracy of this estimator: its variance $V_n^2$ and bias $B_n$, which are defined as:

\[ V_n^2 = E\big((\hat{T} - E(\hat{T}))^2\big) \quad \text{and} \quad B_n = E(\hat{T} - T). \]

The bias for $n$ images $B_n$ is illustrated on Figure 5. We are interested in the asymptotic behavior $n \to +\infty$, i.e. when the number of brain images goes to infinity. One would expect that the procedure converges to the brain anatomy $T$ it is designed to estimate. When the estimator converges, its variance is asymptotically zero: $V_\infty^2 = 0$.
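The role of the two quantities can be made explicit through the standard decomposition of the mean squared error. In the worked equation below, we assume that the square in the definition of $V_n^2$ is read as the squared norm of the Hilbert space of images, and we write $\langle \cdot, \cdot \rangle$ for its inner product; this notation is ours.

\[
E\|\hat{T} - T\|^2
= E\|\hat{T} - E(\hat{T})\|^2
+ 2 \, \big\langle E\big(\hat{T} - E(\hat{T})\big), \, E(\hat{T}) - T \big\rangle
+ \|E(\hat{T}) - T\|^2
= V_n^2 + \|B_n\|^2 ,
\]

since $E(\hat{T} - E(\hat{T})) = 0$ and $B_n = E(\hat{T}) - T$. Hence, when the variance vanishes asymptotically, the mean squared error of the template tends to $\|B_\infty\|^2$, and the estimator converges to $T$ in mean square only if the asymptotic bias is zero.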

1.4. Geometry of the template estimator’s evaluation.

Estimation procedure interpreted as the Fréchet mean in the quotient space. We consider the estimation procedure of Equation 1.3. First, we note that the regularization term in step (1) forces the diffeomorphisms to be small, i.e. to be close to the identity and close to having an isometric action on the images. Second, we could consider the group of volumorphisms, the subgroup of diffeomorphisms that leave a volume form $V$ invariant, which is used to model incompressible fluids in continuum mechanics. In this case, using $y = \phi^{-1}(x)$ and the fact that the volume form is conserved:

\[ d(\phi \cdot I_1, \phi \cdot I_2)^2 = \int_\Omega \big(I_1 \circ \phi^{-1}(x) - I_2 \circ \phi^{-1}(x)\big)^2 \, dV(x) = \int_\Omega \big(I_1(y) - I_2(y)\big)^2 \, dV(y) = d(I_1, I_2)^2 \tag{1} \]

and the action is thus isometric.

In both of the above frameworks, it is thus reasonable to trade the regularization term for the assumption that the action is isometric in our model.

We model the estimation procedure as follows:

\[ \text{(1)} \quad \hat{\phi}_i = \operatorname*{argmin}_{\phi \in G} \; d_M(\hat{T}, \phi \cdot I_i), \quad \forall i \in \{1, ..., n\}, \]

\[ \text{(2)} \quad \hat{T} = \operatorname*{argmin}_{T \in M} \; \sum_{i=1}^{n} d_M(T, \hat{\phi}_i \cdot I_i)^2, \]

where $M$ is a generic Riemannian manifold and $G$ a Lie group acting on $M$ isometrically. This converges to a local minimum because each step decreases a cost that is bounded below by zero. The estimator computed with this procedure is:

\[ \hat{T} = \operatorname*{argmin}_{T \in M} \; \sum_{i=1}^{n} \min_{\phi \in G} d_M^2(T, \phi \cdot I_i). \tag{2} \]

The term $\min_{\phi \in G} d_M^2(T, \phi \cdot I_i)$ is the squared distance in the quotient space between $T$ and $I_i$. Thus Equation 2 defines the Fréchet mean on the quotient space [36].

The study of the set of solutions of algorithms in Equation 1.3 and Equation 1.4 is an 332

interesting direction of research but beyond the scope of this paper. However, we point out 333

that the existence and uniqueness of the solution of algorithm 1.4 is linked to the question 334

of whether the Fr´echet mean exists and is unique in the quotient space, which is studied for 335

example in [25,4]. 336
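The two-step procedure above can be sketched on the toy example used in the illustrations, $M = \mathbb{R}^2$ with the action of $G = SO(2)$. This is a minimal sketch under stated assumptions, not the neuroimaging pipeline: the function names, the parameter values (d = 1, σ = 0.1, n = 100000) and the generative model (a randomly rotated template plus isotropic Gaussian noise) are all chosen here for illustration only.

```python
import math
import random


def register(template, image):
    """Step (1): rotate `image` about the origin to bring it closest to `template`.

    For SO(2) acting on R^2 the optimal rotation aligns the polar angles,
    so the registered image keeps its norm and takes the template's angle.
    """
    radius = math.hypot(image[0], image[1])
    angle = math.atan2(template[1], template[0])
    return (radius * math.cos(angle), radius * math.sin(angle))


def estimate_template(images, n_iter=3):
    """Alternate registration (step 1) and averaging (step 2)."""
    t_hat = images[0]  # arbitrary initialisation (hypothetical choice)
    for _ in range(n_iter):
        registered = [register(t_hat, image) for image in images]
        n = len(registered)
        t_hat = (sum(p[0] for p in registered) / n,
                 sum(p[1] for p in registered) / n)
    return t_hat


def simulate_images(d=1.0, sigma=0.1, n=100_000, seed=0):
    """Assumed generative model: randomly rotated template plus Gaussian noise."""
    rng = random.Random(seed)
    images = []
    for _ in range(n):
        a = rng.uniform(0.0, 2.0 * math.pi)
        images.append((d * math.cos(a) + rng.gauss(0.0, sigma),
                       d * math.sin(a) + rng.gauss(0.0, sigma)))
    return images


images = simulate_images()
t_hat = estimate_template(images)
# Distance of the estimate to the fixed point, minus the true distance d = 1.
print(math.hypot(t_hat[0], t_hat[1]) - 1.0)
```

In this toy setting the estimate is only defined up to a rotation, so the meaningful quantity is its distance to the fixed point of the action, which can be compared with the true distance d.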

Asymptotic bias $B_\infty$ and curvature. We show in [35] that the asymptotic bias is non zero: $B_\infty \neq 0$. For an infinite number of brain images n → +∞, the estimate converges, but not to the brain anatomy T it was designed for. We compute in [35] a Taylor expansion of the asymptotic bias $B_\infty$ around the noise level σ = 0, in the case of a finite dimensional manifold and isometric Lie group action [34]:

$$B_\infty = \frac{\sigma^2}{2} H(T) + O(\sigma^3) + \epsilon(\sigma) \tag{3}$$

where H(T) denotes the mean curvature vector of the template's orbit. There is no bias when there is no measurement error, i.e. when σ = 0. It was observed experimentally that the bias was dependent on the measurement error [2].

The coefficient H(T) depends on the template T that is being estimated. We investigate this dependency in the geometric framework of Subsection 1.2. Assume that there exists a fixed point o of the Lie group action, i.e. a point that is invariant by the whole Lie group. Consider the orbit $O_T$ of T. As the action is isometric, the orbit belongs to a geodesic sphere $S_d$ with center o and radius d. A geodesic sphere of radius d in a manifold - like a hypersphere of radius d in $\mathbb{R}^m$ - has mean curvature vector of norm: $\|H(T)\| = \frac{m-1}{d}$. If we write the template's bias in the units of d, then the asymptotic bias depends on $\frac{\sigma^2}{d^2}$.

In other words, the distance of the template to the singularity o, at the scale of the noise σ, governs the asymptotic bias $B_\infty$. The approach of this paper is to compute the asymptotic bias by estimating the geometric parameters. Conversely, one could compute the geometric parameters, like the external curvature of orbits for different Lie groups, from the asymptotic bias known on simulated examples.
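The leading term of Equation 3 can be checked numerically on $M = \mathbb{R}^m$ with the action of SO(m), where the orbit of T is the hypersphere of radius d and the predicted bias is $\frac{\sigma^2}{2} \cdot \frac{m-1}{d}$. The sketch below is illustrative only. It assumes isotropic Gaussian noise and uses the fact that, for this action, the quotient distance between two points is the difference of their norms, so the Fréchet mean in the quotient is the average norm. Function names and parameter values are hypothetical.

```python
import math
import random


def quotient_frechet_mean_radius(m, d, sigma, n, seed=0):
    """Average norm of n noisy observations of a template at distance d from o.

    By rotational symmetry the template is placed at (d, 0, ..., 0); the random
    rotation of the generative model does not change the norms and is skipped.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        first = d + rng.gauss(0.0, sigma)
        rest_sq = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(m - 1))
        total += math.sqrt(first * first + rest_sq)
    return total / n


def predicted_bias(m, d, sigma):
    """Leading term of Equation 3 with ||H(T)|| = (m - 1) / d."""
    return 0.5 * sigma ** 2 * (m - 1) / d


for m in (2, 10, 20):
    empirical = quotient_frechet_mean_radius(m, 1.0, 0.1, 20_000) - 1.0
    print(m, empirical, predicted_bias(m, 1.0, 0.1))
```

The empirical and predicted values should agree only to leading order in σ; the gap is expected to widen as m or σ grows, since the expansion neglects higher-order terms.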

Figure 6 shows the intuition behind this. In Figure 6(a), $\mathbb{R}^2$ schematically represents the space of brain images - the black squares represent brain images from the database, the green square is T - and the green circle is the orbit of T. The dotted circles, which have their centers on the template's orbit, represent the probability distribution of the (2D isotropic) Gaussian noise in the generative model. More precisely, they represent the level set at σ of the noise distribution. The curvature H(T) controls the area in grey in Figure 6, which is the area inside the Gaussian level set that is outside T's orbit. This area is greater than the area of the level set that lies inside T's orbit. As a consequence, the probability that the brain images are generated "outside" T's orbit is higher than the probability that they are generated inside T's orbit.

Figure 6(b) shows the registration step of the template estimation: there is a higher probability that the registered images are away from T, as if repulsed from the singularity around which the orbits wrap. When one averages the registered images, one sees that the template's estimate becomes biased, as it will systematically give an image that is further away than T from the quotient space's singularity, i.e. from the red dot.

Quantifying the asymptotic bias $B_\infty$ of the brain template. In neuroimaging, the manifold is the space of brain images Img(Ω) and the Lie group is the group of diffeomorphisms Diff(Ω), both infinite dimensional. This paper assumes that we can apply the geometry of [34], because there are indications that $B_\infty$ appears in the same fashion in infinite dimension, see Figure 7.

First, [34] studies the bias when the dimension of the manifold increases. They consider the finite dimensional manifold $M = \mathbb{R}^m$ with the action of SO(m), i.e. a generalization of the toy example $\mathbb{R}^2$ with the action of SO(2) from our illustrations. We show in [34] that $B_\infty$ increases when m increases, see Figure 7.

Figure 6. Schematic illustration of the asymptotic bias in the template computation algorithm of neuroimaging. (a) The images generated with the model described above have a higher probability to be "outside" (with respect to the curvature) of the template's orbit. (b) The registration during the template's estimation aligns the images. The distribution of images is unbalanced with respect to the real template.

Figure 7. Image from [34]. Take $M = \mathbb{R}^m$ with the action of SO(m). The template's bias increases with σ and is more important as m increases: the blue curve shows the asymptotic bias for m = 2, the pink curve for m = 10 and the yellow curve for m = 20.

Then, the work of [3] shows that there exists an asymptotic bias in an infinite dimensional Hilbert space. The work of [14] provided an asymptotic behavior of the bias when the noise level σ tends to infinity. This bias is illustrated on template signals in [14], see Figure 8. Although the 1D signals are discretized for the numeric implementations, they represent 1D functions that are elements of an infinite dimensional Hilbert space.

Figure 8. Image from [14]. The real signal is shown in blue. The template, computed with $n = 10^5$ observations simulated with Gaussian noise with σ = 10, is shown in red. There is an asymptotic bias in the estimation of the signal.

Finally, [9] gives a lower bound of the asymptotic bias for shapes of curves in 2D, where terms depend on derivatives of the functions representing the curve. These derivatives can be interpreted as the derivative of the action of translations, which leads to the curvature of the orbit of the given function under the translations' action. As a consequence of these examples, we assume that the intuition provided by [34] applies to neuroimaging and that the bias depends on the ratio $\frac{\sigma^2}{d^2}$.

2. Computational representation of geometry and topology. Now that we have seen that the geometrization of template estimation leads us to investigate the topology of images, we show in this section how Morse-Smale complexes can be used to encode the topology and help the estimation of the geometric parameters d and σ previously introduced.

2.1. Definition of Morse-Smale complexes for (brain) images.

Morse-Smale (intensity) functions. A real-valued smooth map I : Ω → R is a Morse function if all its critical points are non-degenerate (the Hessian matrix is non-singular) and no two critical points have the same function value. The intensity function I representing a bi- or tri-dimensional brain image is a Morse function, at least after a convolution with a smoothing Gaussian [11]. In the following, I represents a brain image. Morse theory traditionally analyzes the topology of a manifold by studying the Morse functions on that manifold. Here, the manifold is known: it is the image domain Ω. We are not interested in the topology of Ω but rather in the topology of the functions I themselves, that is: we would like to know the distribution of their critical points. Figure 9(a) shows a 2D slice of a 3D brain image I, where the intensity is represented as the height, to better emphasize its maxima and minima: the maxima are in red and the minima in blue.

Figure 9. (a) Intensity I on a 2D brain image visualized as height: maxima are in red, minima in blue. This is a Morse-Smale function $I : \Omega = \mathbb{R}^2 \to \mathbb{R}$. (b) Persistences $p_1 < p_2$ of two min-max pairs. A threshold $p_1 < p < p_2$ divides the domain into 3 regions (pink, violet, green), while $p < p_1$ divides it into 5 regions (pink, violet, brown, turquoise, green). (c) Computational representation of the geometry: regions on a 2D domain, induced by an MS complex with threshold p = 0.1: (i) the MS graph represents the isotropy group's class of the image, (ii) the labeled MS graph represents the image's orbit under the diffeomorphisms.

We introduce the notions of integral lines, ascending and descending manifolds that are needed to define Morse-Smale (intensity) functions. An integral line is a maximal path in the image domain Ω whose tangent vector coincides with the intensity gradient ∇I at every point. This notion comes from autonomous ordinary differential equations, where it represents the trajectory of a system verifying:

$$\frac{dx}{dt}(x) = \nabla I(x) \tag{4}$$

Each integral line starts and ends at critical points of I, where the gradient ∇I is zero. Ascending $A(x_i)$ and descending $D(x_j)$ manifolds of respective extrema $x_i$ and $x_j$ are defined as:

$A(x_i) = \{x \in \Omega \mid \text{the integral line going through } x \text{ ends at } x_i\}$

$D(x_j) = \{x \in \Omega \mid \text{the integral line going through } x \text{ starts at } x_j\}$

Take the two manifolds $A(x_i)$ and $D(x_j)$ in Ω and assume they intersect at a point p ∈ Ω. Let $T_A$ (resp. $T_D$) denote the set of all vectors tangent to $A(x_i)$ (resp. $D(x_j)$) at p. If every vector tangent to Ω at p is the sum of a vector in $T_A$ and a vector in $T_D$, then $A(x_i)$ and $D(x_j)$ are said to intersect transversely at the point p. The intensity function I defining the brain image is Morse-Smale if the ascending and descending manifolds only intersect transversely. We assume in the following that all brain images I are Morse-Smale.

Morse-Smale complex and persistence. The Morse-Smale complex of a Morse-Smale function is the set of intersections $A(x_i) \cap D(x_j)$, over all combinations of extrema $(x_i, x_j)$ [19]. The Morse-Smale complex includes regions (i.e., sub-manifolds of Ω) of dimensions 0 through D, where D is the dimension of the domain Ω, i.e. D = 2 or D = 3 for our purposes. The Morse-Smale (MS) complex of I is a partition of the domain Ω into regions defined by the set of integral lines that share common starting and ending points. The interior of each region is monotonic with respect to the intensity I: a region contains no critical points and has a single local minimum and maximum on its boundary, see for example Figure 9(c)(ii) where the maximum $I_{max}$ and minimum $I_{min}$ are shown on the boundary of the grey region. The MS complex can also be seen as a graph on the brain image domain Ω whose nodes are the critical points of the brain image intensity.

The persistence of a critical point $x_i$ of I is the amount of change in intensity I required to remove this critical point:

$$p(x_i) = |I(x_i) - I(n(x_i))| \tag{5}$$

where $n(x_i)$ is the critical point closest to $x_i$ in intensity, among the critical points connected to $x_i$ by an integral line [16]. The persistence of $x_i$ is a measure of its significance as a critical point, i.e. of the importance of the topological feature. Figure 9(b) illustrates the definition of persistence on a 1D example. The function represented has 4 critical points: two minima and two maxima. The figure shows how they pair, as well as their persistence. On the x-axis, colors show the regions of the corresponding 1D Morse-Smale complex.

Besides this usual definition of persistence of a critical point, we define here the persistence of a region of the Morse-Smale complex as the amount of change in intensity required to remove this region from the MS complex, and more precisely:

$$p(\text{region}) = |I_{max} - I_{min}| \tag{6}$$

where $I_{max}$ and $I_{min}$ are respectively the maximum and the minimum in intensity of this region. In contrast to the definition of the persistence of a critical point, we do not rely on the saddle points, but only on the extrema, i.e. the minima and maxima.

Hierarchy of Morse-Smale complexes. The notion of persistence of a region enables the definition of a hierarchy of MS complexes of one brain image I [19,16]. One uses the ordering given by persistence to successively remove topological features from the image I. One starts with the MS complex of the brain image I defined above and one recursively removes the critical points with minimal persistence. This leads to a nested series of successively simplified Morse-Smale complexes. At each level, some of the MS regions are merged into a single region. Ultimately the Morse-Smale complex consists of only one region which is the entire domain Ω.

The persistence introduces a notion of scale at which the Morse-Smale complex of I is considered. One keeps only the nodes whose persistence is above the threshold. Figure 9(b) shows that the one dimensional domain is partitioned differently if one takes a threshold p below $p_1$ or between $p_1$ and $p_2$. We say that a Morse-Smale complex is represented at a given persistence level. At the scale of the persistence threshold p, the intensity is considered monotonic on each region of the MS complex.

We note that this MS hierarchy is different from a Gaussian scale space (GSS) hierarchy of images [37]. The latter takes critical points across smoothing scales and not across persistence levels.

2.2. Computing Morse-Smale complexes of (brain) images in practice. The previous definitions are relevant to (continuous) Morse-Smale theory and apply, strictly speaking, only to a continuous intensity function I. Nevertheless, the MS complex, introduced in terms of ascending and descending manifolds, can be computed for discrete brain images as follows [16]. We choose the approach based on integral lines to compute the Morse-Smale complex. Other approaches may also be considered, like the Delaunay triangulation [12].

Computing the Morse-Smale complex of a brain image. We compute the Morse-Smale complex of a brain image, which will later be the brain template image. Our inputs are $\{x_i, I_i\}$, i.e. the intensity values $\{I_i\}$ on a grid $\{x_i\}$ of Ω. We compute the integral lines of the intensity gradient, which we then gather to get the regions of the Morse-Smale complex. For each element $x_i$ of the grid, following the gradient ∇I leads to computing the integral line going through $x_i$ and in particular its starting and ending points [16]. The domain Ω can be approximated via a k nearest-neighbor graph and one computes the integral lines by considering the connectivity of the graph. Then, elements $x_i$ with the same starting and ending points belong to the same Morse-Smale region. This gives the partition of the domain Ω and therefore the Morse-Smale complex. We remark that the $x_i$ necessarily belong to a 3-dimensional (for a tridimensional image) component of the Morse-Smale complex because the 0-, 1- and 2-dimensional components have measure zero.
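The grouping of grid points by the start and end of their integral line can be sketched as follows. This is an illustrative simplification rather than the method of [16]: it assumes a regular 2D grid stored as a list of rows, replaces the k nearest-neighbor graph with the 8-neighborhood, and approximates an integral line by repeatedly moving to the highest (or lowest) neighbor. All function names are hypothetical.

```python
def best_neighbor(grid, i, j, sign):
    """Highest (sign=+1) or lowest (sign=-1) point among (i, j) and its neighbors."""
    rows, cols = len(grid), len(grid[0])
    best = (i, j)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            a, b = i + di, j + dj
            if 0 <= a < rows and 0 <= b < cols:
                if sign * grid[a][b] > sign * grid[best[0]][best[1]]:
                    best = (a, b)
    return best


def follow(grid, i, j, sign):
    """Follow the discrete integral line from (i, j) until a local extremum."""
    while True:
        nxt = best_neighbor(grid, i, j, sign)
        if nxt == (i, j):
            return (i, j)
        i, j = nxt


def morse_smale_regions(grid):
    """Group grid points by the (minimum, maximum) pair of their integral line."""
    regions = {}
    for i in range(len(grid)):
        for j in range(len(grid[0])):
            start = follow(grid, i, j, -1)  # descending leads to the starting minimum
            end = follow(grid, i, j, +1)    # ascending leads to the ending maximum
            regions.setdefault((start, end), []).append((i, j))
    return regions


def region_persistence(grid, key):
    """Persistence of a region as in Equation 6: |I_max - I_min|."""
    (mi, mj), (xi, xj) = key
    return abs(grid[xi][xj] - grid[mi][mj])


# A one-row "image" with two minima and two maxima, as in a 1D example.
grid = [[0, 1, 2, 1, 0.5, 3]]
regions = morse_smale_regions(grid)
for key, points in regions.items():
    print(key, points, region_persistence(grid, key))
```

Grid points that are themselves extrema lie on region boundaries in the continuous setting; in this sketch they are simply assigned to one adjacent region by the tie-breaking order of the neighbor scan.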

Figure 9(c) shows the Morse-Smale complex of the 2D slice of a 3D brain image for a persistence level of p = 0.1. The image's 2D domain is divided into different regions, represented by the different colors. The quadrant shows part of the underlying Morse-Smale graph. The red dot represents a maximum in intensity, and the blue dot a minimum in intensity. They are nodes of the underlying graph on the domain Ω.

Morse-Smale (MS) graph and labeled Morse-Smale (MS) graph. There are two ways of representing the Morse-Smale graph corresponding to the computed Morse-Smale complex. Both will be useful for analyzing the template's asymptotic bias. One can consider the graph as the set of nodes and edges, without any intensity information at the nodes. We simply call this graph the Morse-Smale (MS) graph: this is the graph illustrated in Figure 9(c)(i). Alternatively, one can label the nodes with the intensity information. We call this graph the labeled Morse-Smale (MS) graph: this is the graph illustrated in Figure 9(c)(ii). Both of these graphs are oriented, the edges being directed in the direction of the intensity gradient from a node to the next.

2.3. Template's computation and Morse-Smale complexes. We show how the MS complex of an image can represent its geometry and in particular its isotropy group.

Lie algebra of the isotropy group and intensity gradient of the brain template. The template is an image I ∈ Img(Ω). We consider the Lie group of diffeomorphisms $\mathrm{Diff}(\Omega) = C^\infty(\Omega)$ and its Lie algebra V, see Subsection 1.1.

Lemma 1. Take ε > 0 and consider the Lie algebra:

$$V_I = \{v \in V \text{ s.t. } I \circ \mathrm{Exp}(tv) = I, \ \forall |t| < \epsilon\} \tag{7}$$

By construction, the exponentials of the elements of $V_I$ are in the isotropy group of I. For $v \in V_I$, we have:

$$\forall x \in \Omega, \quad \nabla I(x)^T . v(x) = 0 \tag{8}$$

Proof. Take $v \in V_I$ and |t| < ε. Its group exponential is a diffeomorphism in the isotropy group $G_I$, which can be written:

$$\phi = \exp(tv) = \mathrm{Id} + tv + O(t^2)$$

Then, the equation above leads to:

$$I(x) = I(x - tv(x) + O(t^2))$$
$$I(x) = I(x) - DI(x).(tv(x)) + O(t^2)$$
$$0 = DI(x).(tv(x)) + O(t^2)$$

The identification of the coefficients in this Taylor expansion leads to:

$$\nabla I(x).v(x) = 0$$

A vector field of $V_I$, which is in the Lie algebra of the isotropy group of the image I, is perpendicular to the image's gradient at any point x of the image's domain Ω.
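Lemma 1 can be illustrated on a case where everything is explicit. The sketch below assumes a rotationally symmetric image $I(x, y) = e^{-(x^2 + y^2)}$ on $\mathbb{R}^2$ and the vector field $v(x, y) = (-y, x)$, whose flow is the rotation of angle t about the origin. Both choices, and all function names, are hypothetical and serve only to check the two statements of the lemma numerically.

```python
import math


def intensity(x, y):
    """Rotationally symmetric toy image (assumption of this sketch)."""
    return math.exp(-(x * x + y * y))


def gradient(x, y):
    """Analytic gradient of the toy image."""
    scale = -2.0 * intensity(x, y)
    return (scale * x, scale * y)


def v(x, y):
    """Infinitesimal generator of the rotations about the origin."""
    return (-y, x)


def exp_tv(t, x, y):
    """Flow of v at time t: rotation of angle t."""
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)


def check_lemma(points, t=0.3, tol=1e-12):
    """Check I o Exp(tv) = I and grad I . v = 0 at the given points."""
    for x, y in points:
        gx, gy = gradient(x, y)
        vx, vy = v(x, y)
        if abs(gx * vx + gy * vy) > tol:
            return False
        px, py = exp_tv(t, x, y)
        if abs(intensity(px, py) - intensity(x, y)) > tol:
            return False
    return True


print(check_lemma([(0.5, 0.2), (-1.0, 0.7), (0.0, 1.5)]))
```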

We note that this lemma does not give a characterization of the vector fields in $V_I$. It gives the inclusion: $V_I \subset \{v \mid \forall x \in \Omega, \nabla I(x).v(x) = 0\}$. It thus bounds the complexity of $V_I$, and hence of part of the isotropy group's Lie algebra. It represents a first step towards the following conjecture.

Conjecture 1. The Lie algebra of the isotropy group $G_I$ of the brain image I is constituted of vector fields that are everywhere perpendicular to the image's gradient.

Sketch of Proof 1. The proof will study the relations between the following three sets of vector fields:

- the Lie algebra $g_I$ of $G_I$,
- the set $V_I = \{v \in V \text{ s.t. } I \circ \mathrm{Exp}(tv) = I, \ \forall |t| < \epsilon\}$,
- the set $V_\perp = \{v \mid \forall x \in \Omega, \nabla I(x).v(x) = 0\}$.

• Proving $g_I = V_I$ by proving the double inclusion: (a) $g_I \subset V_I$ and (b) $V_I \subset g_I$.

The inclusion (a) may be difficult to prove directly because of the lack of characterization of the infinite dimensional Lie algebra $g_I$. The inclusion (b) could make use of the above Lemma, which shows that $\mathrm{Exp}(V_I) \subset G_I$. We need to use the group logarithm in order to transform an inclusion on Lie groups into an inclusion on Lie algebras. The group logarithm is defined as the reciprocal of the group exponential, on its domain of bijectivity. As the domain of bijectivity is not well characterized in the general case, it is difficult to make use of this.

• Proving $V_I = V_\perp$ by proving the double inclusion: (a) $V_I \subset V_\perp$ and (b) $V_\perp \subset V_I$.

The inclusion (a) is proven in Lemma 1. The inclusion (b) may be proven in two steps. First, we would write the full Taylor expansion of $I \circ \mathrm{Exp}(tv)$ where v is an element of $V_\perp$ and show that the terms of order higher than one vanish. At this point, the expression of the Taylor expansion is unclear, as is whether the condition v ⊥ ∇I is sufficient to make the terms of order higher than one vanish. In a second step, we would need to translate the inclusion written in terms of Lie groups into an inclusion written in terms of Lie algebras, which is not trivial according to the paragraph above.

To understand the intuition behind this conjecture, consider a uniform image, i.e. with a constant intensity. In this case, there are no restrictions a priori on the vector fields of the Lie algebra of the image's isotropy group. Thus, the isotropy group is as large as it can be. We get that the isotropy group of an image with constant intensity is the whole group of diffeomorphisms.

Intensity gradient and MS graph. We now present a lemma showing that the MS graph can be used to computationally represent the isotropy group of an image. Take two images $I_1$ and $I_2$. Assume that their MS graphs are the same, regardless of the nodes' positions.

Lemma 2. Assume we can map the level sets of $I_1$ to the level sets of $I_2$. This implies that we can map the partition of the domain of $I_1$ induced by its Morse-Smale complex to the partition of the domain of $I_2$.

Take a part w ⊂ Ω of this partition. There exists a diffeomorphism ψ and a function κ such that: $\nabla I_2(x) = \kappa(x).d^*\psi(x).\nabla I_1 \circ \psi(x), \ \forall x \in w$.

Proof. We consider a part w ⊂ Ω of the partition of the domain of $I_1$ induced by its Morse-Smale complex. On this part, there exists a function f such that:

$$I_1 = f \circ I_2 \circ \psi \tag{9}$$

The function f gives the mapping of intensity levels on each level set. Differentiating the previous equation with the chain rule gives:

$$dI_1 = df(I_2 \circ \psi).dI_2 \circ \psi.d\psi \tag{10}$$

where $df(I_2 \circ \psi)$ is a scalar which we denote κ. Taking the adjoint gives:

$$\nabla I_1 = \kappa.\nabla I_2 \circ \psi.d^*\psi \tag{11}$$

This lemma is the first step towards the following conjecture.

Conjecture 2. Two images with the same MS graph have the same isotropy group.

Sketch of Proof 2. The images $I_1$ and $I_2$ have the same MS graph. The graph of $I_1$, taken with the nodes' and edges' positions on Ω, can be diffeomorphically deformed onto the graph of $I_2$. We take $\psi_1$ a diffeomorphism that realizes the graphs' matching.

Now, $I_1 \circ \psi_1^{-1}$ and $I_2$ share the same MS graph, taken with the nodes' and edges' positions on Ω. We consider one cell of this graph.

We consider the integral lines of the respective gradients $\nabla\psi_1.\nabla I_1 \circ \psi_1$ and $\nabla I_2$ on the cell. Both define a "parallel" partition of the cell. As a consequence, the set of integral lines of the gradient of $I_1 \circ \psi_1^{-1}$ can be mapped diffeomorphically to the set of integral lines of the gradient of $I_2$. We take $\psi_2$ a diffeomorphism that realizes the matching of the integral lines. At this step, the notion of parallel partition would need to be written as a definition, together with the proof of the existence of $\psi_2$.

Here, we would need to combine the $\psi_2$'s obtained on the different cells of the MS complex into one transformation on Ω that we also write $\psi_2$. A study of the behavior of $\psi_2$ on the edges of the MS complex should show that $\psi_2$ is itself a diffeomorphism on Ω. Then, we write $\psi = \psi_2 \circ \psi_1$. At the end of this step, $I_1$ is transformed to $I_1 \circ \psi^{-1}$.

The gradients of $I_1 \circ \psi^{-1}$ and $I_2$ share the same integral lines. Therefore ψ also maps the level sets of $I_1$ to the level sets of $I_2$. Using Lemma 2, we conclude.

In other words: if two images $I_1$, $I_2$ have the same MS graph, then $I_1$ can be diffeomorphically deformed so that its intensity gradient is parallel at every point to the intensity gradient of $I_2$.

From Lemma 1, the sets $\{v_1 \mid \forall x \in \Omega, \nabla I_1(x).v_1(x) = 0\}$ and $\{v_2 \mid \forall x \in \Omega, \nabla I_2(x).v_2(x) = 0\}$ control the isotropy groups of $I_1$ and $I_2$. If $I_1$ and $I_2$ have the same MS graph, any vector field in the first set can be diffeomorphically deformed to get a vector field in the second set, and conversely. As a consequence, the MS graph represents the image's isotropy group. From Section 1, we know that the isotropy group controls, in turn, the orbit type of the image, i.e. to which stratum the image belongs.

Furthermore, we note that we could have considered the labeled MS graph of the image, i.e. the MS graph with intensities at the nodes, see Figure 9(c)(ii). The labeled MS graph controls the orbit of the image: images in the same orbit have the same topology but also the same intensities.

3. Topology quantifies and controls the template's asymptotic bias. This Section gathers the elements of Sections 1 and 2 to quantify the asymptotic bias in the brain template computation. We use Morse-Smale complexes to quantify, and then control, the bias.

3.1. Quantify the template inconsistency.

Understand and estimate the geometric parameter d. The distance d is the distance of the current image to a brain image with larger isotropy group, measured in sum of squared differences of intensities, see Figure 6. How can we measure this distance d locally on the template's image? From Section 1, we know that the isotropy group becomes larger when the image is "more symmetric". From Section 2, we know that the isotropy group becomes larger when the image topology becomes simpler. Thus, the distance d is a distance in intensity from the template image to a similar image with simpler topology.

We want to express this distance locally on the template image. Modifying the intensity locally on the template image modifies the image itself and may simplify its topology. For example, modifying the intensity locally in a region of the image can suppress a min-max pair and the image becomes "more symmetric". Thus we describe the distance d locally on the image by the amount of intensity that needs to be changed in this region so that the topology is simplified.

We quantify the local intensity needed to simplify the template image's topology using the Morse-Smale complex representation of Section 2. Suppose we are given the Morse-Smale complex of the template image. The intensity needed to simplify the image's topology is, by definition, the intensity needed to simplify the Morse-Smale graph. We consider the partition of the image's domain Ω induced by the Morse-Smale complex. For each region of the partition, the intensity needed to simplify the topology can be represented by the amount of intensity needed to remove the min-max pair of the region:

$$\hat d(\text{region}) = \hat T_{max} - \hat T_{min} = p_{\hat T}(\text{region}) \tag{12}$$

This quantifies the importance of the region as a representative of the brain anatomy: if the intensity difference between the region's min and max is low, then one can assume that this min-max pair has been created by chance because of the noise on the images. We see that the notion of persistence defined in Section 2 estimates the first geometric parameter d.

Equation3and Figure6. The standard deviation σ of the noise is a parameter of the generative 636

model that we assume has produced the observed images of the subjects brain anatomies. The 637

parameter σ is unknown but it can be estimated from the observed images. Since we want to 638

compute the asymptotic bias locally, we are interested in estimating the parameter σ locally, 639

and for example on a region of the Morse-Smale complex of the template image. We estimate 640

it as the average of the variability in intensity of the registered images in the region: 641 (13) σ(region) =ˆ 1 #x X x∈region ˆ σ(x), 642

where ˆσ(x) is the variability in intensity of the registered images at the voxel x, and serves as 643

an estimate of the noise at this voxel. This quantifies the amount of noise in this region. The 644

larger the standard deviation σ of the noise, the more chances for the template to show min-645

max pairs that appeared by chance. One could use other estimator of the standard deviation, 646

for example the sample standard deviation. Future work may investigate these estimators and 647

their impact on the estimator of the template’s bias. 648

Compute the asymptotic bias using the persistence of the whitened brain template. The local estimates of the geometric parameters d and σ enable us to estimate the asymptotic bias locally on a brain region:

$$\hat B_\infty(\text{region}) = \left( \frac{\hat d(\text{region})}{\hat\sigma(\text{region})} \right)^{-2}. \tag{14}$$

We emphasize here that $\hat B_\infty$ is an estimate of the asymptotic bias $B_\infty$ (of the brain template estimation), and not an exact computation.

We link the estimate $\hat B_\infty$ to the definition of persistence in the Morse-Smale complex framework. First, we define the whitened brain template estimate $\hat t$ of $\hat T$ as:

$$\forall x \in \Omega, \quad \hat t(x) = \frac{\hat T(x)}{\hat\sigma(x)}. \tag{15}$$

In other words, we divide the brain template intensity of each voxel x by the estimate $\hat\sigma(x)$ of the standard deviation of the noise at this voxel. This whitens the noise all over the brain template.

We assume that the critical points of $\hat t$ are close to the critical points of $\hat T$ and consider the Morse-Smale complex of the whitened template. We further assume that $\hat\sigma(\text{region}) \simeq \hat\sigma(max) \simeq \hat\sigma(min)$, where $\hat\sigma(max)$, $\hat\sigma(min)$ are the variabilities at the respective max and min of the Morse-Smale complex. We can write:

$$\hat B_\infty(\text{region}) \simeq \left( \frac{\hat T_{max}}{\hat\sigma(max)} - \frac{\hat T_{min}}{\hat\sigma(min)} \right)^{-2} = \big( \hat t(max) - \hat t(min) \big)^{-2} = p_{\hat t}(\text{region})^{-2}, \tag{16}$$

where we recognize the persistence $p_{\hat t}(\text{region})$ of the corresponding region of the whitened template $\hat t$. This links the estimation of the asymptotic bias to the persistence of the whitened template's Morse-Smale complex. This shows how a topological property of the image in fact represents a statistical property of this image as the estimate of the brain template.
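Equations 12 to 16 translate into a few lines of code once a region, its min-max pair and a voxel-wise noise estimate are available. The sketch below is illustrative: the template and noise maps are stored as plain dictionaries indexed by voxel, the function names are hypothetical, and the voxel-wise variability $\hat\sigma(x)$ is taken as given rather than estimated from registered images.

```python
def region_bias(template, sigma, region, min_voxel, max_voxel):
    """Bias estimate of Equation 14 from d-hat (Eq. 12) and sigma-hat (Eq. 13)."""
    d_hat = template[max_voxel] - template[min_voxel]
    sigma_hat = sum(sigma[x] for x in region) / len(region)
    return (d_hat / sigma_hat) ** -2


def whitened_region_bias(template, sigma, min_voxel, max_voxel):
    """Bias estimate of Equation 16 from the whitened template of Equation 15."""
    t_max = template[max_voxel] / sigma[max_voxel]
    t_min = template[min_voxel] / sigma[min_voxel]
    return (t_max - t_min) ** -2


# Toy region of three voxels with a constant noise level (made-up values).
template = {0: 1.0, 1: 2.0, 2: 5.0}
sigma = {0: 2.0, 1: 2.0, 2: 2.0}
print(region_bias(template, sigma, [0, 1, 2], min_voxel=0, max_voxel=2))
print(whitened_region_bias(template, sigma, min_voxel=0, max_voxel=2))
```

When the noise level is constant over the region, the two estimates coincide; when it varies, they differ by the approximation made just before Equation 16.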

Hierarchy of the whitened template. The persistence of the whitened template quantifies locally the asymptotic bias, i.e. how far the brain template is from the unique brain anatomy of the generative model. Is there a statistical interpretation of the hierarchies of Morse-Smale complexes, introduced in Section 2? Let us consider another Morse-Smale complex of the whitened template's hierarchy, i.e. a Morse-Smale complex computed at a given persistence threshold $p_{threshold}$. There is a corresponding asymptotic bias threshold, which we can write: $p_{threshold}^{-1/2}$. The regions kept in the new Morse-Smale complex are those having a persistence higher than the persistence threshold $p_{threshold}$, i.e. those having an asymptotic bias lower than the asymptotic bias threshold $p_{threshold}^{-1/2}$.

Therefore, if we can impose the topology of the brain template to match the new Morse-Smale complex of threshold $p_{threshold}^{-1/2}$, we control its asymptotic bias. This means that we preserve only the min-max pairs shown on the chosen Morse-Smale graph. It eliminates the min-max pairs that have been created by chance, because the noise on the images was at a level similar to the intensity signal on these regions. The next subsection explains how to impose the topology of a given Morse-Smale complex on the template's image.
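The selection of regions by persistence can be sketched as a simple filter. This is a minimal illustration with made-up values and hypothetical names. It assumes that the persistences of the whitened template's regions are already computed, and it uses the estimate of Equation 16 to turn a persistence into a bias estimate. Since that estimate decreases with persistence, keeping regions above a persistence threshold is the same as keeping regions below a bias threshold.

```python
def bias_from_persistence(p):
    """Bias estimate associated with a whitened persistence p (form of Equation 16)."""
    return p ** -2


def keep_regions(persistences, p_threshold):
    """Keep the regions whose whitened persistence exceeds the threshold."""
    return {name: p for name, p in persistences.items() if p > p_threshold}


# Hypothetical whitened persistences of three regions.
persistences = {"r1": 0.5, "r2": 2.0, "r3": 4.0}
kept = keep_regions(persistences, p_threshold=1.0)
for name, p in kept.items():
    print(name, p, bias_from_persistence(p))
```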

3.2. Controlling the template's asymptotic bias by constraining its topology. We are given the template's image and we want to force its asymptotic bias to be below a threshold, so that it is closer to estimating the anatomy of the database, i.e. the anatomy shared by the subject brains. The development above suggests computing the Morse-Smale complex with a persistence threshold corresponding to the desired bias threshold. Then, enforcing the template's topology to match the Morse-Smale complex will control its asymptotic bias. This enforcement procedure is called "Topological denoising".

3.2.1. Topological denoising. Topological denoising is a procedure for smoothing an image, like our template image, while preserving topological features [23,18]. The input of the procedure is the intensity function defining the template $\hat T : \Omega \to \mathbb{R}$ and an MS complex with intensity values at its nodes. Enforcing the template's topology to match the MS complex means that we compute $\hat T' : \Omega \to \mathbb{R}$, which is a smoothed version of the original template estimate $\hat T$ containing only the intensity min-max pairs specified by the chosen MS complex. $\hat T'$ should be otherwise as close as possible to the original template estimate $\hat T$ in terms of intensity. The values and positions of the MS extrema are preserved, while all other extrema are removed from the brain template estimate. Such a procedure provides control over the topology of the brain image $\hat T$.

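Formally, the original Topological denoising problem is written as the minimization [23]:

$$
\begin{aligned}
\operatorname*{argmin}_{T_0} \quad & \sum_{x_i \in \Omega} \|\hat{T}(x_i) - T_0(x_i)\|^2 + \int_\Omega \|\Delta T_0\|^2 \\
\text{s.t.} \quad & T_0(x_i) = \hat{T}(x_i) && \text{for } x_i \text{ a node of the MS complex} \\
& T_0(x_j) > T_0(x_i) && \text{for } x_j \text{ neighbor of } x_i \text{ and } x_i \text{ minimum} \\
& T_0(x_j) < T_0(x_i) && \text{for } x_j \text{ neighbor of } x_i \text{ and } x_i \text{ maximum} \\
& T_0(x_i) > \min_{\text{neighbor } x_j} T_0(x_j) && \text{for } x_i \text{ not an extremum} \\
& T_0(x_i) < \max_{\text{neighbor } x_j} T_0(x_j) && \text{for } x_i \text{ not an extremum.}
\end{aligned}
$$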
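To make the objective of this minimization concrete, here is a minimal sketch that evaluates it on a pixel grid. It assumes a 5-point finite-difference Laplacian with replicated borders and unit weights on both terms; these discretization choices and the function names are ours, not those of [23].

```python
import numpy as np


def discrete_laplacian(img):
    """5-point Laplacian; borders are handled by replicating edge pixels."""
    padded = np.pad(img, 1, mode="edge")
    return (padded[:-2, 1:-1] + padded[2:, 1:-1]
            + padded[1:-1, :-2] + padded[1:-1, 2:] - 4.0 * img)


def denoising_energy(template, candidate):
    """Data fidelity plus smoothness term of the denoising objective."""
    fidelity = np.sum((template - candidate) ** 2)
    smoothness = np.sum(discrete_laplacian(candidate) ** 2)
    return fidelity + smoothness


# A constant candidate has zero Laplacian, so only the fidelity term remains.
template = np.zeros((4, 5))
energy_same = denoising_energy(template, template)
energy_shifted = denoising_energy(template, template + 1.0)
```

The sketch only evaluates the objective; the equality and inequality constraints of the problem are not enforced here.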

The nonlinear inequality constraints make this optimization problem hard to solve. The solution suggested by [23] is to compute a representative function $u$ that verifies the last four inequality constraints. Given this function $u$, the topological denoising problem becomes:

$$
\begin{aligned}
\operatorname*{argmin}_{T_0} \quad & \sum_{x_i \in \Omega} \|\hat{T}(x_i) - T_0(x_i)\|^2 + \int_\Omega \|\Delta T_0\|^2 \\
\text{s.t.} \quad & T_0(x_i) = \hat{T}(x_i) && \text{for } x_i \text{ a node of the MS complex} \\
& (T_0(x_i) - T_0(x_j))(u(x_i) - u(x_j)) > 0 && \text{for } (x_i, x_j) \text{ a pair of neighbors,}
\end{aligned}
$$

where the last constraint means that the direction of $T_0$ shall be aligned with the direction of $u$. For a fixed $u$, the constraints are linear in $T_0$ and the objective is quadratic, so this alternative optimization problem is easily solved [23].
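The alignment constraint can be checked directly on a pixel grid. The sketch below assumes 4-connectivity (horizontal and vertical neighbors); the function name `is_aligned` is hypothetical and the neighborhood used in [23] may differ.

```python
import numpy as np


def is_aligned(t0, u):
    """True if (t0[a] - t0[b]) * (u[a] - u[b]) > 0 for all 4-neighbor pairs."""
    for axis in (0, 1):
        dt = np.diff(t0, axis=axis)  # differences between neighbors along axis
        du = np.diff(u, axis=axis)
        if not np.all(dt * du > 0):
            return False
    return True


# A representative function that increases strictly along both axes.
u = np.add.outer(np.arange(4), 10 * np.arange(5)).astype(float)

aligned = is_aligned(3.0 * u + 1.0, u)   # same ordering as u
reversed_ = is_aligned(-u, u)            # opposite ordering
```

Because the inequality is strict, a candidate that is flat between two neighbors is rejected as well.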

The representative function $u$ can be computed by solving the Dirichlet problem:

$$
\begin{aligned}
\operatorname*{argmin}_{u} \quad & \int_\Omega \|\nabla u\|^2 \\
\text{s.t.} \quad & u(x_i) = 0 && \text{for } x_i \text{ a minimum} \\
& u(x_i) = 1 && \text{for } x_i \text{ a maximum.}
\end{aligned}
$$

Minimizers of the Dirichlet energy are harmonic functions, and their properties guarantee that the constrained points $x_i$ are minima and maxima and that $u$ contains no other extrema inside the MS regions. We refer to [23] for further details.
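A discrete analogue of this Dirichlet problem can be sketched as follows. We assume a 4-connected pixel grid, on which minimizing the sum of squared differences between neighbors amounts to requiring every unconstrained pixel to equal the average of its neighbors; the function name and the dense linear solve are illustrative choices, not the implementation of [23].

```python
import numpy as np


def representative_function(shape, minima, maxima):
    """Discrete harmonic u with u = 0 at `minima` and u = 1 at `maxima`."""
    rows, cols = shape
    n = rows * cols

    def index(r, c):
        return r * cols + c

    fixed = {index(r, c): 0.0 for r, c in minima}
    fixed.update({index(r, c): 1.0 for r, c in maxima})

    A = np.zeros((n, n))
    b = np.zeros(n)
    for r in range(rows):
        for c in range(cols):
            i = index(r, c)
            if i in fixed:
                # Dirichlet condition: the value is imposed at this pixel.
                A[i, i] = 1.0
                b[i] = fixed[i]
                continue
            neighbors = [(r + dr, c + dc)
                         for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                         if 0 <= r + dr < rows and 0 <= c + dc < cols]
            # Harmonicity: degree * u_i - sum of neighbor values = 0.
            A[i, i] = len(neighbors)
            for rr, cc in neighbors:
                A[i, index(rr, cc)] = -1.0
    return np.linalg.solve(A, b).reshape(shape)


u = representative_function((5, 5), minima=[(0, 0)], maxima=[(4, 4)])
```

Since each unconstrained pixel is the mean of its neighbors, it cannot be a strict local extremum, which is the discrete counterpart of the property used above. A dense solve is only practical for small grids; a sparse solver would be needed for real images.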

Figure 10 shows examples of topological denoising. The topology to be enforced is represented by the red and blue dots, which are nodes of the MS complex: red for intensity maxima and blue for intensity minima. In the left example, the circle motifs that were inducing undesirable minima and maxima are removed. In the right example, two of the initial four maxima in the center of the image are removed too. Only the topology dictated by the input Morse-Smale complex is preserved.

Figure 10. Topological denoising on two toy examples. We impose topological constraints on the initial images, on the left in both cases: minima in blue and maxima in red. The arrows denote the action of the topological denoising and point to the output image.

3.2.2. Integrating the topological denoising in the template computation pipeline. The original template’s computation is performed with the algorithm of [24] and uses the LCC Log-demons for the registrations [29]. We adapt it by adding a Topological denoising step, in order to control the template’s asymptotic bias.

Algorithm 1 shows the adapted procedure. One initializes the template with one of the subject images: $\hat{T}_1 = I_1$. At each iteration $k$ of the template’s computation, one registers the subject images to the current template $\hat{T}_k$ and averages the intensities of the registered images to get a first version of the updated template $\hat{T}_{k+1}$. So far, this matches the usual template estimation procedure. Our adaptation is what follows. The MS complex of the updated template $\hat{T}_{k+1}$ is computed, using the R package msr [16]. Then, the updated template $\hat{T}_{k+1}$ is smoothed using Topological denoising, see Figure 12. These steps are iterated until convergence.

Algorithm 1 Controlled brain template estimation

Input: Images $\{I_i\}_{i=1}^n$, noise variance $\sigma^2$, persistence threshold $p_{\text{threshold}}$

Initialization: $\hat{T}_1 = I_1$ (one of the subject images), $k = 1$

Repeat:

  Non-linearly register the images to $\hat{T}_k$, i.e. compute $\phi_k^i$: $J_k^i \simeq I_i \circ \phi_k^i$

  Compute the mean deformation: $\bar{\phi}_k$

  Register subject image: $L_k^i = I_i \circ \phi_k^i \circ \bar{\phi}_k^{-1}$

  Compute the mean intensity image for template iteration: $\hat{T}_{k+1} = \frac{1}{n} \sum_{i=1}^n L_k^i$

  Compute the MS complex of $\hat{T}_{k+1}$ at persistence level $p_{\text{threshold}}$

  Topological denoising of $\hat{T}_{k+1}$ using the MS complex

  $k \leftarrow k + 1$

until convergence: $\|\hat{T}_k - \hat{T}_{k+1}\| < \epsilon$

Output: $\hat{T}_k$
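The control flow of Algorithm 1 can be sketched as below. All five callables (`register`, `mean_deformation`, `resample`, `morse_smale`, `topological_denoising`) are hypothetical placeholders for the registration, MS complex and denoising tools named in the text, and `tol` and `max_iter` are assumed stopping parameters; the usage at the end replaces every step with an identity stub purely to exercise the loop.

```python
import numpy as np


def estimate_template(images, register, mean_deformation, resample,
                      morse_smale, topological_denoising,
                      p_threshold, tol=1e-6, max_iter=50):
    """Iterate registration, averaging and topological denoising."""
    template = images[0]  # initialization with one of the subject images
    for _ in range(max_iter):
        # Register every subject image to the current template.
        deformations = [register(img, template) for img in images]
        mean_def = mean_deformation(deformations)
        # Resample each image, recentred by the mean deformation.
        registered = [resample(img, phi, mean_def)
                      for img, phi in zip(images, deformations)]
        averaged = np.mean(registered, axis=0)
        # Adaptation: constrain the topology of the averaged image.
        ms_complex = morse_smale(averaged, p_threshold)
        updated = topological_denoising(averaged, ms_complex)
        if np.linalg.norm(updated - template) < tol:
            return updated
        template = updated
    return template


# Identity stubs: with no deformation and no denoising, the loop
# reduces to plain intensity averaging.
images = [np.zeros((2, 2)), 2.0 * np.ones((2, 2))]
result = estimate_template(
    images,
    register=lambda img, tmpl: None,
    mean_deformation=lambda defs: None,
    resample=lambda img, phi, mean_def: img,
    morse_smale=lambda img, p: None,
    topological_denoising=lambda img, msc: img,
    p_threshold=1.0,
)
```

Passing the steps as arguments keeps the sketch independent of any particular registration or Morse-Smale library.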

The main parameter controlling this adapted procedure is the asymptotic bias threshold, i.e. the persistence threshold $p_{\text{threshold}}$ for the MS complex computation. The next section discusses the choice of this parameter $p_{\text{threshold}}$. Varying the threshold $p_{\text{threshold}}$ leads to the construction of a hierarchy of templates. The other parameter is $\sigma$, which is the noise on the