Geodesic Regression - Thomas Fletcher - Geodesic Regression and Its Application to Shape Analys

Geodesic Regression and Its Application to Shape Analysis

P. Thomas Fletcher

2.3 Geodesic Regression

Lety1; : : : ; yN be points on a smooth Riemannian manifoldM, with associated scalar valuesx1; : : : ; xN 2 R. The goal of geodesic regression is to find a geodesic curve#onM that best models the relationship between thexiand theyi. Just as in linear regression, the speed of the geodesic will be proportional to the independent parameter corresponding to the xi. Estimation will be set up as a least-squares problem, where we want to minimize the sum-of-squared Riemannian distances between the model and the data. A schematic of the geodesic regression model is shown in Fig.2.1.

Before formulating the model, we review a few basic concepts of Riemannian geometry. We will write an element of the tangent bundle as the pair.p;v/2TM,

Fig. 2.1 Schematic of the geodesic regression model

wherep is a point inM andv2TpM is a tangent vector atp. Recall that for any .p;v/2 TM there is a unique geodesic curve#, with initial conditions#.0/D p and#⁰.0/Dv. This geodesic is only guaranteed to exist locally. When#is defined over the intervalå 0; 1$, the exponential map atp is defined as Exp_p.v/ D #.1/.

In other words, the exponential map takes a position and velocity as input and returns the point at time1 along the geodesic with these initial conditions. The exponential map is locally diffeomorphic onto a neighborhood of p. Let V .p/

be the largest such neighborhood. Then within V .p/the exponential map has an inverse, the Riemannian log map, Log_p WV .p/!TpM. For any pointq 2V .p/

the Riemannian distance function is given by d.p; q/ D kLog_p.q/k. It will be convenient to include the pointpas a parameter in the exponential and log maps, i.e., define Exp.p;v/DExp_p.v/and Log.p; q/DLog_p.q/.

Notice that the tangent bundleTMserves as a convenient parameterization of the set of possible geodesics onM. An element.p;v/2 TM provides an interceptp and a slopev, analogous to the and parameters in the multiple linear regression model (2.1). In fact, is a vector in the tangent spaceT RⁿŠRⁿ, and thus. ; /is an element of the tangent bundleTRⁿ. Now consider anM-valued random variable Yand a non-random variableX2R. The generalization of the multiple linear model to the manifold setting is thegeodesic model,

Y DExp.Exp.p; Xv/;"/; (2.3)

where" is a random variable taking values in the tangent space at Exp.p; Xv/.

Notice that for Euclidean space, the exponential map is simply addition, i.e., Exp.p;v/DpCv. Thus, the geodesic model coincides with (2.1) whenM DRⁿ.

2.3.1 Least Squares Estimation

Consider a realization of the model (2.3):.x_i; y_i/ 2 R!M, for i D 1; : : : ; N. Given this data, we wish to find estimates of the parameters.p;v/ 2 TM. First, define the sum-of-squared error of the data from the geodesic given by.p;v/as

E.p;v/D 1 2

iD1

d.Exp.p; xiv/; y_i/²: (2.4)

dpExp dvExp Fig. 2.2 Jacobi fields as derivatives of the exponential map

Following the ordinary least squares minimization problem given by (2.2), we formulate a least squares estimator of the geodesic model as a minimizer of the above sum-of-squares energy, i.e.,

.p;O Ov/Darg min

.p;v/E.p;v/: (2.5)

Again, notice that this problem coincides with the ordinary least squares problem whenM DRⁿ.

Unlike the linear setting, the least squares problem in (2.5) for a general manifold Mwill typically not yield an analytic solution. Instead we derive a gradient descent algorithm. Computation of the gradient of (2.4) will require two parts: the derivative of the Riemannian distance function and the derivative of the exponential map.

Fixing a pointp2M, the gradient of the squared distance function isrxd.p; x/²D

"2Log_x.p/forx2V .p/.

The derivative of the exponential map Exp.p;v/can be separated into a derivative with respect to the initial point p and a derivative with respect to the initial velocityv. To do this, first consider a variation of geodesics given byc₁.s; t / D Exp.Exp.p; su₁/; tv.s//, whereu₁ 2 T_pM defines a variation of the initial point along the geodesic%.s/DExp.p; su₁/. Here we have also extendedv2T_pM to a vector fieldv.s/along%via parallel translation. This variation is illustrated on the left side of Fig.2.2. Next consider a variation of geodesicsc₂.s; t /DExp.p; su₂C tv/, where u₂ 2 T_pM. (Technically, u₂ is a tangent to the tangent space, i.e., an element ofT_v.T_pM /, but there is a natural isomorphismT_v.T_pM / Š T_pM.) The variation c₂ produces a “fan” of geodesics as seen on the right side of Fig.2.2.

Now the derivatives of Exp.p;v/with respect topandvare given by dpExp.p;v/$u₁D d

dsc1.s; t /

sD0 DJ1.1/

d_vExp.p;v/$u₂D d dsc₂.s; t /

sD0 DJ₂.1/;

whereJ_i.t /are Jacobi fieldsalong the geodesic#.t / D Exp.p; tv/. Jacobi fields are solutions to the second order equation

D²

dt²J.t /CR.J.t /;#⁰.t //#⁰.t /D0; (2.6) whereRis the Riemannian curvature tensor. For more details on the derivation of the Jacobi field equation and the curvature tensor, see for instance [3]. The initial conditions for the two Jacobi fields above areJ₁.0/Du₁; J₁⁰.0/D0andJ₂.0/D0, J₂⁰.0/ D u₂, respectively. If we decompose the Jacobi field into a component tangential to# and a component orthogonal, i.e., J D J^>CJ^?, the tangential component is linear:J^>.t /D u^>₁ Ctu^>₂. Therefore, the only challenge is to solve for the orthogonal component.

Finally, the gradient of the sum-of-squares energy in (2.4) is given by

rpE.p;v/D"

iD1

dpExp.p; xiv/^&Log.Exp.p; xiv/; y_i/;

rvE.p;v/D"

iD1

x_id_vExp.p; x_iv/^&Log.Exp.p; x_iv/; y_i/;

where we have taken the adjoint of the exponential map derivative, e.g., defined byhd_pExp.p;v/u;wi D hu; d_pExp.p;v/^&wi. As we will see in the next section, formulas for Jacobi fields and their respective adjoint operators can often be derived analytically for many useful manifolds.

2.3.2 R

Statistics and Hypothesis Testing

In regression analysis the most basic question one would like to answer is whether the relationship between the independent and dependent variables is significant.

A common way to test this is to see if the amount of variance explained by the model is high. For geodesic regression we will measure the amount of explained variance using a generalization of the R² statistic, or coefficient of determination, to the manifold setting. To do this, we first define predicted values ofy_iand the errors"_ias

y_i DExp.p; xO _iv/;O O

"_i DLog.yO_i; y_i/;

where .p;O v/O are the least squares estimates of the geodesic parameters defined above. Note that theyOi are points along the estimated geodesic that are the best predictions of the yi given only the xi. The"Oi are the residuals from the model predictions to the true data.

Now to define the total variance of data,y₁; : : : ; y_N 2 M, we use the Fr´echet

The unexplained variance is the variance of the residuals, var."Oi/ D _N¹ P kO"ik². From the definition of the residuals, it can be seen that the unexplained variance is the mean squared distance of the data to the model, i.e., var."Oi/D _N¹ P

d.yOi; yi/². Using these two variance definitions, the generalization of theR² statistic is then given by

R²D1"unexplained variance

total variance D1"var."Oi/

var.yi/: (2.7) Fr´echet variance coincides with the standard definition of variance whenM DRⁿ. Therefore, it follows that the definition ofR²in (2.7) coincides with theR²for linear regression whenM D Rⁿ. Also, because Fr´echet variance is always nonnegative, we see thatR² % 1, and thatR² D 1 if and only if the residuals to the model are exactly zero, i.e., the model perfectly fits the data. Finally, it is clear that the residual variance is always smaller than the total variance, i.e., var."O_i/% var.yi/.

This is because we could always choosepOto be the Fr´echet mean and v D 0 to achieve var."O_i/D var.yi/. Therefore,R² & 0, and it must lie in the intervalŒ0; 1$, as is the case for linear models.

We now describe a permutation test for testing the significance of the estimated slope term,v. Notice that if we constrainO vto be zero in (2.5), then the resulting least squares estimate of the intercept,p;O will be the Fr´echet mean of they_i. The desired hypothesis test is whether the fraction of unexplained variance is significantly decreased by also estimatingv. The null hypothesis isH₀ W R² D 0, which is the case if the unexplained variance in the geodesic model is equal to the total variance.

Under the null hypothesis, there is no relationship between theX variable and the Y variable. Therefore, thex_i are exchangeable under the null hypothesis, and a per-mutation test may randomly reorder thex_idata, keeping they_ifixed. Estimating the geodesic regression parameters for each random permutation of thex_i, we can cal-culate a sequence ofR²values,R₁²; : : : ; R²_m, which approximate the sampling distri-bution of theR²statistic under the null hypothesis. Computing the fraction of theR²_k that are greater than theR²estimated from the unpermuted data gives us ap-value.

Dans le document Mathematics and Visualization (Page 61-65)