APPENDIX: DERIVATION OF HYPERSPHERE VOLUME

$\mathbf{x}^T\mathbf{x}$ follows a $\chi^2$ distribution with $d$ degrees of freedom, which has mean $d$ and variance $2d$. It follows that the mean and variance of the random variable $r^2$ are

$$\mu_{r^2} = d \qquad \sigma^2_{r^2} = 2d$$

By the central limit theorem, as $d \to \infty$, $r^2$ is approximately normal with mean $d$ and variance $2d$, which implies that $r^2$ is concentrated about its mean value of $d$. As a consequence, the distance $r$ of a point $\mathbf{x}$ to the center of the standard multivariate normal is likewise approximately concentrated around its mean $\sqrt{d}$.

Next, to estimate the spread of the distance $r$ around its mean value, we need to derive the standard deviation of $r$ from that of $r^2$. Assuming that $\sigma_r$ is much smaller than $r$, then using the fact that $\frac{d \log r}{dr} = \frac{1}{r}$, after rearranging the terms, we have

$$\frac{dr}{r} = d \log r = \frac{1}{2}\, d \log r^2$$

Using the fact that $\frac{d \log r^2}{dr^2} = \frac{1}{r^2}$, and rearranging the terms, we obtain

$$\frac{dr}{r} = \frac{1}{2}\, \frac{dr^2}{r^2}$$

which implies that $dr = \frac{1}{2r}\, dr^2$. Setting the change in $r^2$ equal to the standard deviation of $r^2$, we have $dr^2 = \sigma_{r^2} = \sqrt{2d}$, and setting the mean radius $r = \sqrt{d}$, we have

$$\sigma_r = dr = \frac{1}{2\sqrt{d}}\, \sqrt{2d} = \frac{1}{\sqrt{2}}$$

We conclude that for large $d$, the radius $r$ (or the distance of a point $\mathbf{x}$ from the origin $\mathbf{0}$) follows a normal distribution with mean $\sqrt{d}$ and standard deviation $1/\sqrt{2}$.
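The large-$d$ behavior of the radius can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function name `sample_radii`, the dimension $d = 100$, the sample size, and the seed are illustrative choices, not part of the derivation.

```python
import math
import random
import statistics

def sample_radii(d, n, seed=0):
    """Draw n points from the d-dimensional standard normal and
    return their distances from the origin."""
    rng = random.Random(seed)
    radii = []
    for _ in range(n):
        squared_norm = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d))
        radii.append(math.sqrt(squared_norm))
    return radii

d, n = 100, 2000                      # illustrative values (assumptions)
radii = sample_radii(d, n)
mean_r = statistics.fmean(radii)      # should be close to sqrt(d)
std_r = statistics.stdev(radii)       # should be close to 1/sqrt(2)

print(f"mean radius = {mean_r:.3f}, sqrt(d)   = {math.sqrt(d):.3f}")
print(f"std  radius = {std_r:.3f}, 1/sqrt(2) = {1 / math.sqrt(2):.3f}")
```

Because the result is asymptotic, the agreement is only approximate for finite $d$ and a finite sample.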

Nevertheless, the density at the mean distance $\sqrt{d}$ is exponentially smaller than the peak density because

$$\frac{f(\mathbf{x})}{f(\mathbf{0})} = \exp\{-\mathbf{x}^T\mathbf{x}/2\} = \exp\{-d/2\}$$

Combined with the fact that the probability mass migrates away from the mean in high dimensions, we have another interesting observation, namely that, whereas the density of the standard multivariate normal is maximized at the center $\mathbf{0}$, most of the probability mass (the points) is concentrated in a small band around the mean distance of $\sqrt{d}$ from the center.
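The contrast between peak density and probability mass can be illustrated numerically. The sketch below is an illustration under stated assumptions: the helper names `density_ratio` and `fraction_in_radius_range`, the dimension $d = 50$, the band half-width of 2, and the sample size are all hypothetical choices made for the example.

```python
import math
import random

def density_ratio(x):
    """Return f(x)/f(0) = exp(-x^T x / 2) for the standard multivariate normal."""
    return math.exp(-sum(v * v for v in x) / 2)

def fraction_in_radius_range(d, n, lo, hi, seed=1):
    """Estimate by simulation the probability mass with lo <= ||x|| <= hi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        r = math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d)))
        if lo <= r <= hi:
            hits += 1
    return hits / n

d = 50                                   # illustrative dimension (assumption)
root_d = math.sqrt(d)
point_at_mean_radius = [1.0] * d         # one point with x^T x = d
ratio = density_ratio(point_at_mean_radius)
near_center = fraction_in_radius_range(d, 1000, 0.0, 2.0)
in_band = fraction_in_radius_range(d, 1000, root_d - 2.0, root_d + 2.0)

print(f"f(x)/f(0) at radius sqrt(d): {ratio:.3e}")
print(f"fraction of points within distance 2 of the center: {near_center}")
print(f"fraction of points within 2 of sqrt(d): {in_band}")
```

The density ratio is tiny, yet essentially all simulated points fall in the band around $\sqrt{d}$ rather than near the center.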

6.7 APPENDIX: DERIVATION OF HYPERSPHERE VOLUME

The volume of the hypersphere can be derived via integration using spherical polar coordinates. We consider the derivation in two and three dimensions, and then for a general $d$.

[Figure: axes $X_1$ and $X_2$, with the point $(x_1, x_2)$ at distance $r$ from the origin and angle $\theta_1$.]

Figure 6.9. Polar coordinates in two dimensions.

Volume in Two Dimensions

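As illustrated in Figure 6.9, in $d = 2$ dimensions, the point $\mathbf{x} = (x_1, x_2) \in \mathbb{R}^2$ can be expressed in polar coordinates as follows:

$$x_1 = r\cos\theta_1 = r c_1 \qquad x_2 = r\sin\theta_1 = r s_1$$

where $r = \|\mathbf{x}\|$, and we use the notation $\cos\theta_1 = c_1$ and $\sin\theta_1 = s_1$ for convenience.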

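The Jacobian matrix for this transformation is given as

$$\mathbf{J}(\theta_1) = \begin{pmatrix} \dfrac{\partial x_1}{\partial r} & \dfrac{\partial x_1}{\partial \theta_1} \\[2ex] \dfrac{\partial x_2}{\partial r} & \dfrac{\partial x_2}{\partial \theta_1} \end{pmatrix} = \begin{pmatrix} c_1 & -r s_1 \\ s_1 & r c_1 \end{pmatrix}$$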

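The determinant of the Jacobian matrix is called the Jacobian. For $\mathbf{J}(\theta_1)$, the Jacobian is given as

$$\det(\mathbf{J}(\theta_1)) = r c_1^2 + r s_1^2 = r(c_1^2 + s_1^2) = r \tag{6.15}$$

Using the Jacobian in Eq. (6.15), the volume of the hypersphere in two dimensions can be obtained by integration over $r$ and $\theta_1$ (with $r > 0$, and $0 \le \theta_1 \le 2\pi$)

$$\begin{aligned}
\text{vol}(S_2(r)) &= \int_r \int_{\theta_1} \left|\det(\mathbf{J}(\theta_1))\right| \, dr \, d\theta_1 \\
&= \int_0^r \int_0^{2\pi} r \, dr \, d\theta_1 = \int_0^r r \, dr \int_0^{2\pi} d\theta_1 \\
&= \frac{r^2}{2}\bigg|_0^r \cdot \theta_1 \bigg|_0^{2\pi} = \pi r^2
\end{aligned}$$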
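The double integral can also be approximated numerically as a sanity check. This is a minimal sketch, assuming a midpoint rule on a regular grid; the function name `disk_area_polar` and the grid sizes are illustrative.

```python
import math

def disk_area_polar(radius, n_r=200, n_theta=200):
    """Midpoint-rule approximation of the double integral of
    |det J| = r over 0 <= r <= radius and 0 <= theta_1 <= 2*pi."""
    dr = radius / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r_mid = (i + 0.5) * dr           # midpoint of the i-th radial cell
        for _ in range(n_theta):         # integrand does not depend on theta_1
            total += r_mid * dr * dtheta
    return total

radius = 2.0
print(disk_area_polar(radius), math.pi * radius ** 2)
```

Since the integrand is linear in $r$, the midpoint rule is exact here up to floating-point rounding, so the two printed values agree closely.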


[Figure: the point $(x_1, x_2, x_3)$ at distance $r$ from the origin, with angles $\theta_1$ and $\theta_2$.]

Figure 6.10. Polar coordinates in three dimensions.

Volume in Three Dimensions

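As illustrated in Figure 6.10, in $d = 3$ dimensions, the point $\mathbf{x} = (x_1, x_2, x_3) \in \mathbb{R}^3$ can be expressed in polar coordinates as follows:

$$\begin{aligned}
x_1 &= r\cos\theta_1\cos\theta_2 = r c_1 c_2 \\
x_2 &= r\cos\theta_1\sin\theta_2 = r c_1 s_2 \\
x_3 &= r\sin\theta_1 = r s_1
\end{aligned}$$

where $r = \|\mathbf{x}\|$, and we used the fact that the dotted vector that lies in the $X_1$–$X_2$ plane in Figure 6.10 has magnitude $r\cos\theta_1$.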

The Jacobian matrix is given as

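$$\mathbf{J}(\theta_1, \theta_2) = \begin{pmatrix}
\dfrac{\partial x_1}{\partial r} & \dfrac{\partial x_1}{\partial \theta_1} & \dfrac{\partial x_1}{\partial \theta_2} \\[2ex]
\dfrac{\partial x_2}{\partial r} & \dfrac{\partial x_2}{\partial \theta_1} & \dfrac{\partial x_2}{\partial \theta_2} \\[2ex]
\dfrac{\partial x_3}{\partial r} & \dfrac{\partial x_3}{\partial \theta_1} & \dfrac{\partial x_3}{\partial \theta_2}
\end{pmatrix} = \begin{pmatrix}
c_1 c_2 & -r s_1 c_2 & -r c_1 s_2 \\
c_1 s_2 & -r s_1 s_2 & r c_1 c_2 \\
s_1 & r c_1 & 0
\end{pmatrix}$$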

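The Jacobian is then given as

$$\begin{aligned}
\det(\mathbf{J}(\theta_1, \theta_2)) &= s_1(-r s_1)(c_1)\det(\mathbf{J}(\theta_2)) - r c_1 \, c_1 c_1 \det(\mathbf{J}(\theta_2)) \\
&= -r^2 c_1 (s_1^2 + c_1^2) = -r^2 c_1
\end{aligned} \tag{6.16}$$

where $\det(\mathbf{J}(\theta_2)) = r$ by Eq. (6.15), with $\theta_2$ in place of $\theta_1$. In computing this determinant we made use of the fact that if a column of a matrix $\mathbf{A}$ is multiplied by a scalar $s$, then the resulting determinant is $s\det(\mathbf{A})$. We also relied on the fact that the $(3,1)$-minor of $\mathbf{J}(\theta_1, \theta_2)$, obtained by deleting row 3 and column 1, is actually $\mathbf{J}(\theta_2)$ with the first column multiplied by $-r s_1$ and the second column multiplied by $c_1$. Likewise, the $(3,2)$-minor of $\mathbf{J}(\theta_1, \theta_2)$ is $\mathbf{J}(\theta_2)$ with both the columns multiplied by $c_1$.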

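The volume of the hypersphere for $d = 3$ is obtained via a triple integral with $r > 0$, $-\pi/2 \le \theta_1 \le \pi/2$, and $0 \le \theta_2 \le 2\pi$:

$$\begin{aligned}
\text{vol}(S_3(r)) &= \int_r \int_{\theta_1} \int_{\theta_2} \left|\det(\mathbf{J}(\theta_1, \theta_2))\right| \, dr \, d\theta_1 \, d\theta_2 \\
&= \int_0^r r^2 \, dr \int_{-\pi/2}^{\pi/2} \cos\theta_1 \, d\theta_1 \int_0^{2\pi} d\theta_2 \\
&= \frac{r^3}{3} \cdot 2 \cdot 2\pi = \frac{4}{3}\pi r^3
\end{aligned}$$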
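With the integrand $|\det(\mathbf{J}(\theta_1, \theta_2))| = r^2 \cos\theta_1$ from Eq. (6.16), the triple integral over the stated ranges factors into three one-dimensional integrals. The sketch below approximates each factor with a midpoint rule and compares the product with the familiar $\frac{4}{3}\pi r^3$; the helper names `midpoint_integral` and `sphere_volume_polar` are illustrative.

```python
import math

def midpoint_integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def sphere_volume_polar(radius):
    """Volume of the 3-dimensional ball as a product of three 1-d integrals."""
    radial = midpoint_integral(lambda r: r * r, 0.0, radius)          # r^2 dr
    polar = midpoint_integral(math.cos, -math.pi / 2, math.pi / 2)    # cos(theta_1)
    azimuth = 2 * math.pi                                             # d(theta_2)
    return radial * polar * azimuth

radius = 1.5
print(sphere_volume_polar(radius), 4.0 / 3.0 * math.pi * radius ** 3)
```

The two printed values agree up to the discretization error of the midpoint rule.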

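Volume in $d$ Dimensions

Before deriving a general expression for the hypersphere volume in $d$ dimensions, let us consider the Jacobian in four dimensions. Generalizing the polar coordinates from three dimensions in Figure 6.10 to four dimensions, we obtain

$$\begin{aligned}
x_1 &= r\cos\theta_1\cos\theta_2\cos\theta_3 = r c_1 c_2 c_3 \\
x_2 &= r\cos\theta_1\cos\theta_2\sin\theta_3 = r c_1 c_2 s_3 \\
x_3 &= r\cos\theta_1\sin\theta_2 = r c_1 s_2 \\
x_4 &= r\sin\theta_1 = r s_1
\end{aligned}$$

The Jacobian matrix is given as

$$\mathbf{J}(\theta_1, \theta_2, \theta_3) = \begin{pmatrix}
c_1 c_2 c_3 & -r s_1 c_2 c_3 & -r c_1 s_2 c_3 & -r c_1 c_2 s_3 \\
c_1 c_2 s_3 & -r s_1 c_2 s_3 & -r c_1 s_2 s_3 & r c_1 c_2 c_3 \\
c_1 s_2 & -r s_1 s_2 & r c_1 c_2 & 0 \\
s_1 & r c_1 & 0 & 0
\end{pmatrix}$$

where the columns hold the partial derivatives with respect to $r$, $\theta_1$, $\theta_2$, and $\theta_3$, in that order.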

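Utilizing the Jacobian in three dimensions [Eq. (6.16)], which gives $\det(\mathbf{J}(\theta_2, \theta_3)) = -r^2 c_2$, and expanding along the last row (where the cofactor signs of the entries $s_1$ and $r c_1$ are $-1$ and $+1$, respectively, for a $4 \times 4$ matrix), the Jacobian in four dimensions is given as

$$\begin{aligned}
\det(\mathbf{J}(\theta_1, \theta_2, \theta_3)) &= -s_1(-r s_1)(c_1)(c_1)\det(\mathbf{J}(\theta_2, \theta_3)) + r c_1 (c_1)(c_1)(c_1)\det(\mathbf{J}(\theta_2, \theta_3)) \\
&= -r^3 s_1^2 c_1^2 c_2 - r^3 c_1^4 c_2 = -r^3 c_1^2 c_2 (s_1^2 + c_1^2) = -r^3 c_1^2 c_2
\end{aligned}$$

Jacobian in $d$ Dimensions

By induction, we can obtain the $d$-dimensional Jacobian as follows:

$$\det(\mathbf{J}(\theta_1, \theta_2, \ldots, \theta_{d-1})) = (-1)^{(d-1)(d-2)/2} \, r^{d-1} c_1^{d-2} c_2^{d-3} \cdots c_{d-2}$$

The sign is $+$ for $d = 2$, $-$ for $d = 3, 4$, $+$ for $d = 5, 6$, and so on, because the cofactor signs in the last-row expansion depend on the parity of $d$. Only the absolute value of the Jacobian enters the volume integral, so the sign does not affect the result.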
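The magnitude of the $d$-dimensional Jacobian, which is the quantity that enters the volume integral, can be checked numerically. The sketch below is an illustration under stated assumptions: it extends the coordinate convention of the two-, three-, and four-dimensional cases to general $d$, approximates the Jacobian matrix by central differences, and computes the determinant by Gaussian elimination. The function names and the test angles are hypothetical.

```python
import math

def to_cartesian(r, thetas):
    """Map (r, theta_1..theta_{d-1}) to (x_1..x_d):
    x_d = r s_1, x_{d-1} = r c_1 s_2, ..., x_1 = r c_1 ... c_{d-1}."""
    d = len(thetas) + 1
    x = [0.0] * d
    prefix = r                               # r * c_1 * ... * c_k so far
    for k, theta in enumerate(thetas):
        x[d - 1 - k] = prefix * math.sin(theta)
        prefix *= math.cos(theta)
    x[0] = prefix
    return x

def numeric_jacobian(r, thetas, h=1e-6):
    """Central-difference Jacobian; entry [i][j] is d x_i / d param_j."""
    params = [r] + list(thetas)
    d = len(params)
    columns = []
    for j in range(d):
        plus, minus = params[:], params[:]
        plus[j] += h
        minus[j] -= h
        xp = to_cartesian(plus[0], plus[1:])
        xm = to_cartesian(minus[0], minus[1:])
        columns.append([(a - b) / (2 * h) for a, b in zip(xp, xm)])
    return [[columns[j][i] for j in range(d)] for i in range(d)]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < 1e-15:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result                 # a row swap flips the sign
        result *= m[c][c]
        for i in range(c + 1, n):
            factor = m[i][c] / m[c][c]
            for k in range(c, n):
                m[i][k] -= factor * m[c][k]
    return result

def jacobian_magnitude(r, thetas):
    """Closed form r^(d-1) c_1^(d-2) c_2^(d-3) ... c_(d-2)."""
    d = len(thetas) + 1
    value = r ** (d - 1)
    for i, theta in enumerate(thetas[:-1], start=1):
        value *= math.cos(theta) ** (d - 1 - i)
    return value

angles = [0.3, -0.4, 0.7, 1.1]               # arbitrary test angles (assumption)
for d in range(2, 6):
    thetas = angles[: d - 1]
    numeric = abs(determinant(numeric_jacobian(1.5, thetas)))
    print(d, numeric, jacobian_magnitude(1.5, thetas))
```

For each $d$, the numerically computed magnitude and the closed form should agree up to the finite-difference error.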

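The volume of the hypersphere is given by the $d$-dimensional integral with $r > 0$, $-\pi/2 \le \theta_i \le \pi/2$ for all $i = 1, \ldots, d-2$, and $0 \le \theta_{d-1} \le 2\pi$:

$$\begin{aligned}
\text{vol}(S_d(r)) &= \int_r \int_{\theta_1} \cdots \int_{\theta_{d-1}} \left|\det(\mathbf{J}(\theta_1, \ldots, \theta_{d-1}))\right| \, dr \, d\theta_1 \cdots d\theta_{d-1} \\
&= \int_0^r r^{d-1} \, dr \int_{-\pi/2}^{\pi/2} c_1^{d-2} \, d\theta_1 \cdots \int_{-\pi/2}^{\pi/2} c_{d-2} \, d\theta_{d-2} \int_0^{2\pi} d\theta_{d-1}
\end{aligned}$$

Consider one of the intermediate integrals:

$$\int_{-\pi/2}^{\pi/2} (\cos\theta)^k \, d\theta = 2\int_0^{\pi/2} \cos^k\theta \, d\theta \tag{6.19}$$

Let us substitute $u = \cos^2\theta$, so that $\theta = \cos^{-1}(u^{1/2})$ and

$$\frac{\partial\theta}{\partial u} = -\frac{1}{2} u^{-1/2} (1-u)^{-1/2} \tag{6.20}$$

Substituting Eq. (6.20) in Eq. (6.19), we get the new integral:

$$2\int_0^{\pi/2} \cos^k\theta \, d\theta = \int_0^1 u^{(k-1)/2} (1-u)^{-1/2} \, du = B\!\left(\frac{k+1}{2}, \frac{1}{2}\right) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)\Gamma\!\left(\frac{1}{2}\right)}{\Gamma\!\left(\frac{k}{2}+1\right)}$$

where $B(\alpha, \beta)$ is the beta function, given as

$$B(\alpha, \beta) = \int_0^1 u^{\alpha-1} (1-u)^{\beta-1} \, du$$

and it can be expressed in terms of the gamma function [Eq. (6.6)] via the identity

$$B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$$

Using the fact that $\Gamma(1/2) = \sqrt{\pi}$ and $\Gamma(1) = 1$, and applying this result to each intermediate integral (with $k = d-2, d-3, \ldots, 1$), the product telescopes and we get

$$\begin{aligned}
\text{vol}(S_d(r)) &= \frac{r^d}{d} \cdot \frac{\Gamma\!\left(\frac{d-1}{2}\right)\Gamma\!\left(\frac{1}{2}\right)}{\Gamma\!\left(\frac{d}{2}\right)} \cdot \frac{\Gamma\!\left(\frac{d-2}{2}\right)\Gamma\!\left(\frac{1}{2}\right)}{\Gamma\!\left(\frac{d-1}{2}\right)} \cdots \frac{\Gamma(1)\Gamma\!\left(\frac{1}{2}\right)}{\Gamma\!\left(\frac{3}{2}\right)} \cdot 2\pi \\
&= \frac{2\pi \cdot \pi^{(d-2)/2}}{d \, \Gamma\!\left(\frac{d}{2}\right)} \, r^d = \frac{\pi^{d/2}}{\Gamma\!\left(\frac{d}{2}+1\right)} \, r^d
\end{aligned}$$

which matches the expression in Eq. (6.4).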
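The beta and gamma function identities used in this derivation can be checked numerically. The sketch below assumes that Eq. (6.4) is the closed form $\text{vol}(S_d(r)) = \frac{\pi^{d/2}}{\Gamma(d/2+1)} r^d$, and that the intermediate integrals are of the form $\int_{-\pi/2}^{\pi/2} \cos^k\theta \, d\theta$; the function names are illustrative, and `math.gamma` is the standard-library gamma function.

```python
import math

def beta(a, b):
    """Beta function via the identity B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def cos_power_integral(k, n=5000):
    """Midpoint-rule value of the integral of cos^k(theta) over [-pi/2, pi/2]."""
    h = math.pi / n
    return h * sum(math.cos(-math.pi / 2 + (i + 0.5) * h) ** k for i in range(n))

def hypersphere_volume(d, r=1.0):
    """Closed form pi^(d/2) / Gamma(d/2 + 1) * r^d (assumed to be Eq. (6.4))."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d

# Each intermediate integral should equal B((k + 1) / 2, 1 / 2).
for k in range(1, 6):
    print(k, cos_power_integral(k), beta((k + 1) / 2, 0.5))

# The closed form should reduce to the familiar 2-d and 3-d results.
print(hypersphere_volume(2, 3.0), math.pi * 3.0 ** 2)
print(hypersphere_volume(3, 2.0), 4.0 / 3.0 * math.pi * 2.0 ** 3)
```

Evaluating `hypersphere_volume(d)` for increasing `d` also gives a quick way to explore how the unit hypersphere volume behaves in high dimensions.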

6.8 FURTHER READING

For an introduction to the geometry of $d$-dimensional spaces see Kendall (1961) and also Scott (1992, Section 1.5). The derivation of the mean distance for the multivariate normal is from MacKay (2003, p. 130).

Kendall, M. G. (1961). A Course in the Geometry of n Dimensions. New York: Hafner.

MacKay, D. J. (2003). Information Theory, Inference and Learning Algorithms. New York: Cambridge University Press.

Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. New York: John Wiley & Sons.

6.9 EXERCISES

Q1. Given the gamma function in Eq. (6.6), show the following:

(a) $\Gamma(1) = 1$

(b) $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$

(c) $\Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1)$

Q2. Show that the asymptotic volume of the hypersphere $S_d(r)$ for any value of radius $r$ eventually tends to zero as $d$ increases.

Q3. The ball with center $\mathbf{c} \in \mathbb{R}^d$ and radius $r$ is defined as

$$B_d(\mathbf{c}, r) = \left\{ \mathbf{x} \in \mathbb{R}^d \mid \delta(\mathbf{x}, \mathbf{c}) \le r \right\}$$

where $\delta(\mathbf{x}, \mathbf{c})$ is the distance between $\mathbf{x}$ and $\mathbf{c}$, which can be specified using the $L_p$-norm:

$$L_p(\mathbf{x}, \mathbf{c}) = \left( \sum_{i=1}^{d} |x_i - c_i|^p \right)^{\frac{1}{p}}$$

where $p \ne 0$ is any real number. The distance can also be specified using the $L_\infty$-norm:

$$L_\infty(\mathbf{x}, \mathbf{c}) = \max_i \left\{ |x_i - c_i| \right\}$$

Answer the following questions:

(a) For $d = 2$, sketch the shape of the hyperball inscribed inside the unit square, using the $L_p$-distance with $p = 0.5$ and with center $\mathbf{c} = (0.5, 0.5)^T$.

(b) With $d = 2$ and $\mathbf{c} = (0.5, 0.5)^T$, using the $L_\infty$-norm, sketch the shape of the ball of radius $r = 0.25$ inside a unit square.

(c) Compute the formula for the maximum distance between any two points in the unit hypercube in $d$ dimensions, when using the $L_p$-norm. What is the maximum distance for $p = 0.5$ when $d = 2$? What is the maximum distance for the $L_\infty$-norm?


[Figure: a unit square with corner squares of side $\epsilon$.]

Figure 6.11. For Q4.

Q4. Consider the corner hypercubes of length $\epsilon \le 1$ inside a unit hypercube. The 2-dimensional case is shown in Figure 6.11. Answer the following questions:

(a) Let $\epsilon = 0.1$. What is the fraction of the total volume occupied by the corner cubes in two dimensions?

(b) Derive an expression for the volume occupied by all of the corner hypercubes of length $\epsilon < 1$ as a function of the dimension $d$. What happens to the fraction of the volume in the corners as $d \to \infty$?

(c) What is the fraction of volume occupied by the thin hypercube shell of width $\epsilon < 1$ as a fraction of the total volume of the outer (unit) hypercube, as $d \to \infty$? For example, in two dimensions the thin shell is the space between the outer square (solid) and inner square (dashed).

Q5. Prove Eq. (6.14), that is, $\lim_{d \to \infty} P\left(\mathbf{x}^T\mathbf{x} \le -2\ln(\alpha)\right) \to 0$, for any $\alpha \in (0, 1)$ and $\mathbf{x} \in \mathbb{R}^d$.

Q6. Consider the conceptual view of high-dimensional space shown in Figure 6.4. Derive an expression for the radius of the inscribed circle, so that the area in the spokes accurately reflects the difference between the volume of the hypercube and the inscribed hypersphere in $d$ dimensions. For instance, if the length of a half-diagonal is fixed at 1, then the radius of the inscribed circle is $\frac{1}{\sqrt{2}}$ in Figure 6.4a.

Q7. Consider the unit hypersphere (with radius $r = 1$). Inside the hypersphere inscribe a hypercube (i.e., the largest hypercube you can fit inside the hypersphere). An example in two dimensions is shown in Figure 6.12. Answer the following questions:

Figure 6.12. For Q7.

(a) Derive an expression for the volume of the inscribed hypercube for any given dimensionality $d$. Derive the expression for one, two, and three dimensions, and then generalize to higher dimensions.

(b) What happens to the ratio of the volume of the inscribed hypercube to the volume of the enclosing hypersphere as $d \to \infty$? Again, give the ratio in one, two, and three dimensions, and then generalize.

Q8. Assume that a unit hypercube is given as $[0, 1]^d$, that is, the range is $[0, 1]$ in each dimension. The main diagonal in the hypercube is defined as the vector from $(\mathbf{0}, 0) = (\overbrace{0, \ldots, 0}^{d-1}, 0)$ to $(\mathbf{1}, 1) = (\overbrace{1, \ldots, 1}^{d-1}, 1)$. For example, when $d = 2$, the main diagonal goes from $(0, 0)$ to $(1, 1)$. On the other hand, the main anti-diagonal is defined as the vector from $(\mathbf{1}, 0) = (\overbrace{1, \ldots, 1}^{d-1}, 0)$ to $(\mathbf{0}, 1) = (\overbrace{0, \ldots, 0}^{d-1}, 1)$. For example, for $d = 2$, the anti-diagonal is from $(1, 0)$ to $(0, 1)$.

(a) Sketch the diagonal and anti-diagonal in $d = 3$ dimensions, and compute the angle between them.

(b) What happens to the angle between the main diagonal and anti-diagonal as $d \to \infty$? First compute a general expression for $d$ dimensions, and then take the limit as $d \to \infty$.

Q9. Draw a sketch of a hypersphere in four dimensions.

From the document DATA MINING AND ANALYSIS (pages 187-195)