Mathematical derivation of LocalGaussian computation for Manifold Parzen, and errata

Pascal Vincent

Département d'Informatique et Recherche Opérationnelle
Université de Montréal

P.O. Box 6128, Downtown Branch, Montreal, H3C 3J7, Qc, Canada [email protected]

Technical Report 1259

Département d'Informatique et Recherche Opérationnelle
March 16, 2005

Abstract

The aim of this report is to correct an inconsistency in the mathematical formulas that appeared in our previously published Manifold Parzen article [1], regarding the computation of the density of "oriented" high dimensional Gaussian "pancakes" for which we store only the first $d$ leading eigenvectors and eigenvalues (rather than a full $n \times n$ covariance matrix, or its inverse). We give a detailed derivation leading to the correct formulas.

1 Detailed mathematical derivation of LocalGaussian evaluation

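We consider, in $\mathbb{R}^n$, the multivariate Gaussian density $\mathcal{N}_{\mu,C}$ parameterized by mean vector $\mu \in \mathbb{R}^n$ and $n \times n$ covariance matrix $C$. The density at any point $x \in \mathbb{R}^n$ is given by

\[
\mathcal{N}_{\mu,C}(x) = \frac{1}{\sqrt{(2\pi)^n |C|}} \, e^{-\frac{1}{2}(x-\mu)' C^{-1} (x-\mu)} \tag{1}
\]

where $|C|$ is the determinant of $C$.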

Let $\tilde{x} = x - \mu$.

Let $C = V D V'$ be the eigendecomposition of $C$, where the columns of $V$ are the orthonormal eigenvectors and $D$ is a diagonal matrix with the eigenvalues sorted in decreasing order $(\lambda_1, \ldots, \lambda_n)$.


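We replace $D$ with $\tilde{D}$, a diagonal matrix containing modified eigenvalues $\tilde{\lambda}_{1..n}$ such that all eigenvalues after $\tilde{\lambda}_d$ are given the same value $\sigma^2$, i.e.

\[
\tilde{\lambda}_{1..n} = (\tilde{\lambda}_1, \ldots, \tilde{\lambda}_d, \sigma^2, \ldots, \sigma^2)
\]

Using $\tilde{C} = V \tilde{D} V'$ instead of $C$ in equation 1, we get:

\[
\begin{aligned}
\tilde{\mathcal{N}}_{\mu,C}(x) &= \frac{1}{\sqrt{(2\pi)^n |\tilde{C}|}} \, e^{-\frac{1}{2} \tilde{x}' \tilde{C}^{-1} \tilde{x}} \\
&= e^{-\frac{1}{2} \log((2\pi)^n |\tilde{C}|)} \, e^{-\frac{1}{2} \tilde{x}' \tilde{C}^{-1} \tilde{x}} \\
&= e^{-0.5 \left( n \log(2\pi) + \log(|\tilde{C}|) + \tilde{x}' \tilde{C}^{-1} \tilde{x} \right)}
\end{aligned}
\tag{2}
\]

In other words

\[
\tilde{\mathcal{N}}_{\mu,C}(x) = e^{-0.5 (r + q)} \tag{3}
\]

with

\[
r = n \log(2\pi) + \log(|\tilde{C}|) \tag{4}
\]

and

\[
q = \tilde{x}' \tilde{C}^{-1} \tilde{x} \tag{5}
\]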
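As a quick sanity check of equations 3 to 5, the following minimal Python sketch compares the direct "normalizer times exponential" form with $e^{-0.5(r+q)}$. It assumes a diagonal covariance (so that the determinant and the quadratic form need no eigendecomposition); the function names and the numerical values are illustrative and do not come from the report.

```python
import math

def density_direct(x_centered, variances):
    """Normalizer times exponential, for a diagonal covariance matrix."""
    n = len(variances)
    det = math.prod(variances)  # determinant of a diagonal matrix
    quad = sum(xi * xi / v for xi, v in zip(x_centered, variances))
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** n * det)

def density_via_r_and_q(x_centered, variances):
    """Same density written as exp(-0.5 * (r + q))."""
    n = len(variances)
    r = n * math.log(2.0 * math.pi) + sum(math.log(v) for v in variances)
    q = sum(xi * xi / v for xi, v in zip(x_centered, variances))
    return math.exp(-0.5 * (r + q))

x_centered = [0.3, -1.2, 0.5]  # x - mu (illustrative values)
variances = [2.0, 0.5, 0.1]    # diagonal entries of the covariance
print(density_direct(x_centered, variances))
print(density_via_r_and_q(x_centered, variances))
```

In high dimension the density itself can underflow, so in practice it is often preferable to keep $-0.5(r+q)$ as a log-density and exponentiate only when needed.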

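Moreover, since $V$ is an orthonormal basis, we have $\|V'\tilde{x}\|^2 = \|\tilde{x}\|^2$, i.e.

\[
\sum_{i=1}^{n} (V_i' \tilde{x})^2 = \|\tilde{x}\|^2 \tag{6}
\]

where $V_i' \tilde{x}$ is the usual dot product between the $i$th eigenvector and the centered input $\tilde{x}$.

Proof:

\[
\begin{aligned}
\|V'\tilde{x}\|^2 &= (V'\tilde{x})'(V'\tilde{x}) \\
&= \tilde{x}' V V' \tilde{x} \\
&= \tilde{x}' I \tilde{x} \\
&= \|\tilde{x}\|^2
\end{aligned}
\]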
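Equation 6 can be checked numerically. The sketch below builds an orthogonal matrix as a Householder reflection $H = I - 2uu'/(u'u)$, which is simply a convenient way (chosen here, not taken from the report) to obtain an orthonormal basis without an eigensolver, and compares the sum of squared projections with the squared norm.

```python
import math

def householder(u):
    """Return H = I - 2 u u' / (u'u) as a list of rows; H is orthogonal."""
    n = len(u)
    uu = sum(ui * ui for ui in u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] / uu
             for j in range(n)] for i in range(n)]

def projections(V, x):
    """Dot products V_i' x, where V_i is the i-th column of V."""
    n = len(x)
    return [sum(V[k][i] * x[k] for k in range(n)) for i in range(n)]

V = householder([1.0, 2.0, -1.0, 0.5])  # stand-in for the eigenvector matrix
x_centered = [0.3, -1.2, 0.5, 2.0]      # illustrative centered input
lhs = sum(p * p for p in projections(V, x_centered))
rhs = sum(xi * xi for xi in x_centered)
print(lhs, rhs)
```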

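In addition, having the above eigendecomposition, we have

\[
|\tilde{C}| = \prod_{i=1}^{n} \tilde{\lambda}_i
\]

thus

\[
\begin{aligned}
\log(|\tilde{C}|) &= \sum_{i=1}^{n} \log(\tilde{\lambda}_i) \\
&= (n-d) \log(\sigma^2) + \sum_{i=1}^{d} \log(\tilde{\lambda}_i)
\end{aligned}
\tag{7}
\]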

Replacing 7 in 4, we get

\[
r = n \log(2\pi) + (n-d) \log(\sigma^2) + \sum_{i=1}^{d} \log(\tilde{\lambda}_i) \tag{8}
\]
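A small Python sketch of equation 8, assuming illustrative values for $n$, the leading eigenvalues and $\sigma^2$: it computes $r$ once from all $n$ modified eigenvalues (equation 4 with the determinant written as a product) and once from only the $d$ leading ones, and the two should agree. The function names are hypothetical.

```python
import math

def r_from_all_eigenvalues(lambdas_tilde):
    """r = n log(2 pi) + log|C~|, with log|C~| as a sum over all n eigenvalues."""
    n = len(lambdas_tilde)
    return n * math.log(2.0 * math.pi) + sum(math.log(l) for l in lambdas_tilde)

def r_compact(n, leading_lambdas, sigma2):
    """r from equation 8, using only the d leading eigenvalues and sigma2."""
    d = len(leading_lambdas)
    return (n * math.log(2.0 * math.pi)
            + (n - d) * math.log(sigma2)
            + sum(math.log(l) for l in leading_lambdas))

n, leading_lambdas, sigma2 = 6, [4.0, 1.5], 0.1
all_lambdas = leading_lambdas + [sigma2] * (n - len(leading_lambdas))
print(r_from_all_eigenvalues(all_lambdas))
print(r_compact(n, leading_lambdas, sigma2))
```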

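In addition we have

\[
\begin{aligned}
q &= \tilde{x}' \tilde{C}^{-1} \tilde{x} \\
&= \tilde{x}' (V \tilde{D} V')^{-1} \tilde{x} \\
&= \tilde{x}' V \tilde{D}^{-1} V' \tilde{x} \\
&= \sum_{i=1}^{n} \frac{1}{\tilde{\lambda}_i} (V_i' \tilde{x})^2 \\
&= \left( \sum_{i=1}^{n} \left( \frac{1}{\tilde{\lambda}_i} - \frac{1}{\sigma^2} \right) (V_i' \tilde{x})^2 \right) + \frac{1}{\sigma^2} \sum_{i=1}^{n} (V_i' \tilde{x})^2
\end{aligned}
\]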

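Since $\tilde{\lambda}_i = \sigma^2$ for all $i > d$, $\left( \frac{1}{\tilde{\lambda}_i} - \frac{1}{\sigma^2} \right) = 0$ for $i > d$. As a consequence the first sum can be replaced by a sum from $1$ to $d$ (instead of from $1$ to $n$). Also, from equation 6, the second sum can be replaced by $\|\tilde{x}\|^2$. This yields:

\[
q = \frac{1}{\sigma^2} \|\tilde{x}\|^2 + \sum_{i=1}^{d} \left( \frac{1}{\tilde{\lambda}_i} - \frac{1}{\sigma^2} \right) (V_i' \tilde{x})^2 \tag{9}
\]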

We have thus eliminated the need to store and compute with eigenvectors $V_{d+1}, \ldots, V_n$.
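The saving can be illustrated with a short Python sketch that evaluates $q$ both as the full sum over $n$ eigenvectors and with equation 9, which reads only the first $d$ columns of $V$. The orthonormal basis is again a Householder reflection, an assumption made purely to keep the example self-contained; names and values are illustrative.

```python
import math

def householder(u):
    """Return H = I - 2 u u' / (u'u) as a list of rows; H is orthogonal."""
    n = len(u)
    uu = sum(ui * ui for ui in u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] / uu
             for j in range(n)] for i in range(n)]

def q_full(V, lambdas_tilde, x_centered):
    """q as a sum over all n eigenvectors (columns of V)."""
    n = len(x_centered)
    total = 0.0
    for i in range(n):
        proj = sum(V[k][i] * x_centered[k] for k in range(n))
        total += proj * proj / lambdas_tilde[i]
    return total

def q_compact(V, leading_lambdas, sigma2, x_centered):
    """q from equation 9, reading only the first d columns of V."""
    n = len(x_centered)
    total = sum(xk * xk for xk in x_centered) / sigma2
    for i, lam in enumerate(leading_lambdas):
        proj = sum(V[k][i] * x_centered[k] for k in range(n))
        total += (1.0 / lam - 1.0 / sigma2) * proj * proj
    return total

V = householder([1.0, 2.0, -1.0, 0.5])
leading_lambdas, sigma2 = [4.0, 1.5], 0.1
x_centered = [0.3, -1.2, 0.5, 2.0]
full = q_full(V, leading_lambdas + [sigma2] * 2, x_centered)
compact = q_compact(V, leading_lambdas, sigma2, x_centered)
print(full, compact)
```

Per evaluation, the compact form costs $d$ dot products of length $n$ plus one squared norm, instead of $n$ dot products.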

2 Summing up: the correct formulas

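To sum this all up, we can compute the density as follows:

\[
\tilde{\mathcal{N}}_{\mu,C}(x) = e^{-0.5 (r + q)} \tag{10}
\]

with

\[
r = n \log(2\pi) + (n-d) \log(\sigma^2) + \sum_{i=1}^{d} \log(\tilde{\lambda}_i) \tag{11}
\]

and

\[
q = \frac{1}{\sigma^2} \|\tilde{x}\|^2 + \sum_{i=1}^{d} \left( \frac{1}{\tilde{\lambda}_i} - \frac{1}{\sigma^2} \right) (V_i' \tilde{x})^2 \tag{12}
\]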
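Equations 10 to 12 translate fairly directly into code. The following is a minimal sketch, not the implementation used for the article: the function name, argument layout (eigenvectors stored as a list of $d$ vectors) and the choice to return the log-density $-0.5(r+q)$ rather than the density are all assumptions made for illustration.

```python
import math

def local_gaussian_log_density(x, mu, leading_vectors, leading_lambdas, sigma2):
    """Return -0.5 * (r + q), the log of the density in equation 10.

    leading_vectors: d orthonormal eigenvectors, each a list of n floats.
    leading_lambdas: the d modified eigenvalues (lambda-tilde).
    sigma2: common variance of the n - d remaining directions.
    """
    n, d = len(x), len(leading_lambdas)
    x_centered = [xi - mi for xi, mi in zip(x, mu)]
    # Equation 11
    r = (n * math.log(2.0 * math.pi)
         + (n - d) * math.log(sigma2)
         + sum(math.log(lam) for lam in leading_lambdas))
    # Equation 12
    q = sum(c * c for c in x_centered) / sigma2
    for v, lam in zip(leading_vectors, leading_lambdas):
        proj = sum(vk * ck for vk, ck in zip(v, x_centered))
        q += (1.0 / lam - 1.0 / sigma2) * proj * proj
    return -0.5 * (r + q)

# Axis-aligned example: variance 4 along the first axis, 0.25 elsewhere.
log_p = local_gaussian_log_density(
    x=[1.0, 0.5, -0.5], mu=[0.0, 0.0, 0.0],
    leading_vectors=[[1.0, 0.0, 0.0]], leading_lambdas=[4.0], sigma2=0.25)
print(log_p, math.exp(log_p))
```

With axis-aligned eigenvectors the result should match the sum of three independent one-dimensional Gaussian log-densities, which gives a simple way to test such a function.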


3 Errata for Manifold Parzen article

Typo in the pseudo-code for Mparzen::train: step 4) should be $\lambda_{ij} = \sigma^2 + \frac{s_j^2}{k}$ instead of $\sigma^2 + \frac{s_j^2}{l}$.

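Also, in our initial experiments, we actually considered two possible choices for $\tilde{\lambda}_i$ (for $i \le d$) and $\sigma^2$:

a) $(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_d) = (\lambda_1, \ldots, \lambda_d)$ and $\sigma^2 = \lambda_{d+1}$, which leads to:

\[
r = n \log(2\pi) + (n-d) \log(\sigma^2) + \sum_{i=1}^{d} \log(\lambda_i)
\]

\[
q = \frac{1}{\sigma^2} \|\tilde{x}\|^2 + \sum_{i=1}^{d} \left( \frac{1}{\lambda_i} - \frac{1}{\sigma^2} \right) (V_i' \tilde{x})^2
\]

b) $\sigma^2$ is a user specified value and $(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_d) = (\lambda_1 + \sigma^2, \ldots, \lambda_d + \sigma^2)$, which leads to:

\[
r = n \log(2\pi) + (n-d) \log(\sigma^2) + \sum_{i=1}^{d} \log(\lambda_i + \sigma^2)
\]

\[
q = \frac{1}{\sigma^2} \|\tilde{x}\|^2 + \sum_{i=1}^{d} \left( \frac{1}{\lambda_i + \sigma^2} - \frac{1}{\sigma^2} \right) (V_i' \tilde{x})^2
\]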
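The two scenarios differ only in how the modified eigenvalues and $\sigma^2$ are chosen before the density is evaluated. The sketch below illustrates this, assuming the reading $\sigma^2 = \lambda_{d+1}$ for scenario a) and $\tilde{\lambda}_i = \lambda_i + \sigma^2$ for scenario b); the function names and eigenvalues are hypothetical.

```python
def scenario_a(eigenvalues, d):
    """Keep the first d eigenvalues and set sigma2 to the (d+1)-th one.

    Assumes eigenvalues are sorted in decreasing order and d < len(eigenvalues).
    """
    return list(eigenvalues[:d]), eigenvalues[d]

def scenario_b(eigenvalues, d, sigma2):
    """Add a user-specified sigma2 to each of the first d eigenvalues."""
    return [lam + sigma2 for lam in eigenvalues[:d]], sigma2

eigenvalues = [5.0, 2.0, 0.4, 0.1]  # illustrative, sorted in decreasing order
print(scenario_a(eigenvalues, d=2))
print(scenario_b(eigenvalues, d=2, sigma2=0.05))
```

Either pair (modified eigenvalues, $\sigma^2$) can then be plugged into the formulas for $r$ and $q$; the point of the erratum is that both must come from the same scenario.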

We mentioned only scenario b) in the Manifold Parzen article [1] (due to space constraints). But somehow these two slightly different versions got mixed up in the write-up, leading to the somewhat inconsistent formulas that appear in the article (taking $r$ from b) and $q$ from a)). In addition, we mistakenly wrote $d \log(2\pi)$ instead of $n \log(2\pi)$.

However, after verification, the actual code used to perform the experiments reported in the article (implementing scenario b)) appears correct.

References

[1] Pascal Vincent and Yoshua Bengio. Manifold Parzen windows. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 825–832, Cambridge, MA, 2003. MIT Press.
