Mathematical derivation of LocalGaussian computation for Manifold Parzen, and errata

Pascal Vincent

Département d'Informatique et Recherche Opérationnelle, Université de Montréal

P.O. Box 6128, Downtown Branch, Montreal, H3C 3J7, Qc, Canada
vincentp@iro.umontreal.ca

Technical Report 1259

Département d'Informatique et Recherche Opérationnelle
March 16, 2005

Abstract

The aim of this report is to correct an inconsistency in the mathematical formulas that appeared in our previously published Manifold Parzen article [1], regarding the computation of the density of "oriented" high dimensional Gaussian "pancakes" for which we store only the first d leading eigenvectors and eigenvalues (rather than a full n×n covariance matrix, or its inverse). We give a detailed derivation leading to the correct formulas.

1 Detailed mathematical derivation of LocalGaussian evaluation

We consider, in IR^n, the multivariate Gaussian density N_{μ,C} parameterized by mean vector μ ∈ IR^n and n×n covariance matrix C. The density at any point x ∈ IR^n is given by

N_{μ,C}(x) = 1/√((2π)^n |C|) · e^{−(1/2)(x−μ)' C^{−1} (x−μ)}    (1)

where |C| is the determinant of C.

Let x̃ = x − μ.

Let C = V D V' be the eigendecomposition of C, where the columns of V are the orthonormal eigenvectors and D is a diagonal matrix with the eigenvalues sorted in decreasing order (λ_1, …, λ_n).
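For concreteness, equation (1) can be evaluated numerically through this eigendecomposition. The following is a minimal numpy sketch; the function name `gaussian_density` is ours and is not from the original experiments' code:

```python
import numpy as np

def gaussian_density(x, mu, C):
    """Density N_{mu,C}(x) of equation (1), evaluated through the
    eigendecomposition C = V D V'."""
    n = len(mu)
    x_c = x - mu                       # centered input, x tilde
    eigvals, V = np.linalg.eigh(C)     # columns of V: orthonormal eigenvectors
    # log of the normalization constant: -0.5 * (n log(2 pi) + log|C|)
    log_norm = -0.5 * (n * np.log(2 * np.pi) + np.sum(np.log(eigvals)))
    # quadratic form x' C^{-1} x, expressed in the eigenbasis
    q = np.sum((V.T @ x_c) ** 2 / eigvals)
    return np.exp(log_norm - 0.5 * q)
```

In one dimension with μ = 0 and C = 1 this reduces to the standard normal density, so `gaussian_density` at x = 0 returns 1/√(2π).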


We replace D with D̃, a diagonal matrix containing modified eigenvalues λ̃_1..n such that all eigenvalues after λ̃_d are given the same value σ², i.e.

λ̃_1..n = (λ̃_1, …, λ̃_d, σ², …, σ²)

Using C̃ = V D̃ V' instead of C in equation 1, we get:

Ñ_{μ,C}(x) = 1/√((2π)^n |C̃|) · e^{−(1/2) x̃' C̃^{−1} x̃}
           = e^{−(1/2) log((2π)^n |C̃|)} · e^{−(1/2) x̃' C̃^{−1} x̃}
           = e^{−0.5 (n log(2π) + log(|C̃|) + x̃' C̃^{−1} x̃)}    (2)

In other words,

Ñ_{μ,C}(x) = e^{−0.5 (r + q)}    (3)

with

r = n log(2π) + log(|C̃|)    (4)

and

q = x̃' C̃^{−1} x̃    (5)

Moreover, since V is an orthonormal basis, we have ‖V' x̃‖² = ‖x̃‖², i.e.

Σ_{i=1}^{n} (V_i' x̃)² = ‖x̃‖²    (6)

where V_i' x̃ is the usual dot product between the i-th eigenvector and the centered input x̃.

Proof:

‖V' x̃‖² = (V' x̃)' (V' x̃)
         = x̃' V V' x̃
         = x̃' I x̃
         = ‖x̃‖²
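This norm-preservation property is easy to verify numerically. The following sketch, which is our illustration and not part of the original derivation, builds an orthonormal V via numpy's QR factorization and checks equation (6):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# QR factorization of a random matrix yields orthonormal columns for V
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
x_c = rng.standard_normal(n)          # a centered input, x tilde

# sum_i (V_i' x)^2 equals ||x||^2 because V V' = I
lhs = np.sum((V.T @ x_c) ** 2)
rhs = np.sum(x_c ** 2)
assert np.isclose(lhs, rhs)
```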

In addition, having the above eigendecomposition, we have

|C̃| = Π_{i=1}^{n} λ̃_i

thus

log(|C̃|) = Σ_{i=1}^{n} log(λ̃_i)


log(|C̃|) = (n−d) log(σ²) + Σ_{i=1}^{d} log(λ̃_i)    (7)

Replacing 7 in 4, we get

r = n log(2π) + (n−d) log(σ²) + Σ_{i=1}^{d} log(λ̃_i)    (8)

In addition we have

q = x̃' C̃^{−1} x̃
  = x̃' (V D̃ V')^{−1} x̃
  = x̃' V D̃^{−1} V' x̃
  = Σ_{i=1}^{n} (1/λ̃_i) (V_i' x̃)²
  = ( Σ_{i=1}^{n} (1/λ̃_i − 1/σ²) (V_i' x̃)² ) + (1/σ²) Σ_{i=1}^{n} (V_i' x̃)²

Since λ̃_i = σ² for all i > d, we have (1/λ̃_i − 1/σ²) = 0 for i > d. As a consequence the first sum can be replaced by a sum from 1 to d (instead of from 1 to n). Also, from equation 6, the second sum can be replaced by ‖x̃‖². This yields:

q = (1/σ²) ‖x̃‖² + Σ_{i=1}^{d} (1/λ̃_i − 1/σ²) (V_i' x̃)²    (9)

We have thus eliminated the need to store and compute with eigenvectors V_{d+1}, …, V_n.
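As an illustration of this saving, the following numpy sketch computes q via equation (9) from only the d leading eigenvectors and checks it against the full quadratic form x̃' C̃^{−1} x̃. The names are ours, and the check keeps the d leading eigenvalues unmodified (λ̃_i = λ_i for i ≤ d):

```python
import numpy as np

def q_local(x_c, V_d, lam_d, sigma2):
    """Quadratic form of equation (9): uses only the d leading
    eigenvectors (columns of V_d) and eigenvalues lam_d."""
    proj = V_d.T @ x_c                 # dot products V_i' x, i = 1..d
    return (np.sum(x_c ** 2) / sigma2
            + np.sum((1.0 / lam_d - 1.0 / sigma2) * proj ** 2))
```

Checking it against x̃' C̃^{−1} x̃ built from the full modified spectrum confirms the two agree.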

2 Summing up: the correct formulas

To sum this all up, we can compute the density as follows:

Ñ_{μ,C}(x) = e^{−0.5 (r + q)}    (10)

with

r = n log(2π) + (n−d) log(σ²) + Σ_{i=1}^{d} log(λ̃_i)    (11)

and

q = (1/σ²) ‖x̃‖² + Σ_{i=1}^{d} (1/λ̃_i − 1/σ²) (V_i' x̃)²    (12)
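The summary formulas above translate directly into code. The following numpy sketch (our own naming, not the original implementation) returns the log-density −0.5(r + q) of equations (10)-(12):

```python
import numpy as np

def local_gaussian_log_density(x, mu, V_d, lam_d, sigma2):
    """log of equation (10), i.e. -0.5 * (r + q), with r from (11)
    and q from (12). Only the d leading eigenvectors (columns of V_d)
    and eigenvalues lam_d are needed; all trailing eigenvalues of the
    modified covariance are sigma2."""
    n = len(mu)
    d = len(lam_d)
    x_c = x - mu                       # centered input, x tilde
    proj = V_d.T @ x_c                 # dot products V_i' x, i = 1..d
    r = (n * np.log(2 * np.pi)
         + (n - d) * np.log(sigma2)
         + np.sum(np.log(lam_d)))
    q = (np.sum(x_c ** 2) / sigma2
         + np.sum((1.0 / lam_d - 1.0 / sigma2) * proj ** 2))
    return -0.5 * (r + q)
```

Working in the log domain avoids underflow for the very small densities that arise in high dimension; equation (10) is recovered by exponentiating the result.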


3 Errata for Manifold Parzen article

There is a typo in the pseudo-code for MParzen::train: step 4) should be λ_{ij} = σ² + s²_j / k instead of σ² + s²_j / l.

Also, in our initial experiments, we actually considered two possible choices for λ̃_i (for i ≤ d) and σ²:

a) (λ̃_1, …, λ̃_d) = (λ_1, …, λ_d) and σ² = λ_{d+1}, which leads to:

r = n log(2π) + (n−d) log(σ²) + Σ_{i=1}^{d} log(λ_i)

q = (1/σ²) ‖x̃‖² + Σ_{i=1}^{d} (1/λ_i − 1/σ²) (V_i' x̃)²

b) σ² is a user-specified value and (λ̃_1, …, λ̃_d) = (λ_1 + σ², …, λ_d + σ²), which leads to:

r = n log(2π) + (n−d) log(σ²) + Σ_{i=1}^{d} log(λ_i + σ²)

q = (1/σ²) ‖x̃‖² + Σ_{i=1}^{d} (1/(λ_i + σ²) − 1/σ²) (V_i' x̃)²

We mentioned only scenario b) in the Manifold Parzen article [1] (due to space constraints). But somehow these two slightly different versions got mixed up in the write-up, leading to the somewhat inconsistent formulas that appear in the article (taking r from b) and q from a)). In addition, we mistakenly wrote d log(2π) instead of n log(2π).

However, after verification, the actual code used to perform the experiments reported in the article (implementing scenario b)) appears correct.
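The two eigenvalue choices above can be sketched as small helper functions; the names and signatures here are ours, for illustration only:

```python
import numpy as np

def modified_eigvals_a(eigvals, d):
    """Scenario a): keep the d leading eigenvalues unchanged and set
    sigma^2 to the (d+1)-th eigenvalue. `eigvals` must be sorted in
    decreasing order."""
    sigma2 = eigvals[d]                # lambda_{d+1} (0-based index d)
    return eigvals[:d].copy(), sigma2

def modified_eigvals_b(eigvals, d, sigma2):
    """Scenario b): sigma^2 is user-specified and is added to each of
    the d leading eigenvalues."""
    return eigvals[:d] + sigma2, sigma2
```

Either pair (λ̃_1..d, σ²) can then be plugged directly into equations (11) and (12).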

References

[1] Pascal Vincent and Yoshua Bengio. Manifold Parzen windows. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 825-832, Cambridge, MA, 2003. MIT Press.
