Mathematical derivation of LocalGaussian computation for Manifold Parzen, and errata
Pascal Vincent
Département d’Informatique et Recherche Opérationnelle, Université de Montréal
P.O. Box 6128, Downtown Branch, Montreal, H3C 3J7, Qc, Canada vincentp@iro.umontreal.ca
Technical Report 1259
Département d’Informatique et Recherche Opérationnelle, March 16, 2005
Abstract
The aim of this report is to correct an inconsistency in the mathematical formulas that appeared in our previously published Manifold Parzen article [1], regarding the computation of the density of “oriented” high-dimensional Gaussian “pancakes” for which we store only the $d$ leading eigenvectors and eigenvalues (rather than a full $n \times n$ covariance matrix, or its inverse). We give a detailed derivation leading to the correct formulas.
1 Detailed mathematical derivation of LocalGaussian evaluation
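We consider, in $\mathbb{R}^n$, the multivariate Gaussian density $\mathcal{N}_{\mu,C}$ parameterized by mean vector $\mu \in \mathbb{R}^n$ and $n \times n$ covariance matrix $C$. The density at any point $x \in \mathbb{R}^n$ is given by

$$\mathcal{N}_{\mu,C}(x) = \frac{1}{\sqrt{(2\pi)^n |C|}} \, e^{-\frac{1}{2}(x-\mu)' C^{-1} (x-\mu)} \tag{1}$$

where $|C|$ is the determinant of $C$.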
Let $\tilde{x} = x - \mu$.
Let $C = V D V'$ be the eigendecomposition of $C$, where the columns of $V$ are the orthonormal eigenvectors and $D$ is a diagonal matrix with the eigenvalues sorted in decreasing order $(\lambda_1, \ldots, \lambda_n)$.
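We replace $D$ with $\tilde{D}$, which is a diagonal matrix containing modified eigenvalues $\tilde{\lambda}_{1..n}$ such that all eigenvalues after $\tilde{\lambda}_d$ are given the same value $\sigma^2$, i.e.

$$\tilde{\lambda}_{1..n} = (\tilde{\lambda}_1, \ldots, \tilde{\lambda}_d, \sigma^2, \ldots, \sigma^2)$$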
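Using $\tilde{C} = V \tilde{D} V'$ instead of $C$ in equation 1, we get:

$$\begin{aligned}
\tilde{\mathcal{N}}_{\mu,C}(x) &= \frac{1}{\sqrt{(2\pi)^n |\tilde{C}|}} \, e^{-\frac{1}{2}\tilde{x}' \tilde{C}^{-1} \tilde{x}} \\
&= e^{-\frac{1}{2}\log\left((2\pi)^n |\tilde{C}|\right)} \, e^{-\frac{1}{2}\tilde{x}' \tilde{C}^{-1} \tilde{x}} \\
&= e^{-0.5\left(n\log(2\pi) + \log(|\tilde{C}|) + \tilde{x}' \tilde{C}^{-1} \tilde{x}\right)}
\end{aligned} \tag{2}$$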
In other words

$$\tilde{\mathcal{N}}_{\mu,C}(x) = e^{-0.5(r+q)} \tag{3}$$

with

$$r = n\log(2\pi) + \log(|\tilde{C}|) \tag{4}$$

and

$$q = \tilde{x}' \tilde{C}^{-1} \tilde{x} \tag{5}$$
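Moreover, since $V$ is an orthonormal basis, we have

$$\begin{aligned}
\|V'\tilde{x}\|^2 &= \|\tilde{x}\|^2 \\
\sum_{i=1}^{n} (V_i'\tilde{x})^2 &= \|\tilde{x}\|^2
\end{aligned} \tag{6}$$

where $V_i'\tilde{x}$ is the usual dot product between the $i$th eigenvector and the centered input $\tilde{x}$.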
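Proof:

$$\begin{aligned}
\|V'\tilde{x}\|^2 &= (V'\tilde{x})'(V'\tilde{x}) \\
&= \tilde{x}' V V' \tilde{x} \\
&= \tilde{x}' I \tilde{x} \\
&= \|\tilde{x}\|^2
\end{aligned}$$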
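As a quick numerical sanity check of identity (6), the following minimal sketch compares the two sides for a small example. The rotated 2-D basis and the input vector are hypothetical values chosen only for illustration.

```python
import math

def project(basis, x):
    """Coordinates of x in an orthonormal basis given as a list of vectors (V'x)."""
    return [sum(b * xi for b, xi in zip(vec, x)) for vec in basis]

# Hypothetical example: an orthonormal basis obtained by rotating the axes.
theta = 0.7
V = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]
x_centered = [3.0, -1.5]

lhs = sum(p * p for p in project(V, x_centered))  # sum_i (V_i' x)^2
rhs = sum(c * c for c in x_centered)              # ||x||^2
print(abs(lhs - rhs) < 1e-12)
```

The two quantities agree up to floating-point rounding for any orthonormal basis, which is what allows the sum over all $n$ projections to be replaced by a squared norm.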
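In addition, having the above eigendecomposition, we have

$$|\tilde{C}| = \prod_{i=1}^{n} \tilde{\lambda}_i$$

thus

$$\begin{aligned}
\log(|\tilde{C}|) &= \sum_{i=1}^{n} \log(\tilde{\lambda}_i) \\
&= (n-d)\log(\sigma^2) + \sum_{i=1}^{d} \log(\tilde{\lambda}_i)
\end{aligned} \tag{7}$$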
Substituting equation 7 into equation 4, we get

$$r = n\log(2\pi) + (n-d)\log(\sigma^2) + \sum_{i=1}^{d} \log(\tilde{\lambda}_i) \tag{8}$$
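In addition we have

$$\begin{aligned}
q &= \tilde{x}' \tilde{C}^{-1} \tilde{x} \\
&= \tilde{x}' (V \tilde{D} V')^{-1} \tilde{x} \\
&= \tilde{x}' V \tilde{D}^{-1} V' \tilde{x} \\
&= \sum_{i=1}^{n} \frac{1}{\tilde{\lambda}_i} (V_i'\tilde{x})^2 \\
&= \left( \sum_{i=1}^{n} \left( \frac{1}{\tilde{\lambda}_i} - \frac{1}{\sigma^2} \right) (V_i'\tilde{x})^2 \right) + \frac{1}{\sigma^2} \sum_{i=1}^{n} (V_i'\tilde{x})^2
\end{aligned}$$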
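Since $\tilde{\lambda}_i = \sigma^2$ for all $i > d$, $\left( \frac{1}{\tilde{\lambda}_i} - \frac{1}{\sigma^2} \right) = 0$ for $i > d$. As a consequence, the first sum can be replaced by a sum from $1$ to $d$ (instead of from $1$ to $n$). Also, from equation 6, the second sum can be replaced by $\|\tilde{x}\|^2$. This yields:

$$q = \frac{1}{\sigma^2} \|\tilde{x}\|^2 + \sum_{i=1}^{d} \left( \frac{1}{\tilde{\lambda}_i} - \frac{1}{\sigma^2} \right) (V_i'\tilde{x})^2 \tag{9}$$

We have thus eliminated the need to store and compute with eigenvectors $V_{d+1}, \ldots, V_n$.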
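To illustrate that equation 9 gives the same value as the full sum over all $n$ eigen-directions, here is a minimal sketch in plain Python. The function names `q_full` and `q_truncated`, the 3-D basis, and the numerical values are all assumptions made for this example, not part of the original code.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def q_full(xc, V, lam_tilde):
    """q as a sum over all n eigen-directions: sum_i (V_i' x)^2 / lambda_i."""
    return sum(dot(v, xc) ** 2 / lam for v, lam in zip(V, lam_tilde))

def q_truncated(xc, V_d, lam_d, sigma2):
    """q as in equation 9: uses only the d leading eigenvectors."""
    q = dot(xc, xc) / sigma2
    for v, lam in zip(V_d, lam_d):
        q += (1.0 / lam - 1.0 / sigma2) * dot(v, xc) ** 2
    return q

# Hypothetical example in n = 3 dimensions with d = 1.
c, s = math.cos(0.4), math.sin(0.4)
V = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]  # orthonormal eigenvectors
sigma2, d = 0.1, 1
lam_tilde = [5.0, sigma2, sigma2]                 # modified eigenvalues
xc = [1.0, -2.0, 0.5]                             # centered input

print(q_full(xc, V, lam_tilde), q_truncated(xc, V[:d], lam_tilde[:d], sigma2))
```

Only `V[:d]` and `lam_tilde[:d]` are passed to `q_truncated`, so the storage and computation cost scales with $d$ rather than $n$.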
2 Summing up: the correct formulas
To sum this all up, we can compute the density as follows:
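$$\tilde{\mathcal{N}}_{\mu,C}(x) = e^{-0.5(r+q)} \tag{10}$$

with

$$r = n\log(2\pi) + (n-d)\log(\sigma^2) + \sum_{i=1}^{d} \log(\tilde{\lambda}_i) \tag{11}$$

and

$$q = \frac{1}{\sigma^2} \|\tilde{x}\|^2 + \sum_{i=1}^{d} \left( \frac{1}{\tilde{\lambda}_i} - \frac{1}{\sigma^2} \right) (V_i'\tilde{x})^2 \tag{12}$$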
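The formulas (10) to (12) can be sketched in code as follows. This is a minimal illustration in plain Python, not the code used for the experiments in the article; the function name `local_gaussian_log_density` and its argument layout are assumptions. It returns the log-density $-0.5(r+q)$, which is usually preferable numerically in high dimension, where the density itself can underflow.

```python
import math

def local_gaussian_log_density(x, mu, eigvecs, eigvals, sigma2):
    """Log of the density in (10), using only the d leading eigenpairs.

    eigvecs: list of d orthonormal vectors of length n (the V_i).
    eigvals: list of d modified eigenvalues (the lambda-tilde_i).
    sigma2:  common variance given to the n - d remaining directions.
    """
    n, d = len(x), len(eigvals)
    xc = [xi - mi for xi, mi in zip(x, mu)]           # centered input x - mu
    # r, equation (11)
    r = n * math.log(2 * math.pi) + (n - d) * math.log(sigma2)
    r += sum(math.log(lam) for lam in eigvals)
    # q, equation (12)
    q = sum(c * c for c in xc) / sigma2
    for v, lam in zip(eigvecs, eigvals):
        proj = sum(vi * ci for vi, ci in zip(v, xc))  # V_i' x
        q += (1.0 / lam - 1.0 / sigma2) * proj * proj
    return -0.5 * (r + q)

# Hypothetical usage: n = 2, d = 1, leading direction along the first axis.
log_p = local_gaussian_log_density([1.0, 2.0], [0.0, 0.0], [[1.0, 0.0]], [4.0], 0.25)
print(log_p, math.exp(log_p))
```

In this axis-aligned example the result can be checked by hand, since the density factorizes into two one-dimensional Gaussians with variances 4 and 0.25.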
3 Errata for Manifold Parzen article
Typo in the pseudo-code for Mparzen::train: step 4) should be $\lambda_{ij} = \sigma^2 + \frac{s_j^2}{k}$ instead of $\sigma^2 + \frac{s_j^2}{l}$.
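Also, in our initial experiments, we actually considered two possible choices for $\tilde{\lambda}_i$ (for $i \le d$) and $\sigma^2$:

a) $(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_d) = (\lambda_1, \ldots, \lambda_d)$ and $\sigma^2 = \lambda_{d+1}$, which leads to:

$$\begin{aligned}
r &= n\log(2\pi) + (n-d)\log(\sigma^2) + \sum_{i=1}^{d} \log(\lambda_i) \\
q &= \frac{1}{\sigma^2} \|\tilde{x}\|^2 + \sum_{i=1}^{d} \left( \frac{1}{\lambda_i} - \frac{1}{\sigma^2} \right) (V_i'\tilde{x})^2
\end{aligned}$$

b) $\sigma^2$ is a user-specified value and $(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_d) = (\lambda_1 + \sigma^2, \ldots, \lambda_d + \sigma^2)$, which leads to:

$$\begin{aligned}
r &= n\log(2\pi) + (n-d)\log(\sigma^2) + \sum_{i=1}^{d} \log(\lambda_i + \sigma^2) \\
q &= \frac{1}{\sigma^2} \|\tilde{x}\|^2 + \sum_{i=1}^{d} \left( \frac{1}{\lambda_i + \sigma^2} - \frac{1}{\sigma^2} \right) (V_i'\tilde{x})^2
\end{aligned}$$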
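The two scenarios differ only in how the modified eigenvalues and $\sigma^2$ are obtained from the original eigenvalues. The sketch below makes this explicit; the function names `scenario_a` and `scenario_b` and the example eigenvalues are hypothetical, and the eigenvalues are assumed to be sorted in decreasing order.

```python
def scenario_a(eigvals, d):
    """a) keep the d leading eigenvalues; sigma^2 is the (d+1)-th eigenvalue.

    Requires d < len(eigvals). With 0-based indexing, eigvals[d] is lambda_{d+1}.
    """
    return list(eigvals[:d]), eigvals[d]

def scenario_b(eigvals, d, sigma2):
    """b) sigma^2 is user specified and is added to the d leading eigenvalues."""
    return [lam + sigma2 for lam in eigvals[:d]], sigma2

# Hypothetical eigenvalues, sorted in decreasing order, with d = 2.
eigvals = [4.0, 2.0, 1.0]
print(scenario_a(eigvals, 2))       # ([4.0, 2.0], 1.0)
print(scenario_b(eigvals, 2, 0.5))  # ([4.5, 2.5], 0.5)
```

Whichever pair is chosen, the same values must then be used in both $r$ and $q$.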
We mentioned only scenario b) in the Manifold Parzen article [1] (due to space constraints). But somehow these two slightly different versions got mixed up in the write-up, leading to the somewhat inconsistent formulas that appear in the article (taking $r$ from b) and $q$ from a)). In addition, we mistakenly wrote $d\log(2\pi)$ instead of $n\log(2\pi)$.
However, after verification, the actual code used to perform the experiments reported in the article (implementing scenario b)) appears correct.
References
[1] Pascal Vincent and Yoshua Bengio. Manifold Parzen windows. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 825–832, Cambridge, MA, 2003. MIT Press.