# Upper and Lower Bounds on the Performance of Kernel PCA

(1)

(2)

1

m

d

i

j

i,j

## arXiv:2012.10369v1 [cs.LG] 18 Dec 2020

(3)

1

### Outline. We introduce our notation and recall existing theoretical results on kernel PCA in Section 2. Section 3 contains two new PAC bounds for kernel PCA, and Section 4 is devoted to two new PAC-Bayes bounds, along with our proofs. The paper closes with a brief illustration of the numerical value of our bounds (Section 5) and concluding remarks (Section 6). We gather proofs of technical results in Section 7.

1. We refer the reader toHein and Bousquet(2004) orHofmann et al.(2005) for an introduction to RKHS and their uses in machine learning.

(4)

m×n

d

1

m

m

1

m

m

1

1

m1

m

i=1

Xi

m

m

ν

X∼ν

X

1

1/2

V

v

span{v}

0

1

π

f∼π

F

1

1

2

2

1

1

2

1

n

1

n

n

i,j=1

i

j

i

j

1

n

T

n

2

2

d

2

X

x

x

x

x

x

1

2

1

2

H

(5)

i

i

i

j

j

j

V

i,j

i

j

i

j

1

m

m

m×m

i

j

i,j

i

j

i,j

(6)

1

m

1

V

2

2

V

i,j

i

j

i

j

H

2

ν

X

0

0

0

µ

i

i

µˆ

1

2

1

m

1

2

i

i

i

i

i

ˆλmi

i

i≥1

≤k

k

i=1

i

>k

m

i=k+1

i

1

m

m×m

m

i=1

i

i

0

0

(7)

1

m

ν

ν

ν

X

v

X

ν

ν

µ

µˆ

k

k

k

i

i

i

ν

i

k

k

ν

ν

uk

2

dim(V)=k

06=v∈V

ν

v

2

≤k

ν

dim(V)=k

ν

V

2

>k

ν

dim(V)=k

ν

V

2

µ

µˆ

µˆ

Vˆ

k

2

m

i=1

Vˆ

k

i

2

k

i=1

i

µ

Vk

2

k

i=1

i

(8)

k

>k

i>k

i

k

>k

µ

ˆ

Vk

2

1≤`≤k

>`

m

i=1

i

i

2

2

k

≤k

µ

Vˆ

k

2

1≤`≤k

≤`

m

i=1

i

i

2

2

k

k

µ

Vˆ

k

2

µ

ˆ

Vk

2

≤k

>k

>k

≤k

(9)

1

2

1

1

m

2

m+1

2m

k

1

µ

Vˆ

k(S1)

2

m

i=1

Vˆ

k(S1)

m+i

2

2

k

1

1

1

1

2

1

1

m

2

m+1

2m

µ

Vˆ

k (S1)

2

m

i=1

Vˆ

k (S1)

m+i

2

2

k

1

k

1

1

1

1

n

n

1

i−1

i+1

n

xi,ˆxi

1

i−1

i

i+1

n

1

i−1

i

i+1

n

i

(10)

1

n

1

n

2

n

i=1

2i

1

2

1

k

1

1

1

µ

Vˆ

k(S1)

2

µ

Vˆ

k(S1)

2

µ

Vˆ

k(S1)

2

m

i=1

Vˆ

k(S1)

m+i

2

m

i=1

Vˆ

k(S1)

m+i

2

1

2

S2

m

i=1

Vˆ

k(S1)

m+i

2

µ

Vˆ

k(S1)

2

µ

Vˆ

k(S1)

2

m

i=1

Vˆ

k(S1)

m+i

2

2

Vˆ

k(S1)

k(S1)

(11)

µ

Vˆ

k

2

≤k

α

4

1−α

0

12

2 log(m)1

2 log(1/δ)

R4

µ

Vˆ k

2

>k

α

4

1−α

0

12

2 log(m)1

2 log(1/δ)

R4

d

m

+

2

i

m−1

i

(i)

(12)

m

i=1

i

(i)

(i)

1

i−1

i+1

m

+

−1+

+

1

m

1

m

−1+

s(Z−E[Z])

2

+

+

1

m

F

z∼µ

s

m1

m

i=1

i

d

m

m

F

F

1

m

m

m

m

m

m

s

f∼Qs

s

s

f∼Qs

s

1

m

S

S

S

m

(13)

2

0

m

S∼µm

f∼Q0S

S

m

m

S

S

S

0S

H

NH

NH

2

Vk

2

N

2

H

Vˆ

2

N

2

H

2

2

0

Vˆ

2

0

0

NH×k

NH×NH

0

N

2

H

Vˆ

2

NH

i,j=1

i,j

i

j

NH

i,j=1

i,j

i,j

(14)

i

i

j

i,j=1..NH

N

2

H

i

i

2

NH

i,j=1

2i,j

0

2

i∈I

i

0i

j∈I

j

0j

i,j∈I

0i

j

2

2

2

2

NH

i=1

i

i

2

NH

i,j=1

i

i

j

j

NH

i,j=1

i

j

i

j

Vk

2

X

N

2

H

NH

k

w

NH

w

V

2

H

k

w

NH

w

V

2

w

(15)

2

w

2

w

NH

w

w

2

V

2

>k

>k

dim(V)=k

µ

V

2

µ

V

k

2

k

k

m

Fk

k

wˆk(s)

k

wˆ

k(s)

Fk

k

k

NH2

wˆk

Vˆ

k(s)

2

k

k

NH2

wˆ

k

Vˆ

k (s)

2

k

k

k

k

(16)

µ

Vˆ

k

2

k

i=1

i

α

4

1−α

0

12

2 log(m)1

2 log(1/δ) R4

α

k

α

Sk

Sk

S

Sk

Sk

S∼µm

f∼PSk

α

S

Sk

wˆ

k(S)

Vˆ

k (S)

2

Sk

wˆ

k(S)

kS

S

S

wˆ

k(S)

α

wˆ

k(S)

S

wˆ k(S)

α

µ

2

Vˆ

k (S)

2

µˆ

2

Vˆ

k (S)

2

2

Vˆ

k (S)

2

Vˆ

k(S)

2

µ

Vˆ

k(S)

2

µˆ

Vˆ

k(S)

2

α

µˆ

Vˆ

k (S)

2

m1

k

i=1

i

µ

Vˆ

k(S)

2

k

i=1

i

α

S∼µm

f∼Pk

S

α

S

S∼µm

α

wˆ

k

S

wˆ

k

S∼µm

α

µ

Vˆ

k(S)

2

µˆ

Vˆ

k(S)

2

(17)

µ

Vˆ

k(S)

2

dim(V)=k

µ

V

2

µ

Vk(S)

2

µˆ

Vˆ

k(S)

2

dim(V)=k

µˆ

V

2

µˆ

Vk

2

µ

Vˆ

k(S)

2

µˆ

Vˆ

k(S)

2

µ

Vk

2

µˆ

Vk

2

Vˆ

k(S)

Vk

m1

m

i=1

i

i

Vk

i

2

i

µ

Vk

2

µˆ

Vk

2

S

µ

Vˆ

k(S)

2

S

µˆ

Vˆ

k(S)

2

S∼µm

α

S

m

2

m

i=1

2

Vk

i

2

1

m

m

1

m

S

Rm2

S

i

(i)

2

j6=i

2

Vk

i

2

(i)

1

i−1

i+1

m

m−1

m

Vk

i

2

2

i

(i)

2

Vk

i

2

2

m

m

i=1

i

(i)

m

i=1

2

Vk

i

2

2

(18)

m

+

S

s(Z−ES[Z])

1

3

S

2m3

2

S

mα(ES[Y]−Y)

S

R

2

m1−α(Z−ES[Z])

1

3

S

2m3

4

2−2α

4

1−2α

S

α

4

1−α

R,δ

log(1/δ)mα

mR1−α4

µ

Vˆ k

2

m

i=k+1

i

α

4

1−α

0

12

2 log(m)1

2 log(1/δ) R4

k

α

k

α

kS

S

kS

kS

kS

S∼µm

f∼Qk

S

α

S

kS

wˆk(S)

Vˆ

k(S)

2

kS

wˆk(S)

kS

S

S

wˆk(S)

α

S

wˆk(S)

wˆk(S)

(19)

α

µˆ

2

Vˆ

k(S)

2

µ

2

Vˆ

k(S)

2

2

Vˆ

k(S)

2

Vˆ

k (S)

2

µ

Vˆ

k (S)

2

µˆ

Vˆ

k (S)

2

α

µˆ

Vˆ

k(S)

2

m1

m

i=k+1

i

x∼µ

Vˆ

k (S)

2

m

i=k+1

i

α

S∼µm

f∼QkS

α

S

S∼µm

α

S

k

k

S∼µm

α

µˆ

Vˆ

k (S)

2

µ

Vˆ

k (S)

2

µ

Vˆ

k (S)

2

dim(V)=k

µ

V

2

µ

V

k (S)

2

µˆ

Vˆ

k (S)

2

dim(V)=k

µˆ

V

2

µˆ

V

k

2

µˆ

ˆ

Vk(S)

2

µ

ˆ

Vk(S)

2

µˆ

V

k

2

µ

V

k

2

k(S)

Vk

m1

m

i=1

i

i

V

k

i

2

i

µˆ

V

k

2

µ

V

k

2

S

µ

Vˆ

k (S)

2

S

µˆ

Vˆ

k (S)

2

(20)

S∼µm

α

S

m

2

m

i=1

V

k

i

2

1

m

m

1

m

Y−S

Rm2

S

i

(i)

2

j6=i

Vk

i

2

(i)

1

i−1

i+1

m

m−1

m

V

k

i

2

2

i

(i)

V

k

i

2

2

m

m

i=1

i

(i)

m

i=1

V

k

i

2

2

m

+

S

s(Z−ES[Z])

1

3

S

2m3

2

S

mα(ES[Y]−Y)

S

R

2

m1−α(Z−ES[Z])

1

3

S

2m3

4

2−2α

4

1−2α

S

α

4

1−α

R,δ

log(1/δ)mα

mR1−α4

Updating...