# University of Liège, Faculty of Applied Sciences, Department of Electrical Engineering & Computer Science. PhD dissertation


## 2


By definition, both the input and the output spaces are assumed to contain, respectively, all possible input vectors and all possible output values. Note that input variables are sometimes known as features or attributes.

1 Unless stated otherwise, supervised learning is reduced to the prediction of a single output variable. More generally, however, this framework can be defined as the prediction of one or several output variables.

Formally, a learning set is a set of N input–output pairs

L = {(x1, y1), ..., (xN, yN)},

where each input vector xi = (xi1, ..., xip) ∈ X is described by p input variables and each output value yi belongs to the output space Y. In classification, Y is a finite set of classes {c1, c2, ..., cJ}; in regression, Y is a subset of the reals.

The goal of supervised learning is to find a model ϕL whose generalization error

Err(ϕL) = E_{X,Y}{ L(Y, ϕL(X)) } ( 2 . 3 )

is as small as possible, where L is a loss function measuring the discrepancy between the true output and the prediction. Typical choices are the zero-one loss L(Y, ϕL(X)) = 1(Y ≠ ϕL(X)) in classification and the squared error loss L(Y, ϕL(X)) = (Y − ϕL(X))² in regression.

2 Unless it is clear from the context, models are now denoted ϕL to emphasize that they are built from the learning set L.

3 EX{f(X)} denotes the expected value of f(x) (for x ∈ X) with respect to the probability distribution of the random variable X. It is defined as

EX{f(X)} = Σ_{x∈X} P(X = x) f(x).

4 1(condition) denotes the unit function. It is defined as

1(condition) = 1 if condition is true, 0 if condition is false.

Since the joint distribution P(X, Y) is unknown, the generalization error of ϕL cannot be computed exactly and must be estimated from data. Given a set L0 of N0 samples, the empirical estimate is the average loss

Êrr(ϕL; L0) = (1/N0) Σ_{(xi,yi)∈L0} L(yi, ϕL(xi)).
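As a minimal sketch of this estimate, the snippet below averages the zero-one loss of a fixed model over a small sample set. The threshold classifier and the data points are invented for illustration only.

```python
# Sketch: empirical estimate of the generalization error on a sample set.
# The model `phi` and the samples are toy stand-ins, not from the text.

def zero_one_loss(y, y_pred):
    # 1(y != y_pred): the unit function applied to a misclassification
    return 0 if y == y_pred else 1

def empirical_error(phi, samples, loss):
    # average loss of phi over the samples (x_i, y_i)
    return sum(loss(y, phi(x)) for x, y in samples) / len(samples)

phi = lambda x: 1 if x >= 0 else 0          # a fixed threshold classifier
samples = [(-2, 0), (-1, 0), (0.5, 1), (3, 1), (-0.5, 1)]
err = empirical_error(phi, samples, zero_one_loss)
print(err)  # 1 mistake out of 5 -> 0.2
```

Swapping in a squared-error loss would turn the same helper into a regression estimate.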

If the same data are used both to learn ϕL and to evaluate it, the resulting training-set estimate Êrr(ϕL; L) is optimistically biased. The standard remedy is to partition L into two disjoint sets Ltrain and Ltest, to learn ϕLtrain from Ltrain only, and to report the test-set estimate

Êrr(ϕLtrain; Ltest),

which is an unbiased estimate of the generalization error of ϕLtrain, since no sample of Ltest was used during learning.

When data are scarce, K-fold cross-validation makes better use of L: the learning set is partitioned into K disjoint folds L1, ..., LK, a model ϕL\Lk is learned on L \ Lk for each k, and the estimates on the held-out folds are averaged,

Êrr_CV = (1/K) Σ_{k=1}^{K} Êrr(ϕ_{L\Lk}; Lk).

This estimate is computed from models built on nearly all of L, at the price of learning K models instead of one.
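The K-fold procedure can be sketched in a few lines. The majority-class "learning algorithm" below is an assumed toy stand-in, chosen only to keep the example self-contained and deterministic.

```python
# Sketch: K-fold cross-validation estimate of the generalization error.

def fit_majority(learning_set):
    # toy learning algorithm: always predict the most frequent class
    ys = [y for _, y in learning_set]
    most = max(set(ys), key=ys.count)
    return lambda x: most

def cv_error(learning_set, K, fit, loss):
    folds = [learning_set[i::K] for i in range(K)]        # K disjoint folds
    total = 0.0
    for k in range(K):
        held_out = folds[k]
        rest = [s for j, f in enumerate(folds) if j != k for s in f]
        phi = fit(rest)                                    # learn on L \ L_k
        total += sum(loss(y, phi(x)) for x, y in held_out) / len(held_out)
    return total / K                                       # average over folds

loss01 = lambda y, p: 0 if y == p else 1
data = [(i, 0) for i in range(8)] + [(i, 1) for i in range(8, 10)]
print(cv_error(data, K=5, fit=fit_majority, loss=loss01))
```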

Over all possible models, the generalization error is minimized by the Bayes model

ϕB = argmin_ϕ E_{X,Y}{ L(Y, ϕ(X)) },

whose error Err(ϕB), called the residual error, is the irreducible part due to the randomness of Y given X. For the zero-one loss, the Bayes model predicts the most likely class,

ϕB(x) = argmax_{y∈Y} P(Y = y | X = x),

while for the squared error loss it predicts the conditional expectation,

ϕB(x) = E_{Y|X=x}{ Y }.

The Bayes model cannot be computed in practice, since P(X, Y) is unknown; a learning algorithm can only search for a good approximation of ϕB among the models ϕ ∈ H of its hypothesis space.
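When the joint distribution is fully known and discrete, the Bayes model under the zero-one loss can be computed directly. The distribution below is invented purely for illustration.

```python
# Sketch: the Bayes model for a known, discrete joint distribution P(X, Y).

# P(X = x, Y = y) over X = {0, 1} and Y = {"a", "b"} (invented numbers)
joint = {(0, "a"): 0.4, (0, "b"): 0.1, (1, "a"): 0.2, (1, "b"): 0.3}

def bayes_predict(x):
    # phi_B(x) = argmax_y P(Y = y | X = x); the normalizing constant
    # P(X = x) does not change the argmax, so joint probabilities suffice.
    ys = {y for (_, y) in joint}
    return max(ys, key=lambda y: joint.get((x, y), 0.0))

print(bayes_predict(0))  # "a" is the most likely class when X = 0
print(bayes_predict(1))  # "b" is the most likely class when X = 1
```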

A learning algorithm A usually exposes hyper-parameters θ that control the complexity of the models it produces 5. Tuning θ by minimizing the training estimate is a mistake: beyond some complexity, the training estimate keeps decreasing while the test estimate starts to increase. This phenomenon is known as overfitting.

As Figure 2.1 illustrates, this phenomenon can be observed by examining the respective training and test estimates of the model with respect to its complexity. When the model is too simple, both the training and test estimates are large because of underfitting. As complexity increases, the model gets more accurate and both the training and test estimates decrease. However, when the model becomes too complex, specific elements from the training set get captured, reducing the corresponding training estimates down to 0, as if the model were perfect.

5 If A makes use of a pseudo-random generator to mimic a stochastic process, then we assume that the corresponding random seed is part of θ.

[Figure 2.1: Mean square error on the training set (Ltrain) and on the test set (Ltest), as a function of model complexity. The training error keeps decreasing as complexity grows, while the test error reaches a minimum and then increases.]
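The behaviour shown in Figure 2.1 can be reproduced numerically. Polynomial least squares on noisy data is an assumed, illustrative choice of model family, with the degree playing the role of model complexity.

```python
# Sketch: training vs test error as model complexity (polynomial degree)
# grows. Data, noise level and degrees are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 40)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=x.size)
x_tr, y_tr, x_te, y_te = x[::2], y[::2], x[1::2], y[1::2]

def mse(degree):
    coeffs = np.polyfit(x_tr, y_tr, degree)        # fit on L_train only
    err = lambda xs, ys: np.mean((np.polyval(coeffs, xs) - ys) ** 2)
    return err(x_tr, y_tr), err(x_te, y_te)

for d in (1, 3, 15):
    tr, te = mse(d)
    print(f"degree {d:2d}: train {tr:.3f}  test {te:.3f}")
```

Because the degree-15 family contains every lower-degree polynomial, its training error can only be lower, which is exactly the left-hand branch of the training curve in the figure.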

In practice, the test set must not be used to tune θ, otherwise the test estimate itself becomes optimistically biased. The usual protocol is to split L into three disjoint sets Ltrain, Lvalid and Ltest: for each candidate value of θ, a model is learned on Ltrain and evaluated on Lvalid; the value minimizing the validation estimate is selected; and the generalization error of the resulting model is finally estimated, once, on Ltest.

Alternatively, θ can be tuned by cross-validation. For each candidate value, the estimate

Êrr_CV(θ) = (1/K) Σ_{k=1}^{K} Êrr(ϕθ_{L\Lk}; Lk)

is computed and the value minimizing it is selected. If an unbiased estimate of the error of the tuned model is also needed, the selection itself must be cross-validated: within each set L−k = L \ Lk, θ is chosen using inner folds L−k,1, ..., L−k,K, and the resulting model is evaluated on the outer fold Lk (nested cross-validation).
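Selecting θ by cross-validation can be sketched as follows. Here θ is the degree of a polynomial least-squares fit, an assumed toy setting with invented data.

```python
# Sketch: choosing a hyper-parameter theta (polynomial degree) by K-fold
# cross-validation.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 30)
y = x ** 3 - x + rng.normal(scale=0.05, size=x.size)

def cv_mse(degree, K=5):
    idx = np.arange(x.size)
    errs = []
    for k in range(K):
        te = idx[k::K]                      # held-out fold L_k
        tr = np.setdiff1d(idx, te)          # L \ L_k
        c = np.polyfit(x[tr], y[tr], degree)
        errs.append(np.mean((np.polyval(c, x[te]) - y[te]) ** 2))
    return np.mean(errs)                    # Err_CV(theta)

degrees = range(1, 10)
best = min(degrees, key=cv_mse)
print("selected degree:", best)
```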

A linear model takes the form

ϕ(x) = b + Σ_{j=1}^{p} wj xj,

with parameters b and w1, ..., wp typically learned by minimizing the squared error Σ_i (yi − ϕ(xi))² over the learning set.
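Fitting such a model by least squares reduces to one linear-algebra call. The data, dimensions and true weights below are invented for illustration.

```python
# Sketch: a linear model phi(x) = b + sum_j w_j x_j fitted by least squares.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))                     # N = 50 samples, p = 3
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 1.0                             # intercept b = 1, no noise

A = np.hstack([np.ones((X.shape[0], 1)), X])     # prepend a column of ones
w = np.linalg.lstsq(A, y, rcond=None)[0]         # recovers [b, w1, w2, w3]
print(np.round(w, 6))
```

With noise-free outputs the fitted vector matches the generating weights up to numerical precision.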

Support vector machines instead learn a maximum-margin separating hyperplane. The soft-margin primal problem is

min_{w,b,ξ} (1/2)‖w‖² + C Σ_{i=1}^{N} ξi
subject to yi (w·xi + b) ≥ 1 − ξi and ξi ≥ 0 for i = 1, ..., N,

and its dual is

max_α Σ_{i=1}^{N} αi − (1/2) Σ_{i,j} αi αj yi yj K(xi, xj)
subject to 0 ≤ αi ≤ C and Σ_{i=1}^{N} αi yi = 0,

where the kernel K(xi, xj) generalizes the inner product xi·xj to non-linear decision surfaces.

A neural network composes such linear combinations with a non-linear activation function σ. With inputs x1, ..., xp feeding hidden units h1, ..., h4 and an output unit h5, each hidden unit computes

hj = σ( Σ_{i=1}^{p} wij xi ),

and the output unit combines the hidden activations,

h5 = σ( Σ_{j=1}^{4} wj5 hj ).

[Figure: a feed-forward network with inputs x1, x2, x3, hidden units h1, ..., h5 and weights wij on the connections.]
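The forward pass of this small network can be written directly from the two formulas above. The sigmoid activation and all weight values are assumptions made only to produce a runnable example.

```python
# Sketch: forward pass of a network with 3 inputs, 4 hidden units and one
# output unit, each unit applying a sigmoid to a weighted sum.
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, W_in, w_out):
    # h_j = sigmoid(sum_i w_ij * x_i) for each hidden unit j,
    # h_5 = sigmoid(sum_j w_j5 * h_j) for the output unit
    h = [sigmoid(sum(w_ij * x_i for w_ij, x_i in zip(col, x))) for col in W_in]
    return sigmoid(sum(w * hj for w, hj in zip(w_out, h)))

W_in = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5], [-0.6, 0.4, 0.9], [0.2, 0.2, 0.2]]
w_out = [1.0, -1.0, 0.5, 0.25]
out = forward([1.0, 2.0, -1.0], W_in, w_out)
print(out)
```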

The k-nearest neighbor method makes predictions directly from the learning set, by a majority vote among the k training samples closest to x:

ϕ(x) = argmax_{c∈Y} Σ_{(xi,yi)∈NN(x,L,k)} 1(yi = c),

where NN(x, L, k) denotes the k nearest neighbors of x in L.
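The vote above translates almost literally into code. The one-dimensional samples and class labels below are invented for illustration.

```python
# Sketch: k-nearest-neighbour classification by majority vote among the
# k closest training samples (1-D Euclidean distance).

def knn_predict(x, learning_set, k):
    # NN(x, L, k): the k samples of L closest to x
    neighbours = sorted(learning_set, key=lambda s: abs(s[0] - x))[:k]
    classes = {y for _, y in learning_set}
    # argmax over c of sum of 1(y_i = c) among the neighbours
    return max(classes, key=lambda c: sum(1 for _, y in neighbours if y == c))

L = [(0.0, "a"), (0.2, "a"), (0.4, "a"), (1.0, "b"), (1.1, "b"), (1.3, "b")]
print(knn_predict(0.3, L, k=3))   # neighbours at 0.2, 0.4, 0.0 -> "a"
print(knn_predict(1.05, L, k=3))  # neighbours at 1.0, 1.1, 1.3 -> "b"
```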

In multi-class problems with Y = {c1, c2, ..., cJ}, a probabilistic model provides, for each class ck, an estimate ϕck(x) of P(Y = ck | X = x), and the prediction is the class maximizing this estimate. The Bayes model admits the same decomposition into components ϕc1B, ..., ϕcJB.
