• Aucun résultat trouvé

Incremental and Decremental Training for Linear Classification

N/A
N/A
Protected

Academic year: 2022

Partager "Incremental and Decremental Training for Linear Classification"

Copied!
8
0
0

Texte intégral

(1)

Supplement Materials for “Incremental and Decremental Training for Linear Classification”

Cheng-Hao Tsai

Dept. of Computer Science National Taiwan Univ., Taiwan

[email protected]

Chieh-Yen Lin

Dept. of Computer Science National Taiwan Univ., Taiwan

[email protected]

Chih-Jen Lin

Dept. of Computer Science National Taiwan Univ., Taiwan

[email protected]

Details of Deriving

(25)

From Section 3.2, ∆Dα¯ is the dual initial solution. Then

dual initial value

=

l

X

i=k+1

h(∆Dαi,∆DC)−

1 2

l

X

i=k+1 l

X

j=k+1

2DαiαjK(i, j)−

l

X

i=k+1

(∆Dαi)2 d 2∆D

=∆D l

X

i=k+1

h(αi, C)

− ∆D

2

l

X

i=k+1 l

X

j=k+1

αiαjK(i, j) +

l

X

i=k+1

i)2 d

D

=∆D

1

2w¯Tw¯+C

l

X

i=k+1

ξ( ¯w;xi, yi)−1 2

k

X

i=1 k

X

j=1

αiαjK(i, j)

+1 2

l

X

i=k+1 l

X

j=k+1

αiαjK(i, j) +

l

X

i=k+1

i)2d

−∆D

2

l

X

i=k+1 l

X

j=k+1

αiαjK(i, j) +

l

X

i=k+1

i)2 d

D

=1

2w¯Tw¯+ ∆DC

l

X

i=k+1

ξ( ¯w;xi, yi)−∆D

2

k

X

i=1 k

X

j=1

αiαjK(i, j)

−∆2D−∆D

2

l

X

i=k+1 l

X

j=k+1

αiαjK(i, j) +∆D−1 2 w¯Tw.¯

The second and third equalities respectively use (31) and (23).

Experiments for Logistic regression with Larger

C

In the paper we present figures of comparing optimization methods using logistic regression withC= 1. Figures I and II give result of usingC= 128.

Experiments for L2-loss Support Vector Ma- chine

We have conducted experiments on L2-loss SVM by fol- lowing the same settings for logistic regression. Table I shows the relative difference between initial and optimal function values for incremental and decremental learning, re- spectively. The comparison of the three optimization meth- ods are shown in Figures III and IV forC= 1 and Figures V and VI forC= 128.

(2)

Data set Formulation C= 1 C= 128

wo-ws r= 5 r= 50 r= 500 wo-ws r= 5 r= 50 r= 500

ijcnn Primal 3.2e+00 3.3e−04 2.4e−05 1.7e−06 3.2e+00 3.4e−04 2.5e−05 1.7e−06

Dual 1.0e+00 1.9e−01 1.9e−02 1.3e−03 1.0e+00 1.9e−01 1.9e−02 1.3e−03

webspam Primal 2.9e+00 2.4e−04 2.6e−05 3.8e−06 3.0e+00 3.2e−04 3.8e−05 4.3e−06

Dual 1.0e+00 2.0e−01 2.0e−02 2.4e−03 1.0e+00 2.0e−01 2.0e−02 2.4e−03

news20 Primal 9.3e+00 1.8e−01 1.5e−02 1.3e−03 2.0e+02 5.1e+00 4.4e−01 3.3e−02

Dual 1.0e+00 1.6e−01 1.2e−02 1.0e−03 1.0e+00 2.3e−01 2.3e−02 2.1e−04

rcv1 Primal 1.3e+01 1.8e−01 1.4e−02 2.6e−03 3.1e+02 9.0e+00 7.5e−01 1.7e−01

Dual 1.0e+00 1.8e−01 1.4e−02 2.0e−03 1.0e+00 2.3e−01 2.6e−02 9.9e−04

real-sim Primal 1.5e+01 1.1e−01 1.1e−02 1.4e−03 1.5e+02 3.9e+00 4.5e−01 3.7e−02

Dual 1.0e+00 1.8e−01 1.7e−02 2.0e−03 1.0e+00 3.0e−01 3.3e−02 8.0e−04

yahoo-japan Primal 5.8e+00 1.3e−01 1.1e−02 1.1e−03 5.5e+01 4.6e+00 4.4e−01 4.9e−02

Dual 1.0e+00 2.2e−01 2.1e−02 2.3e−03 1.0e+00 2.7e−01 2.7e−02 2.9e−03

(a) Incremental learning

Data set Formulation C= 1 C= 128

wo-ws r= 5 r= 50 r= 500 wo-ws r= 5 r= 50 r= 500

ijcnn Primal 3.2e+00 3.3e−04 3.7e−05 2.5e−06 3.2e+00 3.4e−04 3.7e−05 2.6e−06

Dual 1.0e+00 6.8e−01 8.0e−02 3.5e−03 1.0e+00 8.5e+01 8.3e+00 4.4e−01

webspam Primal 2.9e+00 2.3e−04 3.4e−05 5.8e−06 3.0e+00 3.2e−04 5.1e−05 6.8e−06

Dual 1.0e+00 5.1e−02 9.6e−03 1.4e−03 1.0e+00 3.1e+00 7.6e−01 1.9e−01

news20 Primal 8.5e+00 7.7e−02 8.6e−03 7.4e−04 2.1e+02 1.0e−01 1.7e−02 3.2e−04

Dual 1.0e+00 9.1e−02 6.1e−03 3.3e−04 1.0e+00 1.6e+01 1.4e+00 1.6e−04

rcv1 Primal 1.3e+01 8.6e−02 8.1e−03 1.4e−03 3.2e+02 1.9e−01 2.4e−02 1.3e−03

Dual 1.0e+00 1.0e−01 7.5e−03 7.7e−04 1.0e+00 2.1e+01 2.9e−01 7.3e−04

real-sim Primal 1.5e+01 6.3e−02 7.8e−03 1.2e−03 1.7e+02 1.9e−01 1.8e−02 1.4e−03

Dual 1.0e+00 1.5e−01 1.6e−02 2.4e−03 1.0e+00 2.4e+01 1.3e−01 4.2e−02

yahoo-japan Primal 5.9e+00 7.5e−02 8.5e−03 8.7e−04 5.9e+01 2.0e−01 2.0e−02 2.8e−03

Dual 1.0e+00 2.0e−01 2.4e−02 2.8e−03 1.0e+00 3.2e+01 5.3e+00 2.4e−01

(b) Decremental learning

(3)

TRON DCD PCD

0 0.1 0.2 0.3 0.4 0.5

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 2 4 6 8

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 2 4 6 8 10

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(a)ijcnn

0 50 100 150 200

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100 120

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600 700

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(b)webspam

0 5 10 15 20 25 30 35

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 5 10 15 20 25

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 1000 2000 3000 4000 5000

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(c)news20

0 0.5 1 1.5 2 2.5 3

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 1 2 3 4

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100 120 140

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(d)rcv1

0 2 4 6 8 10

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 2 4 6 8 10 12 14

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100 120

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(e)real-sim

0 50 100 150 200 250 300

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 10 20 30 40 50 60 70

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 1 2 3 4 5

x 104

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(f)yahoo-japan

Figure I: Incremental learning: running time (in seconds) versus the relative objective value difference.

Logistic regression withC= 128 is used.

(4)

0 0.1 0.2 0.3 0.4 0.5

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 1 2 3 4 5 6

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

warm−start−r5 warm−start−r50 warm−start−r500

0 2 4 6 8

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

warm−start−r5 warm−start−r50 warm−start−r500

(a)ijcnn

0 50 100 150 200

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(b)webspam

0 5 10 15 20 25

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 5 10 15

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 1000 2000 3000 4000 5000

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(c)news20

0 0.5 1 1.5 2 2.5

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.5 1 1.5 2 2.5 3

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(d)rcv1

0 2 4 6 8 10 12

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 2 4 6 8 10 12

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(e)real-sim

0 50 100 150 200 250

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 10 20 30 40 50 60

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.5 1 1.5 2 2.5 3

x 104

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(f)yahoo-japan

Figure II: Decremental learning: running time (in seconds) versus the relative objective value difference.

Logistic regression withC= 128 is used.

(5)

TRON DCD PCD

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.1 0.2 0.3 0.4

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8 1 1.2 1.4

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(a)ijcnn

0 10 20 30 40 50 60

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 2 4 6 8 10

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(b)webspam

0 2 4 6 8 10 12

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.5 1 1.5 2

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 10 20 30 40 50

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(c)news20

0 0.2 0.4 0.6 0.8 1 1.2 1.4

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8 1 1.2 1.4

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(d)rcv1

0 0.5 1 1.5 2 2.5

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8 1

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.5 1 1.5 2 2.5 3 3.5

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(e)real-sim

0 20 40 60 80 100 120

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 1 2 3 4 5 6 7

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 50 100 150

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(f)yahoo-japan

Figure III: Incremental learning: running time (in seconds) versus the relative objective value difference.

L2-SVM withC= 1is used.

(6)

0 0.05 0.1 0.15 0.2 0.25

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8 1

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

warm−start−r5 warm−start−r50 warm−start−r500

(a)ijcnn

0 10 20 30 40 50

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 2 4 6 8

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(b)webspam

0 2 4 6 8 10 12

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8 1 1.2 1.4

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 10 20 30 40

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(c)news20

0 0.2 0.4 0.6 0.8 1

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.05 0.1 0.15 0.2 0.25

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8 1

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(d)rcv1

0 0.5 1 1.5 2

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.5 1 1.5 2 2.5

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(e)real-sim

0 20 40 60 80 100

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 1 2 3 4 5

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100 120 140

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(f)yahoo-japan

Figure IV: Decremental learning: running time (in seconds) versus the relative objective value difference.

L2-SVM withC= 1is used.

(7)

TRON DCD PCD

0 0.05 0.1 0.15 0.2 0.25

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 10 20 30 40 50

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8 1 1.2 1.4

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(a)ijcnn

0 50 100 150 200 250

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 200 400 600 800 1000

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(b)webspam

0 200 400 600 800 1000

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 50 100 150 200

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600 700

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(c)news20

0 5 10 15 20 25 30

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 5 10 15 20 25 30

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 5 10 15 20 25 30

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(d)rcv1

0 10 20 30 40 50

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 10 20 30 40 50

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(e)real-sim

0 500 1000 1500 2000 2500 3000

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(f)yahoo-japan

Figure V: Incremental learning: running time (in seconds) versus the relative objective value difference.

L2-SVM withC= 128is used.

(8)

0 0.05 0.1 0.15 0.2 0.25

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 5 10 15 20 25 30 35

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 0.2 0.4 0.6 0.8 1

−8

−6

−4

−2 0

Training time (seconds)

Relative function value difference (log)

warm−start−r5 warm−start−r50 warm−start−r500

(a)ijcnn

0 50 100 150 200 250

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600 700

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(b)webspam

0 200 400 600 800

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 20 40 60 80 100 120 140

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600 700

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(c)news20

0 5 10 15 20 25 30 35

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 5 10 15 20 25

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 5 10 15 20 25

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(d)rcv1

0 10 20 30 40 50 60

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 10 20 30 40 50 60 70

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 10 20 30 40

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(e)real-sim

0 500 1000 1500 2000 2500

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

0 100 200 300 400 500 600

−8

−6

−4

−2 0 2

Training time (seconds)

Relative function value difference (log)

no−warm−start warm−start−r5 warm−start−r50 warm−start−r500

(f)yahoo-japan

Figure VI: Decremental learning: running time (in seconds) versus the relative objective value difference.

L2-SVM withC= 128is used.

Références

Documents relatifs

As the implementation of big DNNs is problematic, proposed works in this context focus on inference process and assume that the training or learning process is done on an

In this paper we aim to perform classification analysis in a real dataset, using Apache Spark’s MLlib, a machine learning library optimized for distributed systems.. We

In the proposed approach, the potential uncertainty on right- contextual features is handled during voice training rather than during the synthesis process, as in the

Through theoretical analysis and practical experiments, we conclude that a warm start setting on a high-order optimization method for primal for- mulations is more suitable than

This basic operation (also called “background subtraction”or BS) consists of separating the moving objects called “foreground”from the static information called “background”.

We prove in Section 3 that our strategy is worst case optimal in order of magnitude for the number of critical stages criterion by constructing a scenario in which at least Ω(log

distinguished in the proposed approach: creating a new ontology without background knowledge (an initial search process with any domain ontologies according to user request)

We propose different methods based on Cross-Modality deep learning of CNNs: (1) a correlated model where a unique CNN is trained with Intensity, Depth and Flow images for each